arxiv.org.rss.091.xml - sfeed_tests - sfeed tests and RSS and Atom files
(HTM) git clone git://git.codemadness.org/sfeed_tests
(DIR) Log
(DIR) Files
(DIR) Refs
(DIR) README
(DIR) LICENSE
---
arxiv.org.rss.091.xml (806587B)
---
1 <?xml version="1.0" encoding="UTF-8"?>
2
3 <!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
4 "http://www.rssboard.org/rss-0.91.dtd">
5
6 <rss version="0.91">
7
8 <channel>
9 <title>cs updates on arXiv.org</title>
10 <link>http://fr.arxiv.org/</link>
11 <description>Computer Science (cs) updates on the arXiv.org e-print archive</description>
12 <language>en-us</language>
13 <pubDate>Fri, 30 Oct 2020 00:30:00 GMT</pubDate>
14 <lastBuildDate>Fri, 30 Oct 2020 00:30:00 GMT</lastBuildDate>
15 <managingEditor>www-admin@arxiv.org</managingEditor>
16
17 <image>
18 <title>arXiv.org</title>
19 <url>http://fr.arxiv.org/icons/sfx.gif</url>
20 <link>http://fr.arxiv.org/</link>
21 </image>
22 <item>
23 <title>Raw Audio for Depression Detection Can Be More Robust Against Gender Imbalance than Mel-Spectrogram Features. (arXiv:2010.15120v1 [cs.SD])</title>
24 <link>http://fr.arxiv.org/abs/2010.15120</link>
25 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bailey_A/0/1/0/all/0/1">Andrew Bailey</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plumbley_M/0/1/0/all/0/1">Mark D. Plumbley</a></p>
26
27 <p>Depression is a large-scale mental health problem and a challenging area for
28 machine learning researchers in terms of the detection of depression. Datasets
29 such as the Distress Analysis Interview Corpus - Wizard of Oz have been created
30 to aid research in this area. However, on top of the challenges inherent in
31 accurately detecting depression, biases in datasets may result in skewed
32 classification performance. In this paper we examine gender bias in the
33 DAIC-WOZ dataset using audio-based deep neural networks. We show that gender
34 biases in DAIC-WOZ can lead to an overreporting of performance, which has been
35 overlooked in the past due to the same gender biases being present in the test
36 set. By using raw audio and different concepts from Fair Machine Learning, such
37 as data re-distribution, we can mitigate against the harmful effects of bias.
38 </p>
39 </description>
40 </item>
41 <item>
42 <title>papaya2: 2D Irreducible Minkowski Tensor computation. (arXiv:2010.15138v1 [cs.GR])</title>
43 <link>http://fr.arxiv.org/abs/2010.15138</link>
44 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Schaller_F/0/1/0/all/0/1">Fabian M. Schaller</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wagner_J/0/1/0/all/0/1">Jenny Wagner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kapfer_S/0/1/0/all/0/1">Sebastian C. Kapfer</a></p>
45
46 <p>A common challenge in scientific and technical domains is the quantitative
47 description of geometries and shapes, e.g. in the analysis of microscope
48 imagery or astronomical observation data. Frequently, it is desirable to go
49 beyond scalar shape metrics such as porosity and surface to volume ratios
50 because the samples are anisotropic or because direction-dependent quantities
51 such as conductances or elasticity are of interest. Minkowski Tensors are a
52 systematic family of versatile and robust higher-order shape descriptors that
53 allow for shape characterization of arbitrary order and promise a path to
54 systematic structure-function relationships for direction-dependent properties.
55 Papaya2 is a software to calculate 2D higher-order shape metrics with a library
56 interface, support for Irreducible Minkowski Tensors and interpolated marching
57 squares. Extensions to Matlab, JavaScript and Python are provided as well.
58 While the tensor of inertia is computed by many tools, we are not aware of
59 other open-source software which provides higher-rank shape characterization in
60 2D.
61 </p>
62 </description>
63 </item>
64 <item>
65 <title>DeSMOG: Detecting Stance in Media On Global Warming. (arXiv:2010.15149v1 [cs.CL])</title>
66 <link>http://fr.arxiv.org/abs/2010.15149</link>
67 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Luo_Y/0/1/0/all/0/1">Yiwei Luo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Card_D/0/1/0/all/0/1">Dallas Card</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jurafsky_D/0/1/0/all/0/1">Dan Jurafsky</a></p>
68
69 <p>Citing opinions is a powerful yet understudied strategy in argumentation. For
70 example, an environmental activist might say, "Leading scientists agree that
71 global warming is a serious concern," framing a clause which affirms their own
72 stance ("that global warming is serious") as an opinion endorsed ("[scientists]
73 agree") by a reputable source ("leading"). In contrast, a global warming denier
74 might frame the same clause as the opinion of an untrustworthy source with a
75 predicate connoting doubt: "Mistaken scientists claim [...]." Our work studies
76 opinion-framing in the global warming (GW) debate, an increasingly partisan
77 issue that has received little attention in NLP. We introduce DeSMOG, a dataset
78 of stance-labeled GW sentences, and train a BERT classifier to study novel
79 aspects of argumentation in how different sides of a debate represent their own
80 and each other's opinions. From 56K news articles, we find that similar
81 linguistic devices for self-affirming and opponent-doubting discourse are used
82 across GW-accepting and skeptic media, though GW-skeptical media shows more
83 opponent-doubt. We also find that authors often characterize sources as
84 hypocritical, by ascribing opinions expressing the author's own view to source
85 entities known to publicly endorse the opposing view. We release our stance
86 dataset, model, and lexicons of framing devices for future work on
87 opinion-framing and the automatic detection of GW stance.
88 </p>
89 </description>
90 </item>
91 <item>
92 <title>On the Optimality and Convergence Properties of the Learning Model Predictive Controller. (arXiv:2010.15153v1 [math.OC])</title>
93 <link>http://fr.arxiv.org/abs/2010.15153</link>
94 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Rosolia_U/0/1/0/all/0/1">Ugo Rosolia</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lian_Y/0/1/0/all/0/1">Yingzhao Lian</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Maddalena_E/0/1/0/all/0/1">Emilio T. Maddalena</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ferrari_Trecate_G/0/1/0/all/0/1">Giancarlo Ferrari-Trecate</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Jones_C/0/1/0/all/0/1">Colin N. Jones</a></p>
95
96 <p>In this technical note we analyse the performance improvement and optimality
97 properties of the Learning Model Predictive Control (LMPC) strategy for linear
98 deterministic systems. The LMPC framework is a policy iteration scheme where
99 closed-loop trajectories are used to update the control policy for the next
100 execution of the control task. We show that, when a Linear Independence
101 Constraint Qualification (LICQ) condition holds, the LMPC scheme guarantees
102 strict iterative performance improvement and optimality, meaning that the
103 closed-loop cost evaluated over the entire task converges asymptotically to the
104 optimal cost of the infinite-horizon control problem. Compared to previous
105 works this sufficient LICQ condition can be easily checked, it holds for a
106 larger class of systems and it can be used to adaptively select the prediction
107 horizon of the controller, as demonstrated by a numerical example.
108 </p>
109 </description>
110 </item>
111 <item>
112 <title>Kernel Aggregated Fast Multipole Method: Efficient summation of Laplace and Stokes kernel functions. (arXiv:2010.15155v1 [math.NA])</title>
113 <link>http://fr.arxiv.org/abs/2010.15155</link>
114 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Yan_W/0/1/0/all/0/1">Wen Yan</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Blackwell_R/0/1/0/all/0/1">Robert Blackwell</a></p>
115
116 <p>Many different simulation methods for Stokes flow problems involve a common
117 computationally intense task---the summation of a kernel function over $O(N^2)$
118 pairs of points. One popular technique is the Kernel Independent Fast Multipole
119 Method (KIFMM), which constructs a spatial adaptive octree and places a small
120 number of equivalent multipole and local points around each octree box, and
121 completes the kernel sum with $O(N)$ performance. However, the KIFMM cannot be
122 used directly with nonlinear kernels, can be inefficient for complicated linear
123 kernels, and in general is difficult to implement compared to less-efficient
124 alternatives such as Ewald-type methods. Here we present the Kernel Aggregated
125 Fast Multipole Method (KAFMM), which overcomes these drawbacks by allowing
126 different kernel functions to be used for specific stages of octree traversal.
127 In many cases a simpler linear kernel suffices during the most extensive stage
128 of octree traversal, even for nonlinear kernel summation problems. The KAFMM
129 thereby improves computational efficiency in general and also allows efficient
130 evaluation of some nonlinear kernel functions such as the regularized
131 Stokeslet. We have implemented our method as an open-source software library
132 STKFMM with support for Laplace kernels, the Stokeslet, regularized Stokeslet,
133 Rotne-Prager-Yamakawa (RPY) tensor, and the Stokes double-layer and traction
134 operators. Open and periodic boundary conditions are supported for all kernels,
135 and the no-slip wall boundary condition is supported for the Stokeslet and RPY
136 tensor. The package is designed to be ready-to-use as well as being readily
137 extensible to additional kernels. Massive parallelism is supported with mixed
138 OpenMP and MPI.
139 </p>
140 </description>
141 </item>
142 <item>
143 <title>Diagnostic data integration using deep neural networks for real-time plasma analysis. (arXiv:2010.15156v1 [physics.comp-ph])</title>
144 <link>http://fr.arxiv.org/abs/2010.15156</link>
145 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Garola_A/0/1/0/all/0/1">A. Rigoni Garola</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Cavazzana_R/0/1/0/all/0/1">R. Cavazzana</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Gobbin_M/0/1/0/all/0/1">M. Gobbin</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Delogu_R/0/1/0/all/0/1">R.S. Delogu</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Manduchi_G/0/1/0/all/0/1">G. Manduchi</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Taliercio_C/0/1/0/all/0/1">C. Taliercio</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Luchetta_A/0/1/0/all/0/1">A. Luchetta</a></p>
146
147 <p>Recent advances in acquisition equipment is providing experiments with
148 growing amounts of precise yet affordable sensors. At the same time an improved
149 computational power, coming from new hardware resources (GPU, FPGA, ACAP), has
150 been made available at relatively low costs. This led us to explore the
151 possibility of completely renewing the chain of acquisition for a fusion
152 experiment, where many high-rate sources of data, coming from different
153 diagnostics, can be combined in a wide framework of algorithms. If on one hand
154 adding new data sources with different diagnostics enriches our knowledge about
155 physical aspects, on the other hand the dimensions of the overall model grow,
156 making relations among variables more and more opaque. A new approach for the
157 integration of such heterogeneous diagnostics, based on composition of deep
158 \textit{variational autoencoders}, could ease this problem, acting as a
159 structural sparse regularizer. This has been applied to RFX-mod experiment
160 data, integrating the soft X-ray linear images of plasma temperature with the
161 magnetic state.
162 </p>
163 <p>However to ensure a real-time signal analysis, those algorithmic techniques
164 must be adapted to run in well suited hardware. In particular it is shown that,
165 attempting a quantization of neurons transfer functions, such models can be
166 modified to create an embedded firmware. This firmware, approximating the deep
167 inference model to a set of simple operations, fits well with the simple logic
168 units that are largely abundant in FPGAs. This is the key factor that permits
169 the use of affordable hardware with complex deep neural topology and operates
170 them in real-time.
171 </p>
172 </description>
173 </item>
174 <item>
175 <title>Panoster: End-to-end Panoptic Segmentation of LiDAR Point Clouds. (arXiv:2010.15157v1 [cs.CV])</title>
176 <link>http://fr.arxiv.org/abs/2010.15157</link>
177 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gasperini_S/0/1/0/all/0/1">Stefano Gasperini</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mahani_M/0/1/0/all/0/1">Mohammad-Ali Nikouei Mahani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marcos_Ramiro_A/0/1/0/all/0/1">Alvaro Marcos-Ramiro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Navab_N/0/1/0/all/0/1">Nassir Navab</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tombari_F/0/1/0/all/0/1">Federico Tombari</a></p>
178
179 <p>Panoptic segmentation has recently unified semantic and instance
180 segmentation, previously addressed separately, thus taking a step further
181 towards creating more comprehensive and efficient perception systems. In this
182 paper, we present Panoster, a novel proposal-free panoptic segmentation method
183 for point clouds. Unlike previous approaches relying on several steps to group
184 pixels or points into objects, Panoster proposes a simplified framework
185 incorporating a learning-based clustering solution to identify instances. At
186 inference time, this acts as a class-agnostic semantic segmentation, allowing
187 Panoster to be fast, while outperforming prior methods in terms of accuracy.
188 Additionally, we showcase how our approach can be flexibly and effectively
189 applied on diverse existing semantic architectures to deliver panoptic
190 predictions.
191 </p>
192 </description>
193 </item>
194 <item>
195 <title>CNN Profiler on Polar Coordinate Images for Tropical Cyclone Structure Analysis. (arXiv:2010.15158v1 [cs.CV])</title>
196 <link>http://fr.arxiv.org/abs/2010.15158</link>
197 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1">Boyo Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1">Buo-Fu Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hsiao_C/0/1/0/all/0/1">Chun-Min Hsiao</a></p>
198
199 <p>Convolutional neural networks (CNN) have achieved great success in analyzing
200 tropical cyclones (TC) with satellite images in several tasks, such as TC
201 intensity estimation. In contrast, TC structure, which is conventionally
202 described by a few parameters estimated subjectively by meteorology
203 specialists, is still hard to be profiled objectively and routinely. This study
204 applies CNN on satellite images to create the entire TC structure profiles,
205 covering all the structural parameters. By utilizing the meteorological domain
206 knowledge to construct TC wind profiles based on historical structure
207 parameters, we provide valuable labels for training in our newly released
208 benchmark dataset. With such a dataset, we hope to attract more attention to
209 this crucial issue among data scientists. Meanwhile, a baseline is established
210 with a specialized convolutional model operating on polar-coordinates. We
211 discovered that it is more feasible and physically reasonable to extract
212 structural information on polar-coordinates, instead of Cartesian coordinates,
213 according to a TC's rotational and spiral natures. Experimental results on the
214 released benchmark dataset verified the robustness of the proposed model and
215 demonstrated the potential for applying deep learning techniques for this
216 barely developed yet important topic.
217 </p>
218 </description>
219 </item>
220 <item>
221 <title>Sizeless: Predicting the optimal size of serverless functions. (arXiv:2010.15162v1 [cs.DC])</title>
222 <link>http://fr.arxiv.org/abs/2010.15162</link>
223 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Eismann_S/0/1/0/all/0/1">Simon Eismann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bui_L/0/1/0/all/0/1">Long Bui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Grohmann_J/0/1/0/all/0/1">Johannes Grohmann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Abad_C/0/1/0/all/0/1">Cristina L. Abad</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Herbst_N/0/1/0/all/0/1">Nikolas Herbst</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kounev_S/0/1/0/all/0/1">Samuel Kounev</a></p>
224
225 <p>Serverless functions are a cloud computing paradigm that reduces operational
226 overheads for developers, because the cloud provider takes care of resource
227 management tasks such as resource provisioning, deployment, and auto-scaling.
228 The only resource management task that developers are still in charge of is
229 resource sizing, that is, selecting how much resources are allocated to each
230 worker instance. However, due to the challenging nature of resource sizing,
231 developers often neglect it despite its significant cost and performance
232 benefits. Existing approaches aiming to automate serverless functions resource
233 sizing require dedicated performance tests, which are time consuming to
234 implement and maintain.
235 </p>
236 <p>In this paper, we introduce Sizeless -- an approach to predict the optimal
237 resource size of a serverless function using monitoring data from a single
238 resource size. As our approach requires only production monitoring data,
239 developers no longer need to implement and maintain representative performance
240 tests. Furthermore, it enables cloud providers, which cannot engage in testing
241 the performance of user functions, to implement resource sizing on a platform
242 level and automate the last resource management task associated with serverless
243 functions. In our evaluation, Sizeless was able to predict the execution time
244 of the serverless functions of a realistic server-less application with a
245 median prediction accuracy of 93.1%. Using Sizeless to optimize the memory size
246 of this application results in a speedup of 16.7% while simultaneously
247 decreasing costs by 2.5%.
248 </p>
249 </description>
250 </item>
251 <item>
252 <title>Polymer Informatics with Multi-Task Learning. (arXiv:2010.15166v1 [cond-mat.mtrl-sci])</title>
253 <link>http://fr.arxiv.org/abs/2010.15166</link>
254 <description><p>Authors: <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Kunneth_C/0/1/0/all/0/1">Christopher K&#xfc;nneth</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Rajan_A/0/1/0/all/0/1">Arunkumar Chitteth Rajan</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Tran_H/0/1/0/all/0/1">Huan Tran</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Chen_L/0/1/0/all/0/1">Lihua Chen</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Kim_C/0/1/0/all/0/1">Chiho Kim</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Ramprasad_R/0/1/0/all/0/1">Rampi Ramprasad</a></p>
255
256 <p>Modern data-driven tools are transforming application-specific polymer
257 development cycles. Surrogate models that can be trained to predict the
258 properties of new polymers are becoming commonplace. Nevertheless, these models
259 do not utilize the full breadth of the knowledge available in datasets, which
260 are oftentimes sparse; inherent correlations between different property
261 datasets are disregarded. Here, we demonstrate the potency of multi-task
262 learning approaches that exploit such inherent correlations effectively,
263 particularly when some property dataset sizes are small. Data pertaining to 36
264 different properties of over $13, 000$ polymers (corresponding to over $23,000$
265 data points) are coalesced and supplied to deep-learning multi-task
266 architectures. Compared to conventional single-task learning models (that are
267 trained on individual property datasets independently), the multi-task approach
268 is accurate, efficient, scalable, and amenable to transfer learning as more
269 data on the same or different properties become available. Moreover, these
270 models are interpretable. Chemical rules, that explain how certain features
271 control trends in specific property values, emerge from the present work,
272 paving the way for the rational design of application specific polymers meeting
273 desired property or performance objectives.
274 </p>
275 </description>
276 </item>
277 <item>
278 <title>Semi-Grant-Free NOMA: Ergodic Rates Analysis with Random Deployed Users. (arXiv:2010.15169v1 [cs.IT])</title>
279 <link>http://fr.arxiv.org/abs/2010.15169</link>
280 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1">Chao Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1">Yuanwei Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yi_W/0/1/0/all/0/1">Wenqiang Yi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_Z/0/1/0/all/0/1">Zhijin Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_Z/0/1/0/all/0/1">Zhiguo Ding</a></p>
281
282 <p>Semi-grant-free (Semi-GF) non-orthogonal multiple access (NOMA) enables
283 grant-free (GF) and grant-based (GB) users to share the same resource blocks,
284 thereby balancing the connectivity and stability of communications. This letter
285 analyzes ergodic rates of Semi-GF NOMA systems. First, this paper exploits a
286 Semi-GF protocol, denoted as dynamic protocol, for selecting GF users into the
287 occupied GB channels via the GB user's instantaneous received power. Under this
288 protocol, the closed-form analytical and approximated expressions for ergodic
289 rates are derived. The numerical results illustrate that the GF user (weak NOMA
290 user) has a performance upper limit, while the ergodic rate of the GB user
291 (strong NOMA user) increases linearly versus the transmit signal-to-noise
292 ratio.
293 </p>
294 </description>
295 </item>
296 <item>
297 <title>Slicing a single wireless collision channel among throughput- and timeliness-sensitive services. (arXiv:2010.15171v1 [cs.IT])</title>
298 <link>http://fr.arxiv.org/abs/2010.15171</link>
299 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Leyva_Mayorga_I/0/1/0/all/0/1">Israel Leyva-Mayorga</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chiariotti_F/0/1/0/all/0/1">Federico Chiariotti</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stefanovic_C/0/1/0/all/0/1">&#x10c;edomir Stefanovi&#x107;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kalor_A/0/1/0/all/0/1">Anders E. Kal&#xf8;r</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Popovski_P/0/1/0/all/0/1">Petar Popovski</a></p>
300
301 <p>The fifth generation (5G) wireless system has a platform-driven approach,
302 aiming to support heterogeneous connections with very diverse requirements. The
303 shared wireless resources should be sliced in a way that each user perceives
304 that its requirement has been met. Heterogeneity challenges the traditional
305 notion of resource efficiency, as the resource usage has cater for, e.g. rate
306 maximization for one user and timeliness requirement for another user. This
307 paper treats a model for radio access network (RAN) uplink, where a
308 throughput-demanding broadband user shares wireless resources with an
309 intermittently active user that wants to optimize the timeliness, expressed in
310 terms of latency-reliability or Age of Information (AoI). We evaluate the
311 trade-offs between throughput and timeliness for Orthogonal Multiple Access
312 (OMA) as well as Non-Orthogonal Multiple Access (NOMA) with successive
313 interference cancellation (SIC). We observe that NOMA with SIC, in a
314 conservative scenario with destructive collisions, is just slightly inferior to
315 that of OMA, which indicates that it may offer significant benefits in
316 practical deployments where the capture effect is frequently encountered. On
317 the other hand, finding the optimal configuration of NOMA with SIC depends on
318 the activity pattern of the intermittent user, to which OMA is insensitive.
319 </p>
320 </description>
321 </item>
322 <item>
323 <title>Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement. (arXiv:2010.15174v1 [cs.SD])</title>
324 <link>http://fr.arxiv.org/abs/2010.15174</link>
325 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hsieh_T/0/1/0/all/0/1">Tsun-An Hsieh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_C/0/1/0/all/0/1">Cheng Yu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_S/0/1/0/all/0/1">Szu-Wei Fu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lu_X/0/1/0/all/0/1">Xugang Lu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tsao_Y/0/1/0/all/0/1">Yu Tsao</a></p>
326
327 <p>Speech enhancement (SE) aims to improve speech quality and intelligibility,
328 which are both related to a smooth transition in speech segments that may carry
329 linguistic information, e.g. phones and syllables. In this study, we took
330 phonetic characteristics into account in the SE training process. Hence, we
331 designed a phone-fortified perceptual (PFP) loss, and the training of our SE
332 model was guided by PFP loss. In PFP loss, phonetic characteristics are
333 extracted by wav2vec, an unsupervised learning model based on the contrastive
334 predictive coding (CPC) criterion. Different from previous deep-feature-based
335 approaches, the proposed approach explicitly uses the phonetic information in
336 the deep feature extraction process to guide the SE model training. To test the
337 proposed approach, we first confirmed that the wav2vec representations carried
338 clear phonetic information using a t-distributed stochastic neighbor embedding
339 (t-SNE) analysis. Next, we observed that the proposed PFP loss was more
340 strongly correlated with the perceptual evaluation metrics than point-wise and
341 signal-level losses, thus achieving higher scores for standardized quality and
342 intelligibility evaluation metrics in the Voice Bank--DEMAND dataset.
343 </p>
344 </description>
345 </item>
346 <item>
347 <title>A Study on Efficiency in Continual Learning Inspired by Human Learning. (arXiv:2010.15187v1 [cs.LG])</title>
348 <link>http://fr.arxiv.org/abs/2010.15187</link>
349 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ball_P/0/1/0/all/0/1">Philip J. Ball</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yingzhen Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lamb_A/0/1/0/all/0/1">Angus Lamb</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1">Cheng Zhang</a></p>
350
351 <p>Humans are efficient continual learning systems; we continually learn new
352 skills from birth with finite cells and resources. Our learning is highly
353 optimized both in terms of capacity and time while not suffering from
354 catastrophic forgetting. In this work we study the efficiency of continual
355 learning systems, taking inspiration from human learning. In particular,
356 inspired by the mechanisms of sleep, we evaluate popular pruning-based
357 continual learning algorithms, using PackNet as a case study. First, we
358 identify that weight freezing, which is used in continual learning without
359 biological justification, can result in over $2\times$ as many weights being
360 used for a given level of performance. Secondly, we note the similarity in
361 human day and night time behaviors to the training and pruning phases
362 respectively of PackNet. We study a setting where the pruning phase is given a
363 time budget, and identify connections between iterative pruning and multiple
364 sleep cycles in humans. We show there exists an optimal choice of iteration
365 v.s. epochs given different tasks.
366 </p>
367 </description>
368 </item>
369 <item>
370 <title>Explicit stabilized multirate method for stiff stochastic differential equations. (arXiv:2010.15193v1 [math.NA])</title>
371 <link>http://fr.arxiv.org/abs/2010.15193</link>
372 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Abdulle_A/0/1/0/all/0/1">Assyr Abdulle</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Souza_G/0/1/0/all/0/1">Giacomo Rosilho de Souza</a></p>
373
374 <p>Stabilized explicit methods are particularly efficient for large systems of
375 stiff stochastic differential equations (SDEs) due to their extended stability
376 domain. However, they loose their efficiency when a severe stiffness is induced
377 by very few "fast" degrees of freedom, as the stiff and nonstiff terms are
378 evaluated concurrently. Therefore, inspired by [A. Abdulle, M. J. Grote, and G.
379 Rosilho de Souza, Preprint (2020), <a href="/abs/2006.00744">arXiv:2006.00744</a>] we introduce a stochastic
380 modified equation whose stiffness depends solely on the "slow" terms. By
381 integrating this modified equation with a stabilized explicit scheme we devise
382 a multirate method which overcomes the bottleneck caused by a few severely
383 stiff terms and recovers the efficiency of stabilized schemes for large systems
384 of nonlinear SDEs. The scheme is not based on any scale separation assumption
385 of the SDE and therefore it is employable for problems stemming from the
386 spatial discretization of stochastic parabolic partial differential equations
387 on locally refined grids. The multirate scheme has strong order 1/2, weak order
388 1 and its stability is proved on a model problem. Numerical experiments confirm
389 the efficiency and accuracy of the scheme.
390 </p>
391 </description>
392 </item>
393 <item>
394 <title>Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments. (arXiv:2010.15195v1 [cs.LG])</title>
395 <link>http://fr.arxiv.org/abs/2010.15195</link>
396 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Carvalho_W/0/1/0/all/0/1">Wilka Carvalho</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liang_A/0/1/0/all/0/1">Anthony Liang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_K/0/1/0/all/0/1">Kimin Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sohn_S/0/1/0/all/0/1">Sungryull Sohn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_H/0/1/0/all/0/1">Honglak Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lewis_R/0/1/0/all/0/1">Richard L. Lewis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_S/0/1/0/all/0/1">Satinder Singh</a></p>
397
398 <p>First-person object-interaction tasks in high-fidelity, 3D, simulated
399 environments such as the AI2Thor virtual home-environment pose significant
400 sample-efficiency challenges for reinforcement learning (RL) agents learning
401 from sparse task rewards. To alleviate these challenges, prior work has
402 provided extensive supervision via a combination of reward-shaping,
403 ground-truth object-information, and expert demonstrations. In this work, we
404 show that one can learn object-interaction tasks from scratch without
405 supervision by learning an attentive object-model as an auxiliary task during
406 task learning with an object-centric relational RL agent. Our key insight is
407 that learning an object-model that incorporates object-attention into forward
408 prediction provides a dense learning signal for unsupervised representation
409 learning of both objects and their relationships. This, in turn, enables faster
410 policy learning for an object-centric relational RL agent. We demonstrate our
411 agent by introducing a set of challenging object-interaction tasks in the
412 AI2Thor environment where learning with our attentive object-model is key to
413 strong performance. Specifically, we compare our agent and relational RL agents
414 with alternative auxiliary tasks to a relational RL agent equipped with
415 ground-truth object-information, and show that learning with our object-model
416 best closes the performance gap in terms of both learning speed and maximum
417 success rate. Additionally, we find that incorporating object-attention into an
418 object-model's forward predictions is key to learning representations which
419 capture object-category and object-state.
420 </p>
421 </description>
422 </item>
423 <item>
424 <title>A fast and scalable computational framework for large-scale and high-dimensional Bayesian optimal experimental design. (arXiv:2010.15196v1 [math.NA])</title>
425 <link>http://fr.arxiv.org/abs/2010.15196</link>
426 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Wu_K/0/1/0/all/0/1">Keyi Wu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Chen_P/0/1/0/all/0/1">Peng Chen</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ghattas_O/0/1/0/all/0/1">Omar Ghattas</a></p>
427
428 <p>We develop a fast and scalable computational framework to solve large-scale
429 and high-dimensional Bayesian optimal experimental design problems. In
430 particular, we consider the problem of optimal observation sensor placement for
431 Bayesian inference of high-dimensional parameters governed by partial
432 differential equations (PDEs), which is formulated as an optimization problem
433 that seeks to maximize an expected information gain (EIG). Such optimization
434 problems are particularly challenging due to the curse of dimensionality for
435 high-dimensional parameters and the expensive solution of large-scale PDEs. To
436 address these challenges, we exploit two essential properties of such problems:
437 the low-rank structure of the Jacobian of the parameter-to-observable map to
438 extract the intrinsically low-dimensional data-informed subspace, and the high
439 correlation of the approximate EIGs by a series of approximations to reduce the
440 number of PDE solves. We propose an efficient offline-online decomposition for
441 the optimization problem: an offline stage of computing all the quantities that
442 require a limited number of PDE solves independent of parameter and data
443 dimensions, and an online stage of optimizing sensor placement that does not
444 require any PDE solve. For the online optimization, we propose a swapping
445 greedy algorithm that first construct an initial set of sensors using leverage
446 scores and then swap the chosen sensors with other candidates until certain
447 convergence criteria are met. We demonstrate the efficiency and scalability of
448 the proposed computational framework by a linear inverse problem of inferring
449 the initial condition for an advection-diffusion equation, and a nonlinear
450 inverse problem of inferring the diffusion coefficient of a log-normal
451 diffusion equation, with both the parameter and data dimensions ranging from a
452 few tens to a few thousands.
453 </p>
454 </description>
455 </item>
456 <item>
457 <title>Forecasting Hamiltonian dynamics without canonical coordinates. (arXiv:2010.15201v1 [cs.LG])</title>
458 <link>http://fr.arxiv.org/abs/2010.15201</link>
459 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Choudhary_A/0/1/0/all/0/1">Anshul Choudhary</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lindner_J/0/1/0/all/0/1">John F. Lindner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Holliday_E/0/1/0/all/0/1">Elliott G. Holliday</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miller_S/0/1/0/all/0/1">Scott T. Miller</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sinha_S/0/1/0/all/0/1">Sudeshna Sinha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ditto_W/0/1/0/all/0/1">William L. Ditto</a></p>
460
461 <p>Conventional neural networks are universal function approximators, but
462 because they are unaware of underlying symmetries or physical laws, they may
463 need impractically many training data to approximate nonlinear dynamics.
464 Recently introduced Hamiltonian neural networks can efficiently learn and
465 forecast dynamical systems that conserve energy, but they require special
466 inputs called canonical coordinates, which may be hard to infer from data. Here
467 we significantly expand the scope of such networks by demonstrating a simple
468 way to train them with any set of generalised coordinates, including easily
469 observable ones.
470 </p>
471 </description>
472 </item>
473 <item>
474 <title>Micromobility in Smart Cities: A Closer Look at Shared Dockless E-Scooters via Big Social Data. (arXiv:2010.15203v1 [cs.SI])</title>
475 <link>http://fr.arxiv.org/abs/2010.15203</link>
476 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_Y/0/1/0/all/0/1">Yunhe Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhong_D/0/1/0/all/0/1">Dong Zhong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sun_P/0/1/0/all/0/1">Peng Sun</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zheng_W/0/1/0/all/0/1">Weijian Zheng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_Q/0/1/0/all/0/1">Qinglei Cao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Luo_X/0/1/0/all/0/1">Xi Luo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lu_Z/0/1/0/all/0/1">Zheng Lu</a></p>
477
478 <p>The micromobility is shaping first- and last-mile travels in urban areas.
479 Recently, shared dockless electric scooters (e-scooters) have emerged as a
480 daily alternative to driving for short-distance commuters in large cities due
481 to the affordability, easy accessibility via an app, and zero emissions.
482 Meanwhile, e-scooters come with challenges in city management, such as traffic
483 rules, public safety, parking regulations, and liability issues. In this paper,
484 we collected and investigated 5.8 million scooter-tagged tweets and 144,197
485 images, generated by 2.7 million users from October 2018 to March 2020, to take
486 a closer look at shared e-scooters via crowdsourcing data analytics. We
487 profiled e-scooter usages from spatial-temporal perspectives, explored
488 different business roles (i.e., riders, gig workers, and ridesharing
489 companies), examined operation patterns (e.g., injury types, and parking
490 behaviors), and conducted sentiment analysis. To our best knowledge, this paper
491 is the first large-scale systematic study on shared e-scooters using big social
492 data.
493 </p>
494 </description>
495 </item>
496 <item>
497 <title>Rosella: A Self-Driving Distributed Scheduler for Heterogeneous Clusters. (arXiv:2010.15206v1 [cs.DC])</title>
498 <link>http://fr.arxiv.org/abs/2010.15206</link>
499 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Q/0/1/0/all/0/1">Qiong Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Manandhar_S/0/1/0/all/0/1">Sunil Manandhar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Z/0/1/0/all/0/1">Zhenming Liu</a></p>
500
501 <p>Large-scale interactive web services and advanced AI applications make
502 sophisticated decisions in real-time, based on executing a massive amount of
503 computation tasks on thousands of servers. Task schedulers, which often operate
504 in heterogeneous and volatile environments, require high throughput, i.e.,
505 scheduling millions of tasks per second, and low latency, i.e., incurring
506 minimal scheduling delays for millisecond-level tasks. Scheduling is further
507 complicated by other users' workloads in a shared system, other background
508 activities, and the diverse hardware configurations inside datacenters.
509 </p>
510 <p>We present Rosella, a new self-driving, distributed approach for task
511 scheduling in heterogeneous clusters. Our system automatically learns the
512 compute environment and adjust its scheduling policy in real-time. The solution
513 provides high throughput and low latency simultaneously, because it runs in
514 parallel on multiple machines with minimum coordination and only performs
515 simple operations for each scheduling decision. Our learning module monitors
516 total system load, and uses the information to dynamically determine optimal
517 estimation strategy for the backends' compute-power. Our scheduling policy
518 generalizes power-of-two-choice algorithms to handle heterogeneous workers,
519 reducing the max queue length of $O(\log n)$ obtained by prior algorithms to
520 $O(\log \log n)$. We implement a Rosella prototype and evaluate it with a
521 variety of workloads. Experimental results show that Rosella significantly
522 reduces task response times, and adapts to environment changes quickly.
523 </p>
524 </description>
525 </item>
526 <item>
527 <title>Ground Roll Suppression using Convolutional Neural Networks. (arXiv:2010.15209v1 [eess.IV])</title>
528 <link>http://fr.arxiv.org/abs/2010.15209</link>
529 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Oliveira_D/0/1/0/all/0/1">Dario Augusto Borges Oliveira</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Semin_D/0/1/0/all/0/1">Daniil Semin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zaytsev_S/0/1/0/all/0/1">Semen Zaytsev</a></p>
530
531 <p>Seismic data processing plays a major role in seismic exploration as it
532 conditions much of the seismic interpretation performance. In this context,
533 generating reliable post-stack seismic data depends also on disposing of an
534 efficient pre-stack noise attenuation tool. Here we tackle ground roll noise,
535 one of the most challenging and common noises observed in pre-stack seismic
536 data. Since ground roll is characterized by relative low frequencies and high
537 amplitudes, most commonly used approaches for its suppression are based on
538 frequency-amplitude filters for ground roll characteristic bands. However, when
539 signal and noise share the same frequency ranges, these methods usually deliver
540 also signal suppression or residual noise. In this paper we take advantage of
541 the highly non-linear features of convolutional neural networks, and propose to
542 use different architectures to detect ground roll in shot gathers and
543 ultimately to suppress them using conditional generative adversarial networks.
544 Additionally, we propose metrics to evaluate ground roll suppression, and
545 report strong results compared to expert filtering. Finally, we discuss
546 generalization of trained models for similar and different geologies to better
547 understand the feasibility of our proposal in real applications.
548 </p>
549 </description>
550 </item>
551 <item>
552 <title>On Linearizability and the Termination of Randomized Algorithms. (arXiv:2010.15210v1 [cs.DC])</title>
553 <link>http://fr.arxiv.org/abs/2010.15210</link>
554 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hadzilacos_V/0/1/0/all/0/1">Vassos Hadzilacos</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_X/0/1/0/all/0/1">Xing Hu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Toueg_S/0/1/0/all/0/1">Sam Toueg</a></p>
555
556 <p>We study the question of whether the "termination with probability 1"
557 property of a randomized algorithm is preserved when one replaces the atomic
558 registers that the algorithm uses with linearizable (implementations of)
559 registers. We show that in general this is not so: roughly speaking, every
560 randomized algorithm A has a corresponding algorithm A' that solves the same
561 problem if the registers that it uses are atomic or strongly-linearizable, but
562 does not terminate if these registers are replaced with "merely" linearizable
563 ones. Together with a previous result shown in [15], this implies that one
564 cannot use the well-known ABD implementation of registers in message-passing
565 systems to automatically transform any randomized algorithm that works in
566 shared-memory systems into a randomized algorithm that works in message-passing
567 systems: with a strong adversary the resulting algorithm may not terminate.
568 </p>
569 </description>
570 </item>
571 <item>
572 <title>Safety-Aware Cascade Controller Tuning Using Constrained Bayesian Optimization. (arXiv:2010.15211v1 [eess.SY])</title>
573 <link>http://fr.arxiv.org/abs/2010.15211</link>
574 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Konig_C/0/1/0/all/0/1">Christopher K&#xf6;nig</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Khosravi_M/0/1/0/all/0/1">Mohammad Khosravi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Maier_M/0/1/0/all/0/1">Markus Maier</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Smith_R/0/1/0/all/0/1">Roy S. Smith</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Rupenyan_A/0/1/0/all/0/1">Alisa Rupenyan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lygeros_J/0/1/0/all/0/1">John Lygeros</a></p>
575
576 <p>This paper presents an automated, model-free, data-driven method for the safe
577 tuning of PID cascade controller gains based on Bayesian optimization. The
578 optimization objective is composed of data-driven performance metrics and
579 modeled using Gaussian processes. We further introduce a data-driven constraint
580 that captures the stability requirements from system data. Numerical evaluation
581 shows that the proposed approach outperforms relay feedback autotuning and
582 quickly converges to the global optimum, thanks to a tailored stopping
583 criterion. We demonstrate the performance of the method in simulations and
584 experiments on a linear axis drive of a grinding machine. For experimental
585 implementation, in addition to the introduced safety constraint, we integrate a
586 method for automatic detection of the critical gains and extend the
587 optimization objective with a penalty depending on the proximity of the current
588 candidate points to the critical gains. The resulting automated tuning method
589 optimizes system performance while ensuring stability and standardization.
590 </p>
591 </description>
592 </item>
593 <item>
594 <title>Away from Trolley Problems and Toward Risk Management. (arXiv:2010.15217v1 [cs.CY])</title>
595 <link>http://fr.arxiv.org/abs/2010.15217</link>
596 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goodall_N/0/1/0/all/0/1">Noah J. Goodall</a></p>
597
598 <p>As automated vehicles receive more attention from the media, there has been
599 an equivalent increase in the coverage of the ethical choices a vehicle may be
600 forced to make in certain crash situations with no clear safe outcome. Much of
601 this coverage has focused on a philosophical thought experiment known as the
602 "trolley problem," and substituting an automated vehicle for the trolley and
603 the car's software for the bystander. While this is a stark and straightforward
604 example of ethical decision making for an automated vehicle, it risks
605 marginalizing the entire field if it is to become the only ethical problem in
606 the public's mind. In this chapter, I discuss the shortcomings of the trolley
607 problem, and introduce more nuanced examples that involve crash risk and
608 uncertainty. Risk management is introduced as an alternative approach, and its
609 ethical dimensions are discussed.
610 </p>
611 </description>
612 </item>
613 <item>
614 <title>StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems. (arXiv:2010.15218v1 [cs.DC])</title>
615 <link>http://fr.arxiv.org/abs/2010.15218</link>
616 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Licht_J/0/1/0/all/0/1">Johannes de Fine Licht</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kuster_A/0/1/0/all/0/1">Andreas Kuster</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Matteis_T/0/1/0/all/0/1">Tiziano De Matteis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ben_Nun_T/0/1/0/all/0/1">Tal Ben-Nun</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hofer_D/0/1/0/all/0/1">Dominic Hofer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hoefler_T/0/1/0/all/0/1">Torsten Hoefler</a></p>
617
618 <p>Spatial computing devices have been shown to significantly accelerate stencil
619 computations, but have so far relied on unrolling the iterative dimension of a
620 single stencil operation to increase temporal locality. This work considers the
621 general case of mapping directed acyclic graphs of heterogeneous stencil
622 computations to spatial computing systems, assuming large input programs
623 without an iterative component. StencilFlow maximizes temporal locality and
624 ensures deadlock freedom in this setting, providing end-to-end analysis and
625 mapping from a high-level program description to distributed hardware. We
626 evaluate the generated architectures on an FPGA testbed, demonstrating the
627 highest single-device and multi-device performance recorded for stencil
628 programs on FPGAs to date, then leverage the framework to study a complex
629 stencil program from a production weather simulation application. Our work
630 enables productively targeting distributed spatial computing systems with large
631 stencil programs, and offers insight into architecture characteristics required
632 for their efficient execution in practice.
633 </p>
634 </description>
635 </item>
636 <item>
637 <title>Geometric Sampling of Networks. (arXiv:2010.15221v1 [math.DG])</title>
638 <link>http://fr.arxiv.org/abs/2010.15221</link>
639 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Barkanass_V/0/1/0/all/0/1">Vladislav Barkanass</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Jost_J/0/1/0/all/0/1">J&#xfc;rgen Jost</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Saucan_E/0/1/0/all/0/1">Emil Saucan</a></p>
640
641 <p>Motivated by the methods and results of manifold sampling based on Ricci
642 curvature, we propose a similar approach for networks. To this end we make
643 appeal to three types of discrete curvature, namely the graph Forman-, full
644 Forman- and Haantjes-Ricci curvatures for edge-based and node-based sampling.
645 We present the results of experiments on real life networks, as well as for
646 square grids arising in Image Processing. Moreover, we consider fitting Ricci
647 flows and we employ them for the detection of networks' backbone. We also
648 develop embedding kernels related to the Forman-Ricci curvatures and employ
649 them for the detection of the coarse structure of networks, as well as for
650 network visualization with applications to SVM. The relation between the Ricci
651 curvature of the original manifold and that of a Ricci curvature driven
652 discretization is also studied.
653 </p>
654 </description>
655 </item>
656 <item>
657 <title>Exploring complex networks with the ICON R package. (arXiv:2010.15222v1 [cs.SI])</title>
658 <link>http://fr.arxiv.org/abs/2010.15222</link>
659 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wadhwa_R/0/1/0/all/0/1">Raoul R. Wadhwa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Scott_J/0/1/0/all/0/1">Jacob G. Scott</a></p>
660
661 <p>We introduce ICON, an R package that contains 1075 complex network datasets
662 in a standard edgelist format. All provided datasets have associated citations
663 and have been indexed by the Colorado Index of Complex Networks - also referred
664 to as ICON. In addition to supplying a large and diverse corpus of useful
665 real-world networks, ICON also implements an S3 generic to work with the
666 network and ggnetwork R packages for network analysis and visualization,
667 respectively. Sample code in this report also demonstrates how ICON can be used
668 in conjunction with the igraph package. Currently, the Comprehensive R Archive
669 Network hosts ICON v0.4.0. We hope that ICON will serve as a standard corpus
670 for complex network research and prevent redundant work that would be otherwise
671 necessary by individual research groups. The open source code for ICON and for
672 this reproducible report can be found at https://github.com/rrrlw/ICON.
673 </p>
674 </description>
675 </item>
676 <item>
677 <title>A Visuospatial Dataset for Naturalistic Verb Learning. (arXiv:2010.15225v1 [cs.CL])</title>
678 <link>http://fr.arxiv.org/abs/2010.15225</link>
679 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ebert_D/0/1/0/all/0/1">Dylan Ebert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pavlick_E/0/1/0/all/0/1">Ellie Pavlick</a></p>
680
681 <p>We introduce a new dataset for training and evaluating grounded language
682 models. Our data is collected within a virtual reality environment and is
683 designed to emulate the quality of language data to which a pre-verbal child is
684 likely to have access: That is, naturalistic, spontaneous speech paired with
685 richly grounded visuospatial context. We use the collected data to compare
686 several distributional semantics models for verb learning. We evaluate neural
687 models based on 2D (pixel) features as well as feature-engineered models based
688 on 3D (symbolic, spatial) features, and show that neither modeling approach
689 achieves satisfactory performance. Our results are consistent with evidence
690 from child language acquisition that emphasizes the difficulty of learning
691 verbs from naive distributional data. We discuss avenues for future work on
692 cognitively-inspired grounded language learning, and release our corpus with
693 the intent of facilitating research on the topic.
694 </p>
695 </description>
696 </item>
697 <item>
698 <title>Speech-Based Emotion Recognition using Neural Networks and Information Visualization. (arXiv:2010.15229v1 [cs.HC])</title>
699 <link>http://fr.arxiv.org/abs/2010.15229</link>
700 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Almahmoud_J/0/1/0/all/0/1">Jumana Almahmoud</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kikkeri_K/0/1/0/all/0/1">Kruthika Kikkeri</a></p>
701
702 <p>Emotions recognition is commonly employed for health assessment. However, the
703 typical metric for evaluation in therapy is based on patient-doctor appraisal.
704 This process can fall into the issue of subjectivity, while also requiring
705 healthcare professionals to deal with copious amounts of information. Thus,
706 machine learning algorithms can be a useful tool for the classification of
707 emotions. While several models have been developed in this domain, there is a
708 lack of userfriendly representations of the emotion classification systems for
709 therapy. We propose a tool which enables users to take speech samples and
710 identify a range of emotions (happy, sad, angry, surprised, neutral, clam,
711 disgust, and fear) from audio elements through a machine learning model. The
712 dashboard is designed based on local therapists' needs for intuitive
713 representations of speech data in order to gain insights and informative
714 analyses of their sessions with their patients.
715 </p>
716 </description>
717 </item>
718 <item>
719 <title>Construction Payment Automation Using Blockchain-Enabled Smart Contracts and Reality Capture Technologies. (arXiv:2010.15232v1 [cs.CR])</title>
720 <link>http://fr.arxiv.org/abs/2010.15232</link>
721 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hamledari_H/0/1/0/all/0/1">Hesam Hamledari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fischer_M/0/1/0/all/0/1">Martin Fischer</a></p>
722
723 <p>This paper presents a smart contract-based solution for autonomous
724 administration of construction progress payments. It bridges the gap between
725 payments (cash flow) and the progress assessments at job sites (product flow)
726 enabled by reality capture technologies and building information modeling
727 (BIM). The approach eliminates the reliance on the centralized and heavily
728 intermediated mechanisms of existing payment applications. The construction
729 progress is stored in a distributed manner using content addressable file
730 sharing; it is broadcasted to a smart contract which automates the on-chain
731 payment settlements and the transfer of lien rights. The method was
732 successfully used for processing payments to 7 subcontractors in two commercial
733 construction projects where progress monitoring was performed using a
734 camera-equipped unmanned aerial vehicle (UAV) and an unmanned ground vehicle
735 (UGV) equipped with a laser scanner. The results show promise for the method's
736 potential for increasing the frequency, granularity, and transparency of
737 payments. The paper is concluded with a discussion of implications for project
738 management, introducing a new model of project as a singleton state machine.
739 </p>
740 </description>
741 </item>
742 <item>
743 <title>Accurate Prostate Cancer Detection and Segmentation on Biparametric MRI using Non-local Mask R-CNN with Histopathological Ground Truth. (arXiv:2010.15233v1 [eess.IV])</title>
744 <link>http://fr.arxiv.org/abs/2010.15233</link>
745 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Dai_Z/0/1/0/all/0/1">Zhenzhen Dai</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jambor_I/0/1/0/all/0/1">Ivan Jambor</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Taimen_P/0/1/0/all/0/1">Pekka Taimen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pantelic_M/0/1/0/all/0/1">Milan Pantelic</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Elshaikh_M/0/1/0/all/0/1">Mohamed Elshaikh</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Rogers_C/0/1/0/all/0/1">Craig Rogers</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ettala_O/0/1/0/all/0/1">Otto Ettala</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bostrom_P/0/1/0/all/0/1">Peter Bostr&#xf6;m</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Aronen_H/0/1/0/all/0/1">Hannu Aronen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Merisaari_H/0/1/0/all/0/1">Harri Merisaari</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wen_N/0/1/0/all/0/1">Ning Wen</a></p>
746
747 <p>Purpose: We aimed to develop deep machine learning (DL) models to improve the
748 detection and segmentation of intraprostatic lesions (IL) on bp-MRI by using
749 whole amount prostatectomy specimen-based delineations. We also aimed to
750 investigate whether transfer learning and self-training would improve results
751 with small amount labelled data.
752 </p>
753 <p>Methods: 158 patients had suspicious lesions delineated on MRI based on
754 bp-MRI, 64 patients had ILs delineated on MRI based on whole mount
755 prostatectomy specimen sections, 40 patients were unlabelled. A non-local Mask
756 R-CNN was proposed to improve the segmentation accuracy. Transfer learning was
757 investigated by fine-tuning a model trained using MRI-based delineations with
758 prostatectomy-based delineations. Two label selection strategies were
759 investigated in self-training. The performance of models was evaluated by 3D
760 detection rate, dice similarity coefficient (DSC), 95 percentile Hausdrauff (95
761 HD, mm) and true positive ratio (TPR).
762 </p>
763 <p>Results: With prostatectomy-based delineations, the non-local Mask R-CNN with
764 fine-tuning and self-training significantly improved all evaluation metrics.
765 For the model with the highest detection rate and DSC, 80.5% (33/41) of lesions
766 in all Gleason Grade Groups (GGG) were detected with DSC of 0.548[0.165], 95 HD
767 of 5.72[3.17] and TPR of 0.613[0.193]. Among them, 94.7% (18/19) of lesions
768 with GGG &gt; 2 were detected with DSC of 0.604[0.135], 95 HD of 6.26[3.44] and
769 TPR of 0.580[0.190].
770 </p>
771 <p>Conclusion: DL models can achieve high prostate cancer detection and
772 segmentation accuracy on bp-MRI based on annotations from histologic images. To
773 further improve the performance, more data with annotations of both MRI and
774 whole amount prostatectomy specimens are required.
775 </p>
776 </description>
777 </item>
778 <item>
779 <title>Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions. (arXiv:2010.15234v1 [cs.LG])</title>
780 <link>http://fr.arxiv.org/abs/2010.15234</link>
781 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ahuja_K/0/1/0/all/0/1">Kartik Ahuja</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shanmugam_K/0/1/0/all/0/1">Karthikeyan Shanmugam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dhurandhar_A/0/1/0/all/0/1">Amit Dhurandhar</a></p>
782
783 <p>Recently, invariant risk minimization (IRM) (Arjovsky et al.) was proposed as
784 a promising solution to address out-of-distribution (OOD) generalization. In
785 Ahuja et al., it was shown that solving for the Nash equilibria of a new class
786 of "ensemble-games" is equivalent to solving IRM. In this work, we extend the
787 framework in Ahuja et al. for linear regressions by projecting the
788 ensemble-game on an $\ell_{\infty}$ ball. We show that such projections help
789 achieve non-trivial OOD guarantees despite not achieving perfect invariance.
790 For linear models with confounders, we prove that Nash equilibria of these
791 games are closer to the ideal OOD solutions than the standard empirical risk
792 minimization (ERM) and we also provide learning algorithms that provably
793 converge to these Nash Equilibria. Empirical comparisons of the proposed
794 approach with the state-of-the-art show consistent gains in achieving OOD
795 solutions in several settings involving anti-causal variables and confounders.
796 </p>
797 </description>
798 </item>
799 <item>
800 <title>SD-Access: Practical Experiences in Designing and Deploying Software Defined Enterprise Networks. (arXiv:2010.15236v1 [cs.NI])</title>
801 <link>http://fr.arxiv.org/abs/2010.15236</link>
802 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Paillisse_J/0/1/0/all/0/1">Jordi Paillisse</a> (1 and 2), <a href="http://fr.arxiv.org/find/cs/1/au:+Portoles_M/0/1/0/all/0/1">Marc Portoles</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Lopez_A/0/1/0/all/0/1">Albert Lopez</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Rodriguez_Natal_A/0/1/0/all/0/1">Alberto Rodriguez-Natal</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Iacobacci_D/0/1/0/all/0/1">David Iacobacci</a> (3), <a href="http://fr.arxiv.org/find/cs/1/au:+Leong_J/0/1/0/all/0/1">Johnson Leong</a> (4), <a href="http://fr.arxiv.org/find/cs/1/au:+Moreno_V/0/1/0/all/0/1">Victor Moreno</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Cabellos_A/0/1/0/all/0/1">Albert Cabellos</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Maino_F/0/1/0/all/0/1">Fabio Maino</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Hooda_S/0/1/0/all/0/1">Sanjay Hooda</a> (2) ((1) UPC-BarcelonaTech, Barcelona, Spain, (2) Cisco, San Jose, USA, (3) BMP LLP, (4) Uber Technologies Inc., San Francisco, USA)</p>
803
804 <p>Enterprise Networks, over the years, have become more and more complex trying
805 to keep up with new requirements that challenge traditional solutions. Just to
806 mention one out of many possible examples, technologies such as Virtual LANs
807 (VLANs) struggle to address the scalability and operational requirements
808 introduced by Internet of Things (IoT) use cases. To keep up with these
809 challenges we have identified four main requirements that are common across
810 modern enterprise networks: (i) scalable mobility, (ii) endpoint segmentation,
811 (iii) simplified administration, and (iv) resource optimization. To address
812 these challenges we designed SDA (Software Defined Access), a solution for
813 modern enterprise networks that leverages Software-Defined Networking (SDN) and
814 other state of the art techniques. In this paper we present the design,
815 implementation and evaluation of SDA. Specifically, SDA: (i) leverages a
816 combination of an overlay approach with an event-driven protocol (LISP) to
817 dynamically adapt to traffic and mobility patterns while preserving resources,
818 and (ii) enforces dynamic endpoint groups for scalable segmentation with low
819 operational burden. We present our experience with deploying SDA in two
820 real-life scenarios: an enterprise campus, and a large warehouse with mobile
821 robots. Our evaluation shows that SDA, when compared with traditional
822 enterprise networks, can (i) reduce overall data plane forwarding state up to
823 70% thanks to a reactive protocol using a centralized routing server, and (ii)
824 reduce by an order of magnitude the handover delays in scenarios of massive
825 mobility with respect to other approaches. Finally, we discuss lessons learned
826 while deploying and operating SDA, and possible optimizations regarding the use
827 of an event-driven protocol and group-based segmentation.
828 </p>
829 </description>
830 </item>
831 <item>
832 <title>Bandit Policies for Reliable Cellular Network Handovers in Extreme Mobility. (arXiv:2010.15237v1 [cs.LG])</title>
833 <link>http://fr.arxiv.org/abs/2010.15237</link>
834 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yuanjie Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Datta_E/0/1/0/all/0/1">Esha Datta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_J/0/1/0/all/0/1">Jiaxin Ding</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shroff_N/0/1/0/all/0/1">Ness Shroff</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_X/0/1/0/all/0/1">Xin Liu</a></p>
835
836 <p>The demand for seamless Internet access under extreme user mobility, such as
837 on high-speed trains and vehicles, has become a norm rather than an exception.
838 However, the 4G/5G mobile network is not always reliable to meet this demand,
839 with non-negligible failures during the handover between base stations. A
840 fundamental challenge of reliability is to balance the exploration of more
841 measurements for satisfactory handover, and exploitation for timely handover
842 (before the fast-moving user leaves the serving base station's radio coverage).
843 This paper formulates this trade-off in extreme mobility as a composition of
844 two distinct multi-armed bandit problems. We propose Bandit and Threshold
845 Tuning (BATT) to minimize the regret of handover failures in extreme mobility.
846 BATT uses $\epsilon$-binary-search to optimize the threshold of the serving
847 cell's signal strength to initiate the handover procedure with
848 $\mathcal{O}(\log J \log T)$ regret.It further devises opportunistic Thompson
849 sampling, which optimizes the sequence of the target cells to measure for
850 reliable handover with $\mathcal{O}(\log T)$ regret.Our experiment over a real
851 LTE dataset from Chinese high-speed rails validates significant regret
852 reduction and a 29.1% handover failure reduction.
853 </p>
854 </description>
855 </item>
856 <item>
857 <title>Cloud-Based Dynamic Programming for an Electric City Bus Energy Management Considering Real-Time Passenger Load Prediction. (arXiv:2010.15239v1 [eess.SY])</title>
858 <link>http://fr.arxiv.org/abs/2010.15239</link>
859 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Shi_J/0/1/0/all/0/1">Junzhe Shi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_B/0/1/0/all/0/1">Bin Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhou_X/0/1/0/all/0/1">Xingyu Zhou</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Hou_J/0/1/0/all/0/1">Jun Hou</a></p>
860
861 <p>Electric city bus gains popularity in recent years for its low greenhouse gas
862 emission, low noise level, etc. Different from a passenger car, the weight of a
863 city bus varies significantly with different amounts of onboard passengers,
864 which is not well studied in existing literature. This study proposes a
865 passenger load prediction model using day-of-week, time-of-day, weather,
866 temperatures, wind levels, and holiday information as inputs. The average
867 model, Regression Tree, Gradient Boost Decision Tree, and Neural Networks
868 models are compared in the passenger load prediction. The Gradient Boost
869 Decision Tree model is selected due to its best accuracy and high stability.
870 Given the predicted passenger load, dynamic programming algorithm determines
871 the optimal power demand for supercapacitor and battery by optimizing the
872 battery aging and energy usage in the cloud. Then rule extraction is conducted
873 on dynamic programming results, and the rule is real-time loaded to onboard
874 controllers of vehicles. The proposed cloud-based dynamic programming and rule
875 extraction framework with the passenger load prediction shows 4% and 11% fewer
876 bus operating costs in off-peak and peak hours, respectively. The operating
877 cost by the proposed framework is less than 1% shy of the dynamic programming
878 with the true passenger load information.
879 </p>
880 </description>
881 </item>
882 <item>
883 <title>Test Set Optimization by Machine Learning Algorithms. (arXiv:2010.15240v1 [cs.LG])</title>
884 <link>http://fr.arxiv.org/abs/2010.15240</link>
885 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_K/0/1/0/all/0/1">Kaiming Fu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jin_Y/0/1/0/all/0/1">Yulu Jin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Z/0/1/0/all/0/1">Zhousheng Chen</a></p>
886
887 <p>Diagnosis results are highly dependent on the volume of test set. To derive
888 the most efficient test set, we propose several machine learning based methods
889 to predict the minimum amount of test data that produces relatively accurate
890 diagnosis. By collecting outputs from failing circuits, the feature matrix and
891 label vector are generated, which involves the inference information of the
892 test termination point. Thus we develop a prediction model to fit the data and
893 determine when to terminate testing. The considered methods include LASSO and
894 Support Vector Machine(SVM) where the relationship between goals(label) and
895 predictors(feature matrix) are considered to be linear in LASSO and nonlinear
896 in SVM. Numerical results show that SVM reaches a diagnosis accuracy of 90.4%
897 while deducting the volume of test set by 35.24%.
898 </p>
899 </description>
900 </item>
901 <item>
902 <title>A marine radioisotope gamma-ray spectrum analysis method based on Monte Carlo simulation and MLP neural network. (arXiv:2010.15245v1 [physics.ins-det])</title>
903 <link>http://fr.arxiv.org/abs/2010.15245</link>
904 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Dai_W/0/1/0/all/0/1">Wenhan Dai</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Zeng_Z/0/1/0/all/0/1">Zhi Zeng</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Dou_D/0/1/0/all/0/1">Daowei Dou</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Ma_H/0/1/0/all/0/1">Hao Ma</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Chen_J/0/1/0/all/0/1">Jianping Chen</a> (1 and 2), <a href="http://fr.arxiv.org/find/physics/1/au:+Li_J/0/1/0/all/0/1">Junli Li</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Zhang_H/0/1/0/all/0/1">Hui Zhang</a> (1) ((1) Department of Engineering Physics, Tsinghua University, Beijing, China, (2) College of Nuclear Science and Technology, Beijing Normal University, Beijing, China)</p>
905
906 <p>A multilayer perceptron (MLP) neural network is built to analyze the Cs-137
907 concentration in seawater via gamma-ray spectrums measured by a LaBr3 detector.
908 The MLP is trained and tested by a large data set generated by combining
909 measured and Monte Carlo simulated spectrums under the assumption that all the
910 measured spectrums have 0 Cs-137 concentration. And the performance of MLP is
911 evaluated and compared with the traditional net-peak area method. The results
912 show an improvement of 7% in accuracy and 0.036 in the ROC-curve area compared
913 to those of the net peak area method. And the influence of the assumption of
914 Cs-137 concentration in the training data set on the classifying performance of
915 MLP is evaluated.
916 </p>
917 </description>
918 </item>
919 <item>
920 <title>Semantic video segmentation for autonomous driving. (arXiv:2010.15250v1 [cs.CV])</title>
921 <link>http://fr.arxiv.org/abs/2010.15250</link>
922 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chau_M/0/1/0/all/0/1">Minh Triet Chau</a></p>
923
924 <p>We aim to solve semantic video segmentation in autonomous driving, namely
925 road detection in real time video, using techniques discussed in (Shelhamer et
926 al., 2016a). While fully convolutional network gives good result, we show that
927 the speed can be halved while preserving the accuracy. The test dataset being
928 used is KITTI, which consists of real footage from Germany's streets.
929 </p>
930 </description>
931 </item>
932 <item>
933 <title>Fusion Models for Improved Visual Captioning. (arXiv:2010.15251v1 [cs.CV])</title>
934 <link>http://fr.arxiv.org/abs/2010.15251</link>
935 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kalimuthu_M/0/1/0/all/0/1">Marimuthu Kalimuthu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mogadala_A/0/1/0/all/0/1">Aditya Mogadala</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mosbach_M/0/1/0/all/0/1">Marius Mosbach</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Klakow_D/0/1/0/all/0/1">Dietrich Klakow</a></p>
936
937 <p>Visual captioning aims to generate textual descriptions given images.
938 Traditionally, the captioning models are trained on human annotated datasets
939 such as Flickr30k and MS-COCO, which are limited in size and diversity. This
940 limitation hinders the generalization capabilities of these models while also
941 rendering them to often make mistakes. Language models can, however, be trained
942 on vast amounts of freely available unlabelled data and have recently emerged
943 as successful language encoders and coherent text generators. Meanwhile,
944 several unimodal and multimodal fusion techniques have been proven to work well
945 for natural language generation and automatic speech recognition. Building on
946 these recent developments, and with an aim of improving the quality of
947 generated captions, the contribution of our work in this paper is two-fold:
948 First, we propose a generic multimodal model fusion framework for caption
949 generation as well as emendation where we utilize different fusion strategies
950 to integrate a pretrained Auxiliary Language Model (AuxLM) within the
951 traditional encoder-decoder visual captioning frameworks. Next, we employ the
952 same fusion strategies to integrate a pretrained Masked Language Model (MLM),
953 namely BERT, with a visual captioning model, viz. Show, Attend, and Tell, for
954 emending both syntactic and semantic errors in captions. Our caption emendation
955 experiments on three benchmark image captioning datasets, viz. Flickr8k,
956 Flickr30k, and MSCOCO, show improvements over the baseline, indicating the
957 usefulness of our proposed multimodal fusion strategies. Further, we perform a
958 preliminary qualitative analysis on the emended captions and identify error
959 categories based on the type of corrections.
960 </p>
961 </description>
962 </item>
963 <item>
964 <title>Model Minimization For Online Predictability. (arXiv:2010.15255v1 [cs.AI])</title>
965 <link>http://fr.arxiv.org/abs/2010.15255</link>
966 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gopalakrishnan_S/0/1/0/all/0/1">Sriram Gopalakrishnan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kambhampati_S/0/1/0/all/0/1">Subbarao Kambhampati</a></p>
967
968 <p>For humans in a teaming scenario, context switching between reasoning about a
969 teammate's behavior and thinking about thier own task can slow us down,
970 especially if the cognitive cost of predicting the teammate's actions is high.
971 So if we can make the prediction of a robot-teammate's actions quicker, then
972 the human can be more productive. In this paper we present an approach to
973 constrain the actions of a robot so as to increase predictability (specifically
974 online predictability) while keeping the plan costs of the robot within
975 acceptable limits. Existing works on human-robot interaction do not consider
976 the computational cost for predictability, which we consider in our approach.
977 We approach this problem from the perspective of directed graph minimization,
978 and we connect the concept of predictability to the out-degree of vertices. We
979 present an algorithm to minimize graphs for predictability, and contrast this
980 with minimization for legibility (goal inference) and optimality.
981 </p>
982 </description>
983 </item>
984 <item>
985 <title>DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors. (arXiv:2010.15258v1 [cs.SD])</title>
986 <link>http://fr.arxiv.org/abs/2010.15258</link>
987 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Reddy_C/0/1/0/all/0/1">Chandan K A Reddy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gopal_V/0/1/0/all/0/1">Vishak Gopal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cutler_R/0/1/0/all/0/1">Ross Cutler</a></p>
988
989 <p>Human subjective evaluation is the gold standard to evaluate speech quality
990 optimized for human perception. Perceptual objective metrics serve as a proxy
991 for subjective scores. The conventional and widely used metrics require a
992 reference clean speech signal, which is unavailable in real recordings. The
993 no-reference approaches correlate poorly with human ratings and are not widely
994 adopted in the research community. One of the biggest use cases of these
995 perceptual objective metrics is to evaluate noise suppression algorithms. This
996 paper introduces a multi-stage self-teaching based perceptual objective metric
997 that is designed to evaluate noise suppressors. The proposed method generalizes
998 well in challenging test conditions with a high correlation to human ratings.
999 </p>
1000 </description>
1001 </item>
1002 <item>
1003 <title>Object sieving and morphological closing to reduce false detections in wide-area aerial imagery. (arXiv:2010.15260v1 [cs.CV])</title>
1004 <link>http://fr.arxiv.org/abs/2010.15260</link>
1005 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gao_X/0/1/0/all/0/1">Xin Gao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ram_S/0/1/0/all/0/1">Sundaresh Ram</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rodriguez_J/0/1/0/all/0/1">Jeffrey J. Rodriguez</a></p>
1006
1007 <p>For object detection in wide-area aerial imagery, post-processing is usually
1008 needed to reduce false detections. We propose a two-stage post-processing
1009 scheme which comprises an area-thresholding sieving process and a morphological
1010 closing operation. We use two wide-area aerial videos to compare the
1011 performance of five object detection algorithms in the absence and in the
1012 presence of our post-processing scheme. The automatic detection results are
1013 compared with the ground-truth objects. Several metrics are used for
1014 performance comparison.
1015 </p>
1016 </description>
1017 </item>
1018 <item>
1019 <title>Deep Shells: Unsupervised Shape Correspondence with Optimal Transport. (arXiv:2010.15261v1 [cs.CV])</title>
1020 <link>http://fr.arxiv.org/abs/2010.15261</link>
1021 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Eisenberger_M/0/1/0/all/0/1">Marvin Eisenberger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Toker_A/0/1/0/all/0/1">Aysim Toker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leal_Taixe_L/0/1/0/all/0/1">Laura Leal-Taix&#xe9;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cremers_D/0/1/0/all/0/1">Daniel Cremers</a></p>
1022
1023 <p>We propose a novel unsupervised learning approach to 3D shape correspondence
1024 that builds a multiscale matching pipeline into a deep neural network. This
1025 approach is based on smooth shells, the current state-of-the-art axiomatic
1026 correspondence method, which requires an a priori stochastic search over the
1027 space of initial poses. Our goal is to replace this costly preprocessing step
1028 by directly learning good initializations from the input surfaces. To that end,
1029 we systematically derive a fully differentiable, hierarchical matching pipeline
1030 from entropy regularized optimal transport. This allows us to combine it with a
1031 local feature extractor based on smooth, truncated spectral convolution
1032 filters. Finally, we show that the proposed unsupervised method significantly
1033 improves over the state-of-the-art on multiple datasets, even in comparison to
1034 the most recent supervised methods. Moreover, we demonstrate compelling
1035 generalization results by applying our learned filters to examples that
1036 significantly deviate from the training set.
1037 </p>
1038 </description>
1039 </item>
1040 <item>
1041 <title>CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models. (arXiv:2010.15266v1 [cs.CL])</title>
1042 <link>http://fr.arxiv.org/abs/2010.15266</link>
1043 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_A/0/1/0/all/0/1">Abhinav Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xia_P/0/1/0/all/0/1">Patrick Xia</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_G/0/1/0/all/0/1">Guanghui Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yarmohammadi_M/0/1/0/all/0/1">Mahsa Yarmohammadi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Durme_B/0/1/0/all/0/1">Benjamin Van Durme</a></p>
1044
1045 <p>Copy mechanisms are employed in sequence to sequence models (seq2seq) to
1046 generate reproductions of words from the input to the output. These frameworks,
1047 operating at the lexical type level, fail to provide an explicit alignment that
1048 records where each token was copied from. Further, they require contiguous
1049 token sequences from the input (spans) to be copied individually. We present a
1050 model with an explicit token-level copy operation and extend it to copying
1051 entire spans. Our model provides hard alignments between spans in the input and
1052 output, allowing for nontraditional applications of seq2seq, like information
1053 extraction. We demonstrate the approach on Nested Named Entity Recognition,
1054 achieving near state-of-the-art accuracy with an order of magnitude increase in
1055 decoding speed.
1056 </p>
1057 </description>
1058 </item>
1059 <item>
1060 <title>Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning. (arXiv:2010.15268v1 [cs.LG])</title>
1061 <link>http://fr.arxiv.org/abs/2010.15268</link>
1062 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Young_K/0/1/0/all/0/1">Kenny Young</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sutton_R/0/1/0/all/0/1">Richard S. Sutton</a></p>
1063
1064 <p>Despite empirical success, the theory of reinforcement learning (RL) with
1065 value function approximation remains fundamentally incomplete. Prior work has
1066 identified a variety of pathological behaviours that arise in RL algorithms
1067 that combine approximate on-policy evaluation and greedification. One prominent
1068 example is policy oscillation, wherein an algorithm may cycle indefinitely
1069 between policies, rather than converging to a fixed point. What is not well
1070 understood however is the quality of the policies in the region of oscillation.
1071 In this paper we present simple examples illustrating that in addition to
1072 policy oscillation and multiple fixed points -- the same basic issue can lead
1073 to convergence to the worst possible policy for a given approximation. Such
1074 behaviours can arise when algorithms optimize evaluation accuracy weighted by
1075 the distribution of states that occur under the current policy, but greedify
1076 based on the value of states which are rare or nonexistent under this
1077 distribution. This means the values used for greedification are unreliable and
1078 can steer the policy in undesirable directions. Our observation that this can
1079 lead to the worst possible policy shows that in a general sense such algorithms
1080 are unreliable. The existence of such examples helps to narrow the kind of
1081 theoretical guarantees that are possible and the kind of algorithmic ideas that
1082 are likely to be helpful. We demonstrate analytically and experimentally that
1083 such pathological behaviours can impact a wide range of RL and dynamic
1084 programming algorithms; such behaviours can arise both with and without
1085 bootstrapping, and with linear function approximation as well as with more
1086 complex parameterized functions like neural networks.
1087 </p>
1088 </description>
1089 </item>
1090 <item>
1091 <title>GloFlow: Global Image Alignment for Creation of Whole Slide Images for Pathology from Video. (arXiv:2010.15269v1 [eess.IV])</title>
1092 <link>http://fr.arxiv.org/abs/2010.15269</link>
1093 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Krishna_V/0/1/0/all/0/1">Viswesh Krishna</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Joshi_A/0/1/0/all/0/1">Anirudh Joshi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bulterys_P/0/1/0/all/0/1">Philip L. Bulterys</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Yang_E/0/1/0/all/0/1">Eric Yang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ng_A/0/1/0/all/0/1">Andrew Y. Ng</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Rajpurkar_P/0/1/0/all/0/1">Pranav Rajpurkar</a></p>
1094
1095 <p>The application of deep learning to pathology assumes the existence of
1096 digital whole slide images of pathology slides. However, slide digitization is
1097 bottlenecked by the high cost of precise motor stages in slide scanners that
1098 are needed for position information used for slide stitching. We propose
1099 GloFlow, a two-stage method for creating a whole slide image using optical
1100 flow-based image registration with global alignment using a computationally
1101 tractable graph-pruning approach. In the first stage, we train an optical flow
1102 predictor to predict pairwise translations between successive video frames to
1103 approximate a stitch. In the second stage, this approximate stitch is used to
1104 create a neighborhood graph to produce a corrected stitch. On a simulated
1105 dataset of video scans of WSIs, we find that our method outperforms known
1106 approaches to slide-stitching, and stitches WSIs resembling those produced by
1107 slide scanners.
1108 </p>
1109 </description>
1110 </item>
1111 <item>
1112 <title>A globally convergent modified Newton method for the direct minimization of the Ohta-Kawasaki energy with application to the directed self-assembly of diblock copolymers. (arXiv:2010.15271v1 [physics.comp-ph])</title>
1113 <link>http://fr.arxiv.org/abs/2010.15271</link>
1114 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Cao_L/0/1/0/all/0/1">Lianghao Cao</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Ghattas_O/0/1/0/all/0/1">Omar Ghattas</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Oden_J/0/1/0/all/0/1">J. Tinsley Oden</a></p>
1115
1116 <p>We propose a fast and robust scheme for the direct minimization of the
1117 Ohta-Kawasaki energy that characterizes the microphase separation of diblock
1118 copolymer melts. The scheme employs a globally convergent modified Newton
1119 method with line search which is shown to be mass-conservative,
1120 energy-descending, asymptotically quadratically convergent, and three orders of
1121 magnitude more efficient than the commonly-used gradient flow approach. The
1122 regularity and the first-order condition of minimizers are analyzed. A
1123 numerical study of the chemical substrate guided directed self-assembly of
1124 diblock copolymer melts, based on a novel polymer-substrate interaction model
1125 and the proposed scheme, is provided.
1126 </p>
1127 </description>
1128 </item>
1129 <item>
1130 <title>The distribution of inhibitory neurons in the C. elegans connectome facilitates self-optimization of coordinated neural activity. (arXiv:2010.15272v1 [q-bio.NC])</title>
1131 <link>http://fr.arxiv.org/abs/2010.15272</link>
1132 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-bio/1/au:+Morales_A/0/1/0/all/0/1">Alejandro Morales</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Froese_T/0/1/0/all/0/1">Tom Froese</a></p>
1133
1134 <p>The nervous system of the nematode soil worm Caenorhabditis elegans exhibits
1135 remarkable complexity despite the worm's small size. A general challenge is to
1136 better understand the relationship between neural organization and neural
1137 activity at the system level, including the functional roles of inhibitory
1138 connections. Here we implemented an abstract simulation model of the C. elegans
1139 connectome that approximates the neurotransmitter identity of each neuron, and
1140 we explored the functional role of these physiological differences for neural
1141 activity. In particular, we created a Hopfield neural network in which all of
1142 the worm's neurons characterized by inhibitory neurotransmitters are assigned
1143 inhibitory outgoing connections. Then, we created a control condition in which
1144 the same number of inhibitory connections are arbitrarily distributed across
1145 the network. A comparison of these two conditions revealed that the biological
1146 distribution of inhibitory connections facilitates the self-optimization of
1147 coordinated neural activity compared with an arbitrary distribution of
1148 inhibitory connections.
1149 </p>
1150 </description>
1151 </item>
1152 <item>
1153 <title>Representation learning for improved interpretability and classification accuracy of clinical factors from EEG. (arXiv:2010.15274v1 [cs.LG])</title>
1154 <link>http://fr.arxiv.org/abs/2010.15274</link>
1155 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Honke_G/0/1/0/all/0/1">Garrett Honke</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Higgins_I/0/1/0/all/0/1">Irina Higgins</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Thigpen_N/0/1/0/all/0/1">Nina Thigpen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miskovic_V/0/1/0/all/0/1">Vladimir Miskovic</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Link_K/0/1/0/all/0/1">Katie Link</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_P/0/1/0/all/0/1">Pramod Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Klawohn_J/0/1/0/all/0/1">Julia Klawohn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hajcak_G/0/1/0/all/0/1">Greg Hajcak</a></p>
1156
1157 <p>Despite extensive standardization, diagnostic interviews for mental health
1158 disorders encompass substantial subjective judgment. Previous studies have
1159 demonstrated that EEG-based neural measures can function as reliable objective
1160 correlates of depression, or even predictors of depression and its course.
1161 However, their clinical utility has not been fully realized because of 1) the
1162 lack of automated ways to deal with the inherent noise associated with EEG data
1163 at scale, and 2) the lack of knowledge of which aspects of the EEG signal may
1164 be markers of a clinical disorder. Here we adapt an unsupervised pipeline from
1165 the recent deep representation learning literature to address these problems by
1166 1) learning a disentangled representation using $\beta$-VAE to denoise the
1167 signal, and 2) extracting interpretable features associated with a sparse set
1168 of clinical labels using a Symbol-Concept Association Network (SCAN). We
1169 demonstrate that our method is able to outperform the canonical hand-engineered
1170 baseline classification method on a number of factors, including participant
1171 age and depression diagnosis. Furthermore, our method recovers a representation
1172 that can be used to automatically extract denoised Event Related Potentials
1173 (ERPs) from novel, single EEG trajectories, and supports fast supervised
1174 re-mapping to various clinical labels, allowing clinicians to re-use a single
1175 EEG representation regardless of updates to the standardized diagnostic system.
1176 Finally, single factors of the learned disentangled representations often
1177 correspond to meaningful markers of clinical factors, as automatically detected
1178 by SCAN, allowing for human interpretability and post-hoc expert analysis of
1179 the recommendations made by the model.
1180 </p>
1181 </description>
1182 </item>
1183 <item>
1184 <title>A direct method for solving inverse Sturm-Liouville problems. (arXiv:2010.15275v1 [math.NA])</title>
1185 <link>http://fr.arxiv.org/abs/2010.15275</link>
1186 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Kravchenko_V/0/1/0/all/0/1">Vladislav V. Kravchenko</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Torba_S/0/1/0/all/0/1">Sergii M. Torba</a></p>
1187
1188 <p>We consider two main inverse Sturm-Liouville problems: the problem of
1189 recovery of the potential and the boundary conditions from two spectra or from
1190 a spectral density function. A simple method for practical solution of such
1191 problems is developed, based on the transmutation operator approach, new
1192 Neumann series of Bessel functions representations for solutions and the
1193 Gelfand-Levitan equation. The method allows one to reduce the inverse
1194 Sturm-Liouville problem directly to a system of linear algebraic equations,
1195 such that the potential is recovered from the first element of the solution
1196 vector. We prove the stability of the method and show its numerical efficiency
1197 with several numerical examples.
1198 </p>
1199 </description>
1200 </item>
1201 <item>
1202 <title>Class-incremental learning: survey and performance evaluation. (arXiv:2010.15277v1 [cs.LG])</title>
1203 <link>http://fr.arxiv.org/abs/2010.15277</link>
1204 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Masana_M/0/1/0/all/0/1">Marc Masana</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_X/0/1/0/all/0/1">Xialei Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Twardowski_B/0/1/0/all/0/1">Bartlomiej Twardowski</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Menta_M/0/1/0/all/0/1">Mikel Menta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bagdanov_A/0/1/0/all/0/1">Andrew D. Bagdanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weijer_J/0/1/0/all/0/1">Joost van de Weijer</a></p>
1205
1206 <p>For future learning systems incremental learning is desirable, because it
1207 allows for: efficient resource usage by eliminating the need to retrain from
1208 scratch at the arrival of new data; reduced memory usage by preventing or
1209 limiting the amount of data required to be stored -- also important when
1210 privacy limitations are imposed; and learning that more closely resembles human
1211 learning. The main challenge for incremental learning is catastrophic
1212 forgetting, which refers to the precipitous drop in performance on previously
1213 learned tasks after learning a new one. Incremental learning of deep neural
1214 networks has seen explosive growth in recent years. Initial work focused on
1215 task incremental learning, where a task-ID is provided at inference time.
1216 Recently we have seen a shift towards class-incremental learning where the
1217 learner must classify at inference time between all classes seen in previous
1218 tasks without recourse to a task-ID. In this paper, we provide a complete
1219 survey of existing methods for incremental learning, and in particular we
1220 perform an extensive experimental evaluation on twelve class-incremental
1221 methods. We consider several new experimental scenarios, including a comparison
1222 of class-incremental methods on multiple large-scale datasets, investigation
1223 into small and large domain shifts, and comparison on various network
1224 architectures.
1225 </p>
1226 </description>
1227 </item>
1228 <item>
1229 <title>Specification description and verification of multitask hybrid systems in the OTS/CafeOBJ method. (arXiv:2010.15280v1 [cs.SE])</title>
1230 <link>http://fr.arxiv.org/abs/2010.15280</link>
1231 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nakamura_M/0/1/0/all/0/1">Masaki Nakamura</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sakakibara_K/0/1/0/all/0/1">Kazutoshi Sakakibara</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ogata_K/0/1/0/all/0/1">Kazuhiro Ogata</a></p>
1232
1233 <p>To develop IoT and/or CSP systems, we need consider both continuous data from
1234 physical world and discrete data in computer systems. Such a system is called a
1235 hybrid system. Because of density of continuous data, it is not easy to do
1236 software testing to ensure reliability of hybrid systems. Moreover, the size of
1237 the state space increases exponentially for multitask systems. Formal
1238 descriptions of hybrid systems may help us to verify desired properties of a
1239 given system formally with computer supports. In this paper, we propose a way
1240 to describe a formal specification of a given multitask hybrid system as an
1241 observational transition system in CafeOBJ algebraic specification language and
1242 verify it by the proof score method based on equational reasoning implemented
1243 in CafeOBJ interpreter.
1244 </p>
1245 </description>
1246 </item>
1247 <item>
1248 <title>GENs: Generative Encoding Networks. (arXiv:2010.15283v1 [cs.LG])</title>
1249 <link>http://fr.arxiv.org/abs/2010.15283</link>
1250 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Saha_S/0/1/0/all/0/1">Surojit Saha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Elhabian_S/0/1/0/all/0/1">Shireen Elhabian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Whitaker_R/0/1/0/all/0/1">Ross T. Whitaker</a></p>
1251
1252 <p>Mapping data from and/or onto a known family of distributions has become an
1253 important topic in machine learning and data analysis. Deep generative models
1254 (e.g., generative adversarial networks ) have been used effectively to match
1255 known and unknown distributions. Nonetheless, when the form of the target
1256 distribution is known, analytical methods are advantageous in providing robust
1257 results with provable properties. In this paper, we propose and analyze the use
1258 of nonparametric density methods to estimate the Jensen-Shannon divergence for
1259 matching unknown data distributions to known target distributions, such
1260 Gaussian or mixtures of Gaussians, in latent spaces. This analytical method has
1261 several advantages: better behavior when training sample quantity is low,
1262 provable convergence properties, and relatively few parameters, which can be
1263 derived analytically. Using the proposed method, we enforce the latent
1264 representation of an autoencoder to match a target distribution in a learning
1265 framework that we call a {\em generative encoding network}. Here, we present
1266 the numerical methods; derive the expected distribution of the data in the
1267 latent space; evaluate the properties of the latent space, sample
1268 reconstruction, and generated samples; show the advantages over the adversarial
1269 counterpart; and demonstrate the application of the method in real world.
1270 </p>
1271 </description>
1272 </item>
1273 <item>
1274 <title>Speech-Image Semantic Alignment Does Not Depend on Any Prior Classification Tasks. (arXiv:2010.15288v1 [cs.LG])</title>
1275 <link>http://fr.arxiv.org/abs/2010.15288</link>
1276 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mortazavi_M/0/1/0/all/0/1">Masood S. Mortazavi</a></p>
1277
1278 <p>Semantically-aligned $(speech, image)$ datasets can be used to explore
1279 "visually-grounded speech". In a majority of existing investigations, features
1280 of an image signal are extracted using neural networks "pre-trained" on other
1281 tasks (e.g., classification on ImageNet). In still others, pre-trained networks
1282 are used to extract audio features prior to semantic embedding. Without
1283 "transfer learning" through pre-trained initialization or pre-trained feature
1284 extraction, previous results have tended to show low rates of recall in $speech
1285 \rightarrow image$ and $image \rightarrow speech$ queries.
1286 </p>
1287 <p>Choosing appropriate neural architectures for encoders in the speech and
1288 image branches and using large datasets, one can obtain competitive recall
1289 rates without any reliance on any pre-trained initialization or feature
1290 extraction: $(speech,image)$ semantic alignment and $speech \rightarrow image$
1291 and $image \rightarrow speech$ retrieval are canonical tasks worthy of
1292 independent investigation of their own and allow one to explore other
1293 questions---e.g., the size of the audio embedder can be reduced significantly
1294 with little loss of recall rates in $speech \rightarrow image$ and $image
1295 \rightarrow speech$ queries.
1296 </p>
1297 </description>
1298 </item>
1299 <item>
1300 <title>Link inference of noisy delay-coupled networks: Machine learning and opto-electronic experimental tests. (arXiv:2010.15289v1 [nlin.AO])</title>
1301 <link>http://fr.arxiv.org/abs/2010.15289</link>
1302 <description><p>Authors: <a href="http://fr.arxiv.org/find/nlin/1/au:+Banerjee_A/0/1/0/all/0/1">Amitava Banerjee</a>, <a href="http://fr.arxiv.org/find/nlin/1/au:+Hart_J/0/1/0/all/0/1">Joseph D. Hart</a>, <a href="http://fr.arxiv.org/find/nlin/1/au:+Roy_R/0/1/0/all/0/1">Rajarshi Roy</a>, <a href="http://fr.arxiv.org/find/nlin/1/au:+Ott_E/0/1/0/all/0/1">Edward Ott</a></p>
1303
1304 <p>We devise a machine learning technique to solve the general problem of
1305 inferring network links that have time-delays. The goal is to do this purely
1306 from time-series data of the network nodal states. This task has applications
1307 in fields ranging from applied physics and engineering to neuroscience and
1308 biology. To achieve this, we first train a type of machine learning system
1309 known as reservoir computing to mimic the dynamics of the unknown network. We
1310 formulate and test a technique that uses the trained parameters of the
1311 reservoir system output layer to deduce an estimate of the unknown network
1312 structure. Our technique, by its nature, is non-invasive, but is motivated by
1313 the widely-used invasive network inference method whereby the responses to
1314 active perturbations applied to the network are observed and employed to infer
1315 network links (e.g., knocking down genes to infer gene regulatory networks). We
1316 test this technique on experimental and simulated data from delay-coupled
1317 opto-electronic oscillator networks. We show that the technique often yields
1318 very good results particularly if the system does not exhibit synchrony. We
1319 also find that the presence of dynamical noise can strikingly enhance the
1320 accuracy and ability of our technique, especially in networks that exhibit
1321 synchrony.
1322 </p>
1323 </description>
1324 </item>
1325 <item>
1326 <title>Fact or Factitious? Contextualized Opinion Spam Detection. (arXiv:2010.15296v1 [cs.AI])</title>
1327 <link>http://fr.arxiv.org/abs/2010.15296</link>
1328 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kennedy_S/0/1/0/all/0/1">Stefan Kennedy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Walsh_N/0/1/0/all/0/1">Niall Walsh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sloka_K/0/1/0/all/0/1">Kirils Sloka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Foster_J/0/1/0/all/0/1">Jennifer Foster</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+McCarren_A/0/1/0/all/0/1">Andrew McCarren</a></p>
1329
1330 <p>In this paper we perform an analytic comparison of a number of techniques
1331 used to detect fake and deceptive online reviews. We apply a number machine
1332 learning approaches found to be effective, and introduce our own approach by
1333 fine-tuning state of the art contextualised embeddings. The results we obtain
1334 show the potential of contextualised embeddings for fake review detection, and
1335 lay the groundwork for future research in this area.
1336 </p>
1337 </description>
1338 </item>
1339 <item>
1340 <title>Analysis of Chorin-Type Projection Methods for the Stochastic Stokes Equations with General Multiplicative Noises. (arXiv:2010.15297v1 [math.NA])</title>
1341 <link>http://fr.arxiv.org/abs/2010.15297</link>
1342 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Feng_X/0/1/0/all/0/1">Xiaobing Feng</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Vo_L/0/1/0/all/0/1">Liet Vo</a></p>
1343
1344 <p>This paper is concerned with numerical analysis of two fully discrete
1345 Chorin-type projection methods for the stochastic Stokes equations with general
1346 non-solenoidal multiplicative noise. The first scheme is the standard Chorin
1347 scheme and the second one is a modified Chorin scheme which is designed by
1348 employing the Helmholtz decomposition on the noise function at each time step
1349 to produce a projected divergence-free noise and a "pseudo pressure" after
1350 combining the original pressure and the curl-free part of the decomposition.
1351 Optimal order rates of the convergence are proved for both velocity and
1352 pressure approximations of these two (semi-discrete) Chorin schemes. It is
1353 crucial to measure the errors in appropriate norms. The fully discrete finite
1354 element methods are formulated by discretizing both semi-discrete Chorin
1355 schemes in space by the standard finite element method. Suboptimal order error
1356 estimates are derived for both fully discrete methods. It is proved that all
1357 spatial error constants contain a growth factor $k^{-1/2}$, where $k$ denotes
1358 the time step size, which explains the deteriorating performance of the
1359 standard Chorin scheme when $k\to 0$ and the space mesh size is fixed as
1360 observed earlier in the numerical tests of [9]. Numerical results are also
1361 provided to guage the performance of the proposed numerical methods and to
1362 validate the sharpness of the theoretical error estimates.
1363 </p>
1364 </description>
1365 </item>
1366 <item>
1367 <title>Uncovering Latent Biases in Text: Method and Application to Peer Review. (arXiv:2010.15300v1 [cs.CL])</title>
1368 <link>http://fr.arxiv.org/abs/2010.15300</link>
1369 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Manzoor_E/0/1/0/all/0/1">Emaad Manzoor</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shah_N/0/1/0/all/0/1">Nihar B. Shah</a></p>
1370
1371 <p>Quantifying systematic disparities in numerical quantities such as employment
1372 rates and wages between population subgroups provides compelling evidence for
1373 the existence of societal biases. However, biases in the text written for
1374 members of different subgroups (such as in recommendation letters for male and
1375 non-male candidates), though widely reported anecdotally, remain challenging to
1376 quantify. In this work, we introduce a novel framework to quantify bias in text
1377 caused by the visibility of subgroup membership indicators. We develop a
1378 nonparametric estimation and inference procedure to estimate this bias. We then
1379 formalize an identification strategy to causally link the estimated bias to the
1380 visibility of subgroup membership indicators, provided observations from time
1381 periods both before and after an identity-hiding policy change. We identify an
1382 application wherein "ground truth" bias can be inferred to evaluate our
1383 framework, instead of relying on synthetic or secondary data. Specifically, we
1384 apply our framework to quantify biases in the text of peer reviews from a
1385 reputed machine learning conference before and after the conference adopted a
1386 double-blind reviewing policy. We show evidence of biases in the review ratings
1387 that serves as "ground truth", and show that our proposed framework accurately
1388 detects these biases from the review text without having access to the review
1389 ratings.
1390 </p>
1391 </description>
1392 </item>
1393 <item>
1394 <title>Point Cloud Attribute Compression via Successive Subspace Graph Transform. (arXiv:2010.15302v1 [cs.CV])</title>
1395 <link>http://fr.arxiv.org/abs/2010.15302</link>
1396 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yueru Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shao_Y/0/1/0/all/0/1">Yiting Shao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_J/0/1/0/all/0/1">Jing Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_G/0/1/0/all/0/1">Ge Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kuo_C/0/1/0/all/0/1">C.-C. Jay Kuo</a></p>
1397
1398 <p>Inspired by the recently proposed successive subspace learning (SSL)
1399 principles, we develop a successive subspace graph transform (SSGT) to address
1400 point cloud attribute compression in this work. The octree geometry structure
1401 is utilized to partition the point cloud, where every node of the octree
1402 represents a point cloud subspace with a certain spatial size. We design a
1403 weighted graph with self-loop to describe the subspace and define a graph
1404 Fourier transform based on the normalized graph Laplacian. The transforms are
1405 applied to large point clouds from the leaf nodes to the root node of the
1406 octree recursively, while the represented subspace is expanded from the
1407 smallest one to the whole point cloud successively. It is shown by experimental
1408 results that the proposed SSGT method offers better R-D performances than the
1409 previous Region Adaptive Haar Transform (RAHT) method.
1410 </p>
1411 </description>
1412 </item>
1413 <item>
1414 <title>Automatic joint damage quantification using computer vision and deep learning. (arXiv:2010.15303v1 [cs.CV])</title>
1415 <link>http://fr.arxiv.org/abs/2010.15303</link>
1416 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_Q/0/1/0/all/0/1">Quang Tran</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Roesler_J/0/1/0/all/0/1">Jeffery R. Roesler</a></p>
1417
1418 <p>Joint raveled or spalled damage (henceforth called joint damage) can affect
1419 the safety and long-term performance of concrete pavements. It is important to
1420 assess and quantify the joint damage over time to assist in building action
1421 plans for maintenance, predicting maintenance costs, and maximize the concrete
1422 pavement service life. A framework for the accurate, autonomous, and rapid
1423 quantification of joint damage with a low-cost camera is proposed using a
1424 computer vision technique with a deep learning (DL) algorithm. The DL model is
1425 employed to train 263 images of sawcuts with joint damage. The trained DL model
1426 is used for pixel-wise color-masking joint damage in a series of query 2D
1427 images, which are used to reconstruct a 3D image using open-source structure
1428 from motion algorithm. Another damage quantification algorithm using a color
1429 threshold is applied to detect and compute the surface area of the damage in
1430 the 3D reconstructed image. The effectiveness of the framework was validated
1431 through inspecting joint damage at four transverse contraction joints in
1432 Illinois, USA, including three acceptable joints and one unacceptable joint by
1433 visual inspection. The results show the framework achieves 76% recall and 10%
1434 error.
1435 </p>
1436 </description>
1437 </item>
1438 <item>
1439 <title>ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection. (arXiv:2010.15306v1 [eess.AS])</title>
1440 <link>http://fr.arxiv.org/abs/2010.15306</link>
1441 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Shimada_K/0/1/0/all/0/1">Kazuki Shimada</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Koyama_Y/0/1/0/all/0/1">Yuichiro Koyama</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Takahashi_N/0/1/0/all/0/1">Naoya Takahashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Takahashi_S/0/1/0/all/0/1">Shusuke Takahashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Mitsufuji_Y/0/1/0/all/0/1">Yuki Mitsufuji</a></p>
1442
1443 <p>Neural-network (NN)-based methods show high performance in sound event
1444 localization and detection (SELD). Conventional NN-based methods use two
1445 branches for a sound event detection (SED) target and a direction-of-arrival
1446 (DOA) target. The two-branch representation with a single network has to decide
1447 how to balance the two objectives during optimization. Using two networks
1448 dedicated to each task increases system complexity and network size. To address
1449 these problems, we propose an activity-coupled Cartesian DOA (ACCDOA)
1450 representation, which assigns a sound event activity to the length of a
1451 corresponding Cartesian DOA vector. The ACCDOA representation enables us to
1452 solve a SELD task with a single target and has two advantages: avoiding the
1453 necessity of balancing the objectives and model size increase. In experimental
1454 evaluations with the DCASE 2020 Task 3 dataset, the ACCDOA representation
1455 outperformed the two-branch representation in SELD metrics with a smaller
1456 network size. The ACCDOA-based SELD system also performed better than
1457 state-of-the-art SELD systems in terms of localization and location-dependent
1458 detection.
1459 </p>
1460 </description>
1461 </item>
1462 <item>
1463 <title>DeviceTTS: A Small-Footprint, Fast, Stable Network for On-Device Text-to-Speech. (arXiv:2010.15311v1 [eess.AS])</title>
1464 <link>http://fr.arxiv.org/abs/2010.15311</link>
1465 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Huang_Z/0/1/0/all/0/1">Zhiying Huang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_H/0/1/0/all/0/1">Hao Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lei_M/0/1/0/all/0/1">Ming Lei</a></p>
1466
1467 <p>With the number of smart devices increasing, the demand for on-device
1468 text-to-speech (TTS) increases rapidly. In recent years, many prominent
1469 End-to-End TTS methods have been proposed, and have greatly improved the
1470 quality of synthesized speech. However, to ensure the qualified speech, most
1471 TTS systems depend on large and complex neural network models, and it's hard to
1472 deploy these TTS systems on-device. In this paper, a small-footprint, fast,
1473 stable network for on-device TTS is proposed, named as DeviceTTS. DeviceTTS
1474 makes use of a duration predictor as a bridge between encoder and decoder so as
1475 to avoid the problem of words skipping and repeating in Tacotron. As we all
1476 know, model size is a key factor for on-device TTS. For DeviceTTS, Deep
1477 Feedforward Sequential Memory Network (DFSMN) is used as the basic component.
1478 Moreover, to speed up inference, mix-resolution decoder is proposed for balance
1479 the inference speed and speech quality. Experiences are done with WORLD and
1480 LPCNet vocoder. Finally, with only 1.4 million model parameters and 0.099
1481 GFLOPS, DeviceTTS achieves comparable performance with Tacotron and FastSpeech.
1482 As far as we know, the DeviceTTS can meet the needs of most of the devices in
1483 practical application.
1484 </p>
1485 </description>
1486 </item>
1487 <item>
1488 <title>"where is this relationship going?": Understanding Relationship Trajectories in Narrative Text. (arXiv:2010.15313v1 [cs.CL])</title>
1489 <link>http://fr.arxiv.org/abs/2010.15313</link>
1490 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+You_K/0/1/0/all/0/1">Keen You</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Goldwasser_D/0/1/0/all/0/1">Dan Goldwasser</a></p>
1491
1492 <p>We examine a new commonsense reasoning task: given a narrative describing a
1493 social interaction that centers on two protagonists, systems make inferences
1494 about the underlying relationship trajectory. Specifically, we propose two
1495 evaluation tasks: Relationship Outlook Prediction MCQ and Resolution Prediction
1496 MCQ. In Relationship Outlook Prediction, a system maps an interaction to a
1497 relationship outlook that captures how the interaction is expected to change
1498 the relationship. In Resolution Prediction, a system attributes a given
1499 relationship outlook to a particular resolution that explains the outcome.
1500 These two tasks parallel two real-life questions that people frequently ponder
1501 upon as they navigate different social situations: "where is this relationship
1502 going?" and "how did we end up here?". To facilitate the investigation of human
1503 social relationships through these two tasks, we construct a new dataset,
1504 Social Narrative Tree, which consists of 1250 stories documenting a variety of
1505 daily social interactions. The narratives encode a multitude of social elements
1506 that interweave to give rise to rich commonsense knowledge of how relationships
1507 evolve with respect to social interactions. We establish baseline performances
1508 using language models and the accuracies are significantly lower than human
1509 performance. The results demonstrate that models need to look beyond syntactic
1510 and semantic signals to comprehend complex human relationships.
1511 </p>
1512 </description>
1513 </item>
1514 <item>
1515 <title>Recurrent neural circuits for contour detection. (arXiv:2010.15314v1 [cs.CV])</title>
1516 <link>http://fr.arxiv.org/abs/2010.15314</link>
1517 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Linsley_D/0/1/0/all/0/1">Drew Linsley</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_J/0/1/0/all/0/1">Junkyung Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ashok_A/0/1/0/all/0/1">Alekh Ashok</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Serre_T/0/1/0/all/0/1">Thomas Serre</a></p>
1518
1519 <p>We introduce a deep recurrent neural network architecture that approximates
1520 visual cortical circuits. We show that this architecture, which we refer to as
1521 the gamma-net, learns to solve contour detection tasks with better sample
1522 efficiency than state-of-the-art feedforward networks, while also exhibiting a
1523 classic perceptual illusion, known as the orientation-tilt illusion. Correcting
1524 this illusion significantly reduces gamma-net contour detection accuracy by
1525 driving it to prefer low-level edges over high-level object boundary contours.
1526 Overall, our study suggests that the orientation-tilt illusion is a byproduct
1527 of neural circuits that help biological visual systems achieve robust and
1528 efficient contour detection, and that incorporating these circuits in
1529 artificial neural networks can improve computer vision.
1530 </p>
1531 </description>
1532 </item>
1533 <item>
1534 <title>Exploring Generative Adversarial Networks for Image-to-Image Translation in STEM Simulation. (arXiv:2010.15315v1 [cs.CV])</title>
1535 <link>http://fr.arxiv.org/abs/2010.15315</link>
1536 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lawrence_N/0/1/0/all/0/1">Nick Lawrence</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_M/0/1/0/all/0/1">Mingren Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yin_R/0/1/0/all/0/1">Ruiqi Yin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_C/0/1/0/all/0/1">Cloris Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Morgan_D/0/1/0/all/0/1">Dane Morgan</a></p>
1537
1538 <p>The use of accurate scanning transmission electron microscopy (STEM) image
1539 simulation methods require large computation times that can make their use
1540 infeasible for the simulation of many images. Other simulation methods based on
1541 linear imaging models, such as the convolution method, are much faster but are
1542 too inaccurate to be used in application. In this paper, we explore deep
1543 learning models that attempt to translate a STEM image produced by the
1544 convolution method to a prediction of the high accuracy multislice image. We
1545 then compare our results to those of regression methods. We find that using the
1546 deep learning model Generative Adversarial Network (GAN) provides us with the
1547 best results and performs at a similar accuracy level to previous regression
1548 models on the same dataset. Codes and data for this project can be found in
1549 this GitHub repository, https://github.com/uw-cmg/GAN-STEM-Conv2MultiSlice.
1550 </p>
1551 </description>
1552 </item>
1553 <item>
1554 <title>Multiple Sclerosis Severity Classification From Clinical Text. (arXiv:2010.15316v1 [cs.CL])</title>
1555 <link>http://fr.arxiv.org/abs/2010.15316</link>
1556 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Costa_A/0/1/0/all/0/1">Alister D Costa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Denkovski_S/0/1/0/all/0/1">Stefan Denkovski</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Malyska_M/0/1/0/all/0/1">Michal Malyska</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Moon_S/0/1/0/all/0/1">Sae Young Moon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rufino_B/0/1/0/all/0/1">Brandon Rufino</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_Z/0/1/0/all/0/1">Zhen Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Killian_T/0/1/0/all/0/1">Taylor Killian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ghassemi_M/0/1/0/all/0/1">Marzyeh Ghassemi</a></p>
1557
1558 <p>Multiple Sclerosis (MS) is a chronic, inflammatory and degenerative
1559 neurological disease, which is monitored by a specialist using the Expanded
1560 Disability Status Scale (EDSS) and recorded in unstructured text in the form of
1561 a neurology consult note. An EDSS measurement contains an overall "EDSS" score
1562 and several functional subscores. Typically, expert knowledge is required to
1563 interpret consult notes and generate these scores. Previous approaches used
1564 limited context length Word2Vec embeddings and keyword searches to predict
1565 scores given a consult note, but often failed when scores were not explicitly
1566 stated. In this work, we present MS-BERT, the first publicly available
1567 transformer model trained on real clinical data other than MIMIC. Next, we
1568 present MSBC, a classifier that applies MS-BERT to generate embeddings and
1569 predict EDSS and functional subscores. Lastly, we explore combining MSBC with
1570 other models through the use of Snorkel to generate scores for unlabelled
1571 consult notes. MSBC achieves state-of-the-art performance on all metrics and
1572 prediction tasks and outperforms the models generated from the Snorkel
1573 ensemble. We improve Macro-F1 by 0.12 (to 0.88) for predicting EDSS and on
1574 average by 0.29 (to 0.63) for predicting functional subscores over previous
1575 Word2Vec CNN and rule-based approaches.
1576 </p>
1577 </description>
1578 </item>
1579 <item>
1580 <title>The IQIYI System for Voice Conversion Challenge 2020. (arXiv:2010.15317v1 [cs.SD])</title>
1581 <link>http://fr.arxiv.org/abs/2010.15317</link>
1582 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gan_W/0/1/0/all/0/1">Wendong Gan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_H/0/1/0/all/0/1">Haitao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yan_Y/0/1/0/all/0/1">Yin Yan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_J/0/1/0/all/0/1">Jianwei Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_B/0/1/0/all/0/1">Bolong Wen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_X/0/1/0/all/0/1">Xueping Xu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_H/0/1/0/all/0/1">Hai Li</a></p>
1583
1584 <p>This paper presents the IQIYI voice conversion system (T24) for Voice
1585 Conversion 2020. In the competition, each target speaker has 70 sentences. We
1586 have built an end-to-end voice conversion system based on PPG. First, the ASR
1587 acoustic model calculates the BN feature, which represents the content-related
1588 information in the speech. Then the Mel feature is calculated through an
1589 improved prosody tacotron model. Finally, the Mel spectrum is converted to wav
1590 through an improved LPCNet. The evaluation results show that this system can
1591 achieve better voice conversion effects. In the case of using 16k rather than
1592 24k sampling rate audio, the conversion result is relatively good in
1593 naturalness and similarity. Among them, our best results are in the similarity
1594 evaluation of the Task 2, the 2nd in the ASV-based objective evaluation and the
1595 5th in the subjective evaluation.
1596 </p>
1597 </description>
1598 </item>
1599 <item>
1600 <title>Gaussian Processes Model-based Control of Underactuated Balance Robots. (arXiv:2010.15320v1 [cs.RO])</title>
1601 <link>http://fr.arxiv.org/abs/2010.15320</link>
1602 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_K/0/1/0/all/0/1">Kuo Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yi_J/0/1/0/all/0/1">Jingang Yi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Song_D/0/1/0/all/0/1">Dezhen Song</a></p>
1603
1604 <p>Ranging from cart-pole systems and autonomous bicycles to bipedal robots,
1605 control of these underactuated balance robots aims to achieve both external
1606 (actuated) subsystem trajectory tracking and internal (unactuated) subsystem
1607 balancing tasks with limited actuation authority. This paper proposes a
1608 learning model-based control framework for underactuated balance robots. The
1609 key idea to simultaneously achieve tracking and balancing tasks is to design
1610 control strategies in slow- and fast-time scales, respectively. In slow-time
1611 scale, model predictive control (MPC) is used to generate the desired internal
1612 subsystem trajectory that encodes the external subsystem tracking performance
1613 and control input. In fast-time scale, the actual internal trajectory is
1614 stabilized to the desired internal trajectory by using an inverse dynamics
1615 controller. The coupling effects between the external and internal subsystems
1616 are captured through the planned internal trajectory profile and the dual
1617 structural properties of the robotic systems. The control design is based on
1618 Gaussian processes (GPs) regression model that are learned from experiments
1619 without need of priori knowledge about the robot dynamics nor successful
1620 balance demonstration. The GPs provide estimates of modeling uncertainties of
1621 the robotic systems and these uncertainty estimations are incorporated in the
1622 MPC design to enhance the control robustness to modeling errors. The
1623 learning-based control design is analyzed with guaranteed stability and
1624 performance. The proposed design is demonstrated by experiments on a Furuta
1625 pendulum and an autonomous bikebot.
1626 </p>
1627 </description>
1628 </item>
1629 <item>
1630 <title>Improvement of EAST Data Acquisition Configuration Management. (arXiv:2010.15322v1 [physics.ins-det])</title>
1631 <link>http://fr.arxiv.org/abs/2010.15322</link>
1632 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Ying_C/0/1/0/all/0/1">Chen Ying</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Shi_L/0/1/0/all/0/1">Li Shi</a></p>
1633
1634 <p>The data acquisition console is an important component of the EAST data
1635 acquisition system which provides unified data acquisition and long-term data
1636 storage for diagnostics. The data acquisition console is used to manage the
1637 data acquisition configuration information and control the data acquisition
1638 workflow. The data acquisition console has been developed many years, and with
1639 increasing of data acquisition nodes and emergence of new control nodes, the
1640 function of configuration management has become inadequate. It is going to
1641 update the configuration management function of data acquisition console. The
1642 upgraded data acquisition console based on LabVIEW should be oriented to the
1643 data acquisition administrator, with the functions of managing data acquisition
1644 nodes, managing control nodes, setting and publishing configuration parameters,
1645 batch management, database backup, monitoring the status of data acquisition
1646 nodes, controlling the data acquisition workflow, and shot simulation data
1647 acquisition test. The upgraded data acquisition console has been designed and
1648 under testing recently.
1649 </p>
1650 </description>
1651 </item>
1652 <item>
1653 <title>Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth. (arXiv:2010.15327v1 [cs.LG])</title>
1654 <link>http://fr.arxiv.org/abs/2010.15327</link>
1655 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_T/0/1/0/all/0/1">Thao Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Raghu_M/0/1/0/all/0/1">Maithra Raghu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kornblith_S/0/1/0/all/0/1">Simon Kornblith</a></p>
1656
1657 <p>A key factor in the success of deep neural networks is the ability to scale
1658 models to improve performance by varying the architecture depth and width. This
1659 simple property of neural network design has resulted in highly effective
1660 architectures for a variety of tasks. Nevertheless, there is limited
1661 understanding of effects of depth and width on the learned representations. In
1662 this paper, we study this fundamental question. We begin by investigating how
1663 varying depth and width affects model hidden representations, finding a
1664 characteristic block structure in the hidden representations of larger capacity
1665 (wider or deeper) models. We demonstrate that this block structure arises when
1666 model capacity is large relative to the size of the training set, and is
1667 indicative of the underlying layers preserving and propagating the dominant
1668 principal component of their representations. This discovery has important
1669 ramifications for features learned by different models, namely, representations
1670 outside the block structure are often similar across architectures with varying
1671 widths and depths, but the block structure is unique to each model. We analyze
1672 the output predictions of different model architectures, finding that even when
1673 the overall accuracy is similar, wide and deep models exhibit distinctive error
1674 patterns and variations across classes.
1675 </p>
1676 </description>
1677 </item>
1678 <item>
1679 <title>Scalable Attack-Resistant Obfuscation of Logic Circuits. (arXiv:2010.15329v1 [cs.CR])</title>
1680 <link>http://fr.arxiv.org/abs/2010.15329</link>
1681 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Alaql_A/0/1/0/all/0/1">Abdulrahman Alaql</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bhunia_S/0/1/0/all/0/1">Swarup Bhunia</a></p>
1682
1683 <p>Hardware IP protection has been one of the most critical areas of research in
1684 the past years. Recently, attacks on hardware IPs (such as reverse engineering
1685 or cloning) have evolved as attackers have developed sophisticated techniques.
1686 Therefore, hardware obfuscation has been introduced as a powerful tool to
1687 protect IPs against piracy attacks. However, many recent attempts to break
1688 existing obfuscation methods have been successful in unlocking the IP and
1689 restoring its functionality. In this paper, we propose SARO, a Scalable
1690 Attack-Resistant Obfuscation that provides a robust functional and structural
1691 design transformation process. SARO treats the target circuit as a graph, and
1692 performs a partitioning algorithm to produce a set of sub-graphs, then applies
1693 our novel Truth Table Transformation (T3) process to each partition. We also
1694 propose the $T3_{metric}$, which is developed to quantify the structural and
1695 functional design transformation level caused by the obfuscation process. We
1696 evaluate SARO on ISCAS85 and EPFL benchmarks, and provide full security and
1697 performance analysis of our proposed framework.
1698 </p>
1699 </description>
1700 </item>
1701 <item>
1702 <title>Learning Sampling Distributions Using Local 3D Workspace Decompositions for Motion Planning in High Dimensions. (arXiv:2010.15335v1 [cs.RO])</title>
1703 <link>http://fr.arxiv.org/abs/2010.15335</link>
1704 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chamzas_C/0/1/0/all/0/1">Constantinos Chamzas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kingston_Z/0/1/0/all/0/1">Zachary Kingston</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Quintero_Pena_C/0/1/0/all/0/1">Carlos Quintero-Pe&#xf1;a</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shrivastava_A/0/1/0/all/0/1">Anshumali Shrivastava</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kavraki_L/0/1/0/all/0/1">Lydia E. Kavraki</a></p>
1705
1706 <p>Earlier work has shown that reusing experience from prior motion planning
1707 problems can improve the efficiency of similar, future motion planning queries.
1708 However, for robots with many degrees-of-freedom, these methods exhibit poor
1709 generalization across different environments and often require large datasets
1710 that are impractical to gather. We present SPARK and FLAME , two
1711 experience-based frameworks for sampling-based planning applicable to complex
1712 manipulators in 3 D environments. Both combine samplers associated with
1713 features from a workspace decomposition into a global biased sampling
1714 distribution. SPARK decomposes the environment based on exact geometry while
1715 FLAME is more general, and uses an octree-based decomposition obtained from
1716 sensor data. We demonstrate the effectiveness of SPARK and FLAME on a Fetch
1717 robot tasked with challenging pick-and-place manipulation problems. Our
1718 approaches can be trained incrementally and significantly improve performance
1719 with only a handful of examples, generalizing better over diverse tasks and
1720 environments as compared to prior approaches.
1721 </p>
1722 </description>
1723 </item>
1724 <item>
1725 <title>SAR-NAS: Skeleton-based Action Recognition via Neural Architecture Searching. (arXiv:2010.15336v1 [cs.CV])</title>
1726 <link>http://fr.arxiv.org/abs/2010.15336</link>
1727 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_H/0/1/0/all/0/1">Haoyuan Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hou_Y/0/1/0/all/0/1">Yonghong Hou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_P/0/1/0/all/0/1">Pichao Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_Z/0/1/0/all/0/1">Zihui Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_W/0/1/0/all/0/1">Wanqing Li</a></p>
1728
1729 <p>This paper presents a study of automatic design of neural network
1730 architectures for skeleton-based action recognition. Specifically, we encode a
1731 skeleton-based action instance into a tensor and carefully define a set of
1732 operations to build two types of network cells: normal cells and reduction
1733 cells. The recently developed DARTS (Differentiable Architecture Search) is
1734 adopted to search for an effective network architecture that is built upon the
1735 two types of cells. All operations are 2D based in order to reduce the overall
1736 computation and search space. Experiments on the challenging NTU RGB+D and
1737 Kinectics datasets have verified that most of the networks developed to date
1738 for skeleton-based action recognition are likely not compact and efficient. The
1739 proposed method provides an approach to search for such a compact network that
1740 is able to achieve comparative or even better performance than the
1741 state-of-the-art methods.
1742 </p>
1743 </description>
1744 </item>
1745 <item>
1746 <title>A New "Model-Free" Method Combined with Neural Network for MIMO Systems. (arXiv:2010.15338v1 [eess.SY])</title>
1747 <link>http://fr.arxiv.org/abs/2010.15338</link>
1748 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_F/0/1/0/all/0/1">Feilong Zhang</a></p>
1749
1750 <p>In this brief, a model-free adaptive predictive control (MFAPC) is proposed.
1751 It outperforms the current model-free adaptive control (MFAC) for not only
1752 solving the time delay problem in multiple-input multiple-output (MIMO) systems
1753 but also relaxing the current rigorous assumptions for sake of a wider
1754 applicable range. The most attractive merit of the proposed controller is that
1755 the controller design, performance analysis and applications are easy for
1756 engineers to realize. Furthermore, the problem of how to choose the matrix
1757 {\lambda} is finished by analyzing the function of the closed-loop poles rather
1758 than the previous contraction mapping method. Additionally, in view of the
1759 nonlinear modeling capability and adaptability of neural networks (NNs), we
1760 combine these two classes of algorithms together. The feasibility and several
1761 interesting results of the proposed method are shown in simulations.
1762 </p>
1763 </description>
1764 </item>
1765 <item>
1766 <title>Identifying safe intersection design through unsupervised feature extraction from satellite imagery. (arXiv:2010.15343v1 [cs.CV])</title>
1767 <link>http://fr.arxiv.org/abs/2010.15343</link>
1768 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wijnands_J/0/1/0/all/0/1">Jasper S. Wijnands</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_H/0/1/0/all/0/1">Haifeng Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nice_K/0/1/0/all/0/1">Kerry A. Nice</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Thompson_J/0/1/0/all/0/1">Jason Thompson</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Scully_K/0/1/0/all/0/1">Katherine Scully</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_J/0/1/0/all/0/1">Jingqiu Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stevenson_M/0/1/0/all/0/1">Mark Stevenson</a></p>
1769
1770 <p>The World Health Organization has listed the design of safer intersections as
1771 a key intervention to reduce global road trauma. This article presents the
1772 first study to systematically analyze the design of all intersections in a
1773 large country, based on aerial imagery and deep learning. Approximately 900,000
1774 satellite images were downloaded for all intersections in Australia and
1775 customized computer vision techniques emphasized the road infrastructure. A
1776 deep autoencoder extracted high-level features, including the intersection's
1777 type, size, shape, lane markings, and complexity, which were used to cluster
1778 similar designs. An Australian telematics data set linked infrastructure design
1779 to driving behaviors captured during 66 million kilometers of driving. This
1780 showed more frequent hard acceleration events (per vehicle) at four- than
1781 three-way intersections, relatively low hard deceleration frequencies at
1782 T-intersections, and consistently low average speeds on roundabouts. Overall,
1783 domain-specific feature extraction enabled the identification of infrastructure
1784 improvements that could result in safer driving behaviors, potentially reducing
1785 road trauma.
1786 </p>
1787 </description>
1788 </item>
1789 <item>
1790 <title>Sea-Net: Squeeze-And-Excitation Attention Net For Diabetic Retinopathy Grading. (arXiv:2010.15344v1 [cs.CV])</title>
1791 <link>http://fr.arxiv.org/abs/2010.15344</link>
1792 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_Z/0/1/0/all/0/1">Ziyuan Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chopra_K/0/1/0/all/0/1">Kartik Chopra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zeng_Z/0/1/0/all/0/1">Zeng Zeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_X/0/1/0/all/0/1">Xiaoli Li</a></p>
1793
1794 <p>Diabetes is one of the most common disease in individuals. \textit{Diabetic
1795 retinopathy} (DR) is a complication of diabetes, which could lead to blindness.
1796 Automatic DR grading based on retinal images provides a great diagnostic and
1797 prognostic value for treatment planning. However, the subtle differences among
1798 severity levels make it difficult to capture important features using
1799 conventional methods. To alleviate the problems, a new deep learning
1800 architecture for robust DR grading is proposed, referred to as SEA-Net, in
1801 which, spatial attention and channel attention are alternatively carried out
1802 and boosted with each other, improving the classification performance. In
1803 addition, a hybrid loss function is proposed to further maximize the
1804 inter-class distance and reduce the intra-class variability. Experimental
1805 results have shown the effectiveness of the proposed architecture.
1806 </p>
1807 </description>
1808 </item>
1809 <item>
1810 <title>Developing Augmented Reality based Gaming Model to Teach Ethical Education in Primary Schools. (arXiv:2010.15346v1 [cs.CY])</title>
1811 <link>http://fr.arxiv.org/abs/2010.15346</link>
1812 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ali_M/0/1/0/all/0/1">Mohammad Ali</a></p>
1813
1814 <p>Education sector is adopting new technologies for both teaching and learning
1815 pedagogy. Augmented Reality (AR) is a new technology that can be used in the
1816 educational pedagogy to enhance the engagement with students. Students interact
1817 with AR-based educational material for more visualization and explanation.
1818 Therefore, the use of AR in education is becoming more popular. However, most
1819 researches narrate the use of AR technologies in the field of English, Maths,
1820 Science, Culture, Arts, and History education but the absence of ethical
1821 education is visible. In our paper, we design the system and develop an
1822 AR-based mobile game model in the field of Ethical education for pre-primary
1823 students. Students from pre-primary require more interactive lessons than
1824 theoretical concepts. So, we use AR technology to develop a game which offers
1825 interactive procedures where students can learn with fun and engage with the
1826 context. Finally, we develop a prototype that works with our research
1827 objective. We conclude our paper with future works.
1828 </p>
1829 </description>
1830 </item>
1831 <item>
1832 <title>Distance Invariant Sparse Autoencoder for Wireless Signal Strength Mapping. (arXiv:2010.15347v1 [eess.SP])</title>
1833 <link>http://fr.arxiv.org/abs/2010.15347</link>
1834 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Miyagusuku_R/0/1/0/all/0/1">Renato Miyagusuku</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ozaki_K/0/1/0/all/0/1">Koichi Ozaki</a></p>
1835
1836 <p>Wireless signal strength based localization can enable robust localization
1837 for robots using inexpensive sensors. For this, a location-to-signal-strength
1838 map has to be learned for each access point in the environment. Due to the
1839 ubiquity of Wireless networks in most environments, this can result in tens or
1840 hundreds of maps. To reduce the dimensionality of this problem, we employ
1841 autoencoders, which are a popular unsupervised approach for feature extraction
1842 and data compression. In particular, we propose the use of sparse autoencoders
1843 that learn latent spaces that preserve the relative distance between inputs.
1844 Distance invariance between input and latent spaces allows our system to
1845 successfully learn compact representations that allow precise data
1846 reconstruction but also have a low impact on localization performance when
1847 using maps from the latent space rather than the input space. We demonstrate
1848 the feasibility of our approach by performing experiments in outdoor
1849 environments.
1850 </p>
1851 </description>
1852 </item>
1853 <item>
1854 <title>A Hybrid Position/Force Controller for Joint Robots. (arXiv:2010.15350v1 [cs.RO])</title>
1855 <link>http://fr.arxiv.org/abs/2010.15350</link>
1856 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_S/0/1/0/all/0/1">Shengwen Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_J/0/1/0/all/0/1">Juan Ren</a></p>
1857
1858 <p>In this paper, we present a hybrid position/force controller for operating
1859 joint robots. The hybrid controller has two goals---motion tracking and force
1860 regulating. As long as these two goals are not mutually exclusive, they can be
1861 decoupled in some way. In this work, we make use of the smooth and invertible
1862 mapping from joint space to task space to decouple the two control goals and
1863 design controllers separately. The traditional motion controller in task space
1864 is used for motion control, while the force controller is designed through
1865 manipulating the desired trajectory to regulate the force indirectly. Two case
1866 studies---contour tracking/polishing surfaces and grabbing boxes with two
1867 robotic arms---are presented to show the efficacy of the hybrid controller, and
1868 simulations with physics engines are carried out to validate the efficacy of
1869 the proposed method.
1870 </p>
1871 </description>
1872 </item>
1873 <item>
1874 <title>An automated and multi-parametric algorithm for objective analysis of meibography images. (arXiv:2010.15352v1 [eess.IV])</title>
1875 <link>http://fr.arxiv.org/abs/2010.15352</link>
1876 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Xiao_P/0/1/0/all/0/1">Peng Xiao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Luo_Z/0/1/0/all/0/1">Zhongzhou Luo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Deng_Y/0/1/0/all/0/1">Yuqing Deng</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_G/0/1/0/all/0/1">Gengyuan Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Yuan_J/0/1/0/all/0/1">Jin Yuan</a></p>
1877
1878 <p>Meibography is a non-contact imaging technique used by ophthalmologists to
1879 assist in the evaluation and diagnosis of meibomian gland dysfunction (MGD).
1880 While artificial qualitative analysis of meibography images could lead to low
1881 repeatability and efficiency and multi-parametric analysis is demanding to
1882 offer more comprehensive information in discovering subtle changes of meibomian
1883 glands during MGD progression, we developed an automated and multi-parametric
1884 algorithm for objective and quantitative analysis of meibography images. The
1885 full architecture of the algorithm can be divided into three steps: (1)
1886 segmentation of the tarsal conjunctiva area as the region of interest (ROI);
1887 (2) segmentation and identification of glands within the ROI; and (3)
1888 quantitative multi-parametric analysis including newly defined gland diameter
1889 deformation index (DI), gland tortuosity index (TI), and glands signal index
1890 (SI). To evaluate the performance of the automated algorithm, the similarity
1891 index (k) and the segmentation error including the false positive rate (r_P)
1892 and the false negative rate (r_N) are calculated between the manually defined
1893 ground truth and the automatic segmentations of both the ROI and meibomian
1894 glands of 15 typical meibography images. The feasibility of the algorithm is
1895 demonstrated in analyzing typical meibograhy images.
1896 </p>
1897 </description>
1898 </item>
1899 <item>
1900 <title>Domain decomposition and partitioning methods for mixed finite element discretizations of the Biot system of poroelasticity. (arXiv:2010.15353v1 [math.NA])</title>
1901 <link>http://fr.arxiv.org/abs/2010.15353</link>
1902 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Jayadharan_M/0/1/0/all/0/1">Manu Jayadharan</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Khattatov_E/0/1/0/all/0/1">Eldar Khattatov</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Yotov_I/0/1/0/all/0/1">Ivan Yotov</a></p>
1903
1904 <p>We develop non-overlapping domain decomposition methods for the Biot system
1905 of poroelasticity in a mixed form. The solid deformation is modeled with a
1906 mixed three-field formulation with weak stress symmetry. The fluid flow is
1907 modeled with a mixed Darcy formulation. We introduce displacement and pressure
1908 Lagrange multipliers on the subdomain interfaces to impose weakly continuity of
1909 normal stress and normal velocity, respectively. The global problem is reduced
1910 to an interface problem for the Lagrange multipliers, which is solved by a
1911 Krylov space iterative method. We study both monolithic and split methods. In
1912 the monolithic method, a coupled displacement-pressure interface problem is
1913 solved, with each iteration requiring the solution of local Biot problems. We
1914 show that the resulting interface operator is positive definite and analyze the
1915 convergence of the iteration. We further study drained split and fixed stress
1916 Biot splittings, in which case we solve separate interface problems requiring
1917 elasticity and Darcy solves. We analyze the stability of the split
1918 formulations. Numerical experiments are presented to illustrate the convergence
1919 of the domain decomposition methods and compare their accuracy and efficiency.
1920 </p>
1921 </description>
1922 </item>
1923 <item>
1924 <title>Reconfigurable Intelligent Surface Aided Secure Transmission: Outage-Constrained Energy-Efficiency Maximization. (arXiv:2010.15354v1 [cs.IT])</title>
1925 <link>http://fr.arxiv.org/abs/2010.15354</link>
1926 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Z/0/1/0/all/0/1">Zongze Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shuai Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_M/0/1/0/all/0/1">Miaowen Wen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Y/0/1/0/all/0/1">Yik-Chung Wu</a></p>
1927
1928 <p>Reconfigurable intelligent surface (RIS) has the potential to significantly
1929 enhance the network secure transmission performance by reconfiguring the
1930 wireless propagation environment. However, due to the passive nature of
1931 eavesdroppers and the cascaded channel brought by the RIS, the eavesdroppers'
1932 channel state information is imperfectly obtained at the base station. Under
1933 the channel uncertainty, the optimal phase-shift, power allocation, and
1934 transmission rate design for secure transmission is currently unknown due to
1935 the difficulty of handling the probabilistic constraint with coupled variables.
1936 To fill this gap, this paper formulates a problem of energy-efficient secure
1937 transmission design while incorporating the probabilistic constraint. By
1938 transforming the probabilistic constraint and decoupling variables, the secure
1939 energy efficiency maximization problem can be solved via alternatively
1940 executing difference-of-convex programming and semidefinite relaxation
1941 technique. To scale the solution to massive antennas and reflecting elements
1942 scenario, a fast first-order algorithm with low complexity is further proposed.
1943 Simulation results show that the proposed first-order algorithm achieves
1944 identical performance to the conventional method but saves at least two orders
1945 of magnitude in computation time. Moreover, the resultant RIS aided secure
1946 transmission significantly improves the energy efficiency compared to baseline
1947 schemes of random phase-shift, fixed phase-shift, and RIS ignoring CSI
1948 uncertainty.
1949 </p>
1950 </description>
1951 </item>
1952 <item>
1953 <title>Financial ticket intelligent recognition system based on deep learning. (arXiv:2010.15356v1 [cs.LG])</title>
1954 <link>http://fr.arxiv.org/abs/2010.15356</link>
1955 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_F/0/1/0/all/0/1">Fukang Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_H/0/1/0/all/0/1">Haiyu Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_B/0/1/0/all/0/1">Bo Xu</a></p>
1956
1957 <p>Facing the rapid growth in the issuance of financial tickets (or bills,
1958 invoices etc.), traditional manual invoice reimbursement and financial
1959 accounting system are imposing an increasing burden on financial accountants
1960 and consuming excessive manpower. To solve this problem, we proposes an
1961 iterative self-learning Framework of Financial Ticket intelligent Recognition
1962 System (FFTRS), which can support the fast iterative updating and extensibility
1963 of the algorithm model, which are the fundamental requirements for a practical
1964 financial accounting system. In addition, we designed a simple yet efficient
1965 Financial Ticket Faster Detection network (FTFDNet) and an intelligent data
1966 warehouse of financial ticket are designed to strengthen its efficiency and
1967 performance. At present, the system can recognize 194 kinds of financial
1968 tickets and has an automatic iterative optimization mechanism, which means,
1969 with the increase of application time, the types of tickets supported by the
1970 system will continue to increase, and the accuracy of recognition will continue
1971 to improve. Experimental results show that the average recognition accuracy of
1972 the system is 97.07%, and the average running time for a single ticket is
1973 175.67ms. The practical value of the system has been tested in a commercial
1974 application, which makes a beneficial attempt for the deep learning technology
1975 in financial accounting work.
1976 </p>
1977 </description>
1978 </item>
1979 <item>
1980 <title>A stochastic optimization algorithm for analyzing planar central and balanced configurations in the $n$-body problem. (arXiv:2010.15358v1 [math.DS])</title>
1981 <link>http://fr.arxiv.org/abs/2010.15358</link>
1982 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Doicu_A/0/1/0/all/0/1">Alexandru Doicu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Zhao_L/0/1/0/all/0/1">Lei Zhao</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Doicu_A/0/1/0/all/0/1">Adrian Doicu</a></p>
1983
1984 <p>A stochastic optimization algorithm for analyzing planar central and balanced
1985 configurations in the $n$-body problem is presented. We find a comprehensive
1986 list of equal mass central configurations satisfying the Morse equality up to
1987 $n=12$. We show some exemplary balanced configurations in the case $n=5$, as
1988 well as some balanced configurations without any axis of symmetry in the cases
1989 $n=4$ and $n=10$.
1990 </p>
1991 </description>
1992 </item>
1993 <item>
1994 <title>Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection. (arXiv:2010.15360v1 [cs.CL])</title>
1995 <link>http://fr.arxiv.org/abs/2010.15360</link>
1996 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shaolei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zhongyuan Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Che_W/0/1/0/all/0/1">Wanxiang Che</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_T/0/1/0/all/0/1">Ting Liu</a></p>
1997
1998 <p>Most existing approaches to disfluency detection heavily rely on
1999 human-annotated corpora, which is expensive to obtain in practice. There have
2000 been several proposals to alleviate this issue with, for instance,
2001 self-supervised learning techniques, but they still require human-annotated
2002 corpora. In this work, we explore the unsupervised learning paradigm which can
2003 potentially work with unlabeled text corpora that are cheaper and easier to
2004 obtain. Our model builds upon the recent work on Noisy Student Training, a
2005 semi-supervised learning approach that extends the idea of self-training.
2006 Experimental results on the commonly used English Switchboard test set show
2007 that our approach achieves competitive performance compared to the previous
2008 state-of-the-art supervised systems using contextualized word embeddings (e.g.
2009 BERT and ELECTRA).
2010 </p>
2011 </description>
2012 </item>
2013 <item>
2014 <title>Model-Agnostic Counterfactual Reasoning for Eliminating Popularity Bias in Recommender System. (arXiv:2010.15363v1 [cs.IR])</title>
2015 <link>http://fr.arxiv.org/abs/2010.15363</link>
2016 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_T/0/1/0/all/0/1">Tianxin Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_F/0/1/0/all/0/1">Fuli Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_J/0/1/0/all/0/1">Jiawei Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shi_C/0/1/0/all/0/1">Chufeng Shi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Z/0/1/0/all/0/1">Ziwei Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yi_J/0/1/0/all/0/1">Jinfeng Yi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_X/0/1/0/all/0/1">Xiangnan He</a></p>
2017
2018 <p>The general aim of the recommender system is to provide personalized
2019 suggestions to users, which is opposed to suggesting popular items. However,
2020 the normal training paradigm, i.e., fitting a recommender model to recover the
2021 user behavior data with pointwise or pairwise loss, makes the model biased
2022 towards popular items. This results in the terrible Matthew effect, making
2023 popular items be more frequently recommended and become even more popular.
2024 Existing work addresses this issue with Inverse Propensity Weighting (IPW),
2025 which decreases the impact of popular items on the training and increases the
2026 impact of long-tail items. Although theoretically sound, IPW methods are highly
2027 sensitive to the weighting strategy, which is notoriously difficult to tune.
2028 </p>
2029 <p>In this work, we explore the popularity bias issue from a novel and
2030 fundamental perspective -- cause-effect. We identify that popularity bias lies
2031 in the direct effect from the item node to the ranking score, such that an
2032 item's intrinsic property is the cause of mistakenly assigning it a higher
2033 ranking score. To eliminate popularity bias, it is essential to answer the
2034 counterfactual question that what the ranking score would be if the model only
2035 uses item property. To this end, we formulate a causal graph to describe the
2036 important cause-effect relations in the recommendation process. During
2037 training, we perform multi-task learning to achieve the contribution of each
2038 cause; during testing, we perform counterfactual inference to remove the effect
2039 of item popularity. Remarkably, our solution amends the learning process of
2040 recommendation which is agnostic to a wide range of models. We demonstrate it
2041 on Matrix Factorization (MF) and LightGCN, which are representative of the
2042 conventional and state-of-the-art model for collaborative filtering.
2043 Experiments on five real-world datasets demonstrate the effectiveness of our
2044 method.
2045 </p>
2046 </description>
2047 </item>
2048 <item>
2049 <title>Online State-Time Trajectory Planning Using Timed-ESDF in Highly Dynamic Environments. (arXiv:2010.15364v1 [cs.RO])</title>
2050 <link>http://fr.arxiv.org/abs/2010.15364</link>
2051 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_D/0/1/0/all/0/1">Delong Zhu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_T/0/1/0/all/0/1">Tong Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_J/0/1/0/all/0/1">Jiahui Lin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fang_Y/0/1/0/all/0/1">Yuqi Fang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Meng_M/0/1/0/all/0/1">Max Q.-H. Meng</a></p>
2052
2053 <p>Online state-time trajectory planning in highly dynamic environments remains
2054 an unsolved problem due to the unpredictable motions of moving obstacles and
2055 the curse of dimensionality from the state-time space. Existing state-time
2056 planners are typically implemented based on randomized sampling approaches or
2057 path searching on discretized state graph. The smoothness, path clearance, and
2058 planning efficiency of these planners are usually not satisfying. In this work,
2059 we propose a gradient-based planner over the state-time space for online
2060 trajectory generation in highly dynamic environments. To enable the
2061 gradient-based optimization, we propose a Timed-ESDT that supports distance and
2062 gradient queries with state-time keys. Based on the Timed-ESDT, we also define
2063 a smooth prior and an obstacle likelihood function that is compatible with the
2064 state-time space. The trajectory planning is then formulated to a MAP problem
2065 and solved by an efficient numerical optimizer. Moreover, to improve the
2066 optimality of the planner, we also define a state-time graph and then conduct
2067 path searching on it to find a better initialization for the optimizer. By
2068 integrating the graph searching, the planning quality is significantly
2069 improved. Experiment results on simulated and benchmark datasets show that our
2070 planner can outperform the state-of-the-art methods, demonstrating its
2071 significant advantages over the traditional ones.
2072 </p>
2073 </description>
2074 </item>
2075 <item>
2076 <title>Infinite Time Solutions of Numerical Schemes for Advection Problems. (arXiv:2010.15365v1 [math.NA])</title>
2077 <link>http://fr.arxiv.org/abs/2010.15365</link>
2078 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Biswas_A/0/1/0/all/0/1">Abhijit Biswas</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Seibold_B/0/1/0/all/0/1">Benjamin Seibold</a></p>
2079
2080 <p>This paper addresses the question whether there are numerical schemes for
2081 constant-coefficient advection problems that can yield convergent solutions for
2082 an infinite time horizon. The motivation is that such methods may serve as
2083 building blocks for long-time accurate solutions in more complex
2084 advection-dominated problems. After establishing a new notion of convergence in
2085 an infinite time limit of numerical methods, we first show that linear methods
2086 cannot meet this convergence criterion. Then we present a new numerical
2087 methodology, based on a nonlinear jet scheme framework. We show that these
2088 methods do satisfy the new convergence criterion, thus establishing that
2089 numerical methods exist that converge on an infinite time horizon, and
2090 demonstrate the long-time accuracy gains incurred by this property.
2091 </p>
2092 </description>
2093 </item>
2094 <item>
2095 <title>Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation. (arXiv:2010.15366v1 [cs.SD])</title>
2096 <link>http://fr.arxiv.org/abs/2010.15366</link>
2097 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_S/0/1/0/all/0/1">Sung-Feng Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chuang_S/0/1/0/all/0/1">Shun-Po Chuang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_D/0/1/0/all/0/1">Da-Rong Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yi-Chen Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_G/0/1/0/all/0/1">Gene-Ping Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_H/0/1/0/all/0/1">Hung-yi Lee</a></p>
2098
2099 <p>Speech separation has been well-developed while there are still problems
2100 waiting to be solved. The main problem we focus on in this paper is the
2101 frequent label permutation switching of permutation invariant training (PIT).
2102 For N-speaker separation, there would be N! possible label permutations. How to
2103 stably select correct label permutations is a long-standing problem. In this
2104 paper, we utilize self-supervised pre-training to stabilize the label
2105 permutations. Among several types of self-supervised tasks, speech enhancement
2106 based pre-training tasks show significant effectiveness in our experiments.
2107 When using off-the-shelf pre-trained models, training duration could be
2108 shortened to one-third to two-thirds. Furthermore, even taking pre-training
2109 time into account, the entire training process could still be shorter without a
2110 performance drop when using a larger batch size.
2111 </p>
2112 </description>
2113 </item>
2114 <item>
2115 <title>Learning Centric Wireless Resource Allocation for Edge Computing: Algorithm and Experiment. (arXiv:2010.15371v1 [cs.IT])</title>
2116 <link>http://fr.arxiv.org/abs/2010.15371</link>
2117 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_L/0/1/0/all/0/1">Liangkai Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hong_Y/0/1/0/all/0/1">Yuncong Hong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shuai Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Han_R/0/1/0/all/0/1">Ruihua Han</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_D/0/1/0/all/0/1">Dachuan Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_R/0/1/0/all/0/1">Rui Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hao_Q/0/1/0/all/0/1">Qi Hao</a></p>
2118
2119 <p>Edge intelligence is an emerging network architecture that integrates
2120 sensing, communication, computing components, and supports various machine
2121 learning applications, where a fundamental communication question is: how to
2122 allocate the limited wireless resources (such as time, energy) to the
2123 simultaneous model training of heterogeneous learning tasks? Existing methods
2124 ignore two important facts: 1) different models have heterogeneous demands on
2125 training data; 2) there is a mismatch between the simulated environment and the
2126 real-world environment. As a result, they could lead to low learning
2127 performance in practice. This paper proposes the learning centric wireless
2128 resource allocation (LCWRA) scheme that maximizes the worst learning
2129 performance of multiple classification tasks. Analysis shows that the optimal
2130 transmission time has an inverse power relationship with respect to the
2131 classification error. Finally, both simulation and experimental results are
2132 provided to verify the performance of the proposed LCWRA scheme and its
2133 robustness in real implementation.
2134 </p>
2135 </description>
2136 </item>
2137 <item>
2138 <title>Learning Personalized Discretionary Lane-Change Initiation for Fully Autonomous Driving Based on Reinforcement Learning. (arXiv:2010.15372v1 [cs.HC])</title>
2139 <link>http://fr.arxiv.org/abs/2010.15372</link>
2140 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Z/0/1/0/all/0/1">Zhuoxi Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zheng Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_B/0/1/0/all/0/1">Bo Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nakano_K/0/1/0/all/0/1">Kimihiko Nakano</a></p>
2141
2142 <p>In this article, the authors present a novel method to learn the personalized
2143 tactic of discretionary lane-change initiation for fully autonomous vehicles
2144 through human-computer interactions. Instead of learning from human-driving
2145 demonstrations, a reinforcement learning technique is employed to learn how to
2146 initiate lane changes from traffic context, the action of a self-driving
2147 vehicle, and in-vehicle user feedback. The proposed offline algorithm rewards
2148 the action-selection strategy when the user gives positive feedback and
2149 penalizes it when negative feedback. Also, a multi-dimensional driving scenario
2150 is considered to represent a more realistic lane-change trade-off. The results
2151 show that the lane-change initiation model obtained by this method can
2152 reproduce the personal lane-change tactic, and the performance of the
2153 customized models (average accuracy 86.1%) is much better than that of the
2154 non-customized models (average accuracy 75.7%). This method allows continuous
2155 improvement of customization for users during fully autonomous driving even
2156 without human-driving experience, which will significantly enhance the user
2157 acceptance of high-level autonomy of self-driving vehicles.
2158 </p>
2159 </description>
2160 </item>
2161 <item>
2162 <title>Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth. (arXiv:2010.15376v1 [eess.SP])</title>
2163 <link>http://fr.arxiv.org/abs/2010.15376</link>
2164 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Chen_W/0/1/0/all/0/1">Wei Chen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_B/0/1/0/all/0/1">Bowen Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jin_S/0/1/0/all/0/1">Shi Jin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ai_B/0/1/0/all/0/1">Bo Ai</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhong_Z/0/1/0/all/0/1">Zhangdui Zhong</a></p>
2165
2166 <p>Sparse signal recovery problems from noisy linear measurements appear in many
2167 areas of wireless communications. In recent years, deep learning (DL) based
2168 approaches have attracted interests of researchers to solve the sparse linear
2169 inverse problem by unfolding iterative algorithms as neural networks.
2170 Typically, research concerning DL assume a fixed number of network layers.
2171 However, it ignores a key character in traditional iterative algorithms, where
2172 the number of iterations required for convergence changes with varying sparsity
2173 levels. By investigating on the projected gradient descent, we unveil the
2174 drawbacks of the existing DL methods with fixed depth. Then we propose an
2175 end-to-end trainable DL architecture, which involves an extra halting score at
2176 each layer. Therefore, the proposed method learns how many layers to execute to
2177 emit an output, and the network depth is dynamically adjusted for each task in
2178 the inference phase. We conduct experiments using both synthetic data and
2179 applications including random access in massive MTC and massive MIMO channel
2180 estimation, and the results demonstrate the improved efficiency for the
2181 proposed approach.
2182 </p>
2183 </description>
2184 </item>
2185 <item>
2186 <title>Supervised sequential pattern mining of event sequences in sport to identify important patterns of play: an application to rugby union. (arXiv:2010.15377v1 [cs.LG])</title>
2187 <link>http://fr.arxiv.org/abs/2010.15377</link>
2188 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bunker_R/0/1/0/all/0/1">Rory Bunker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fujii_K/0/1/0/all/0/1">Keisuke Fujii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hanada_H/0/1/0/all/0/1">Hiroyuki Hanada</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Takeuchi_I/0/1/0/all/0/1">Ichiro Takeuchi</a></p>
2189
2190 <p>Given a set of sequences comprised of time-ordered events, sequential pattern
2191 mining is useful to identify frequent sub-sequences from different sequences or
2192 within the same sequence. However, in sport, these techniques cannot determine
2193 the importance of particular patterns of play to good or bad outcomes, which is
2194 often of greater interest to coaches. In this study, we apply a supervised
2195 sequential pattern mining algorithm called safe pattern pruning (SPP) to 490
2196 labelled event sequences representing passages of play from one rugby team's
2197 matches from the 2018 Japan Top League, and then evaluate the importance of the
2198 obtained sub-sequences to points-scoring outcomes. Linebreaks, successful
2199 lineouts, regained kicks in play, repeated phase-breakdown play, and failed
2200 opposition exit plays were identified as important patterns of play for the
2201 team scoring. When sequences were labelled with points scoring outcomes for the
2202 opposition teams, opposition team linebreaks, errors made by the team,
2203 opposition team lineouts, and repeated phase-breakdown play by the opposition
2204 team were identified as important patterns of play for the opposition team
2205 scoring. By virtue of its supervised nature and pruning properties, SPP
2206 obtained a greater variety of generally more sophisticated patterns than the
2207 well-known unsupervised PrefixSpan algorithm.
2208 </p>
2209 </description>
2210 </item>
2211 <item>
2212 <title>Collaborative Method for Incremental Learning on Classification and Generation. (arXiv:2010.15378v1 [cs.CV])</title>
2213 <link>http://fr.arxiv.org/abs/2010.15378</link>
2214 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_B/0/1/0/all/0/1">Byungju Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_J/0/1/0/all/0/1">Jaeyoung Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_K/0/1/0/all/0/1">Kyungsu Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_S/0/1/0/all/0/1">Sungjin Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_J/0/1/0/all/0/1">Junmo Kim</a></p>
2215
2216 <p>Although well-trained deep neural networks have shown remarkable performance
2217 on numerous tasks, they rapidly forget what they have learned as soon as they
2218 begin to learn with additional data with the previous data stop being provided.
2219 In this paper, we introduce a novel algorithm, Incremental Class Learning with
2220 Attribute Sharing (ICLAS), for incremental class learning with deep neural
2221 networks. As one of its component, we also introduce a generative model,
2222 incGAN, which can generate images with increased variety compared with the
2223 training data. Under challenging environment of data deficiency, ICLAS
2224 incrementally trains classification and the generation networks. Since ICLAS
2225 trains both networks, our algorithm can perform multiple times of incremental
2226 class learning. The experiments on MNIST dataset demonstrate the advantages of
2227 our algorithm.
2228 </p>
2229 </description>
2230 </item>
2231 <item>
2232 <title>The Performance Analysis of Generalized Margin Maximizer (GMM) on Separable Data. (arXiv:2010.15379v1 [stat.ML])</title>
2233 <link>http://fr.arxiv.org/abs/2010.15379</link>
2234 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Salehi_F/0/1/0/all/0/1">Fariborz Salehi</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Abbasi_E/0/1/0/all/0/1">Ehsan Abbasi</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Hassibi_B/0/1/0/all/0/1">Babak Hassibi</a></p>
2235
2236 <p>Logistic models are commonly used for binary classification tasks. The
2237 success of such models has often been attributed to their connection to
2238 maximum-likelihood estimators. It has been shown that gradient descent
2239 algorithm, when applied on the logistic loss, converges to the max-margin
2240 classifier (a.k.a. hard-margin SVM). The performance of the max-margin
2241 classifier has been recently analyzed. Inspired by these results, in this
2242 paper, we present and study a more general setting, where the underlying
2243 parameters of the logistic model possess certain structures (sparse,
2244 block-sparse, low-rank, etc.) and introduce a more general framework (which is
2245 referred to as "Generalized Margin Maximizer", GMM). While classical max-margin
2246 classifiers minimize the $2$-norm of the parameter vector subject to linearly
2247 separating the data, GMM minimizes any arbitrary convex function of the
2248 parameter vector. We provide a precise analysis of the performance of GMM via
2249 the solution of a system of nonlinear equations. We also provide a detailed
2250 study for three special cases: ($1$) $\ell_2$-GMM that is the max-margin
2251 classifier, ($2$) $\ell_1$-GMM which encourages sparsity, and ($3$)
2252 $\ell_{\infty}$-GMM which is often used when the parameter vector has binary
2253 entries. Our theoretical results are validated by extensive simulation results
2254 across a range of parameter values, problem instances, and model structures.
2255 </p>
2256 </description>
2257 </item>
2258 <item>
2259 <title>Learning to Actively Learn: A Robust Approach. (arXiv:2010.15382v1 [cs.LG])</title>
2260 <link>http://fr.arxiv.org/abs/2010.15382</link>
2261 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jifan Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jamieson_K/0/1/0/all/0/1">Kevin Jamieson</a></p>
2262
2263 <p>This work proposes a procedure for designing algorithms for specific adaptive
2264 data collection tasks like active learning and pure-exploration multi-armed
2265 bandits. Unlike the design of traditional adaptive algorithms that rely on
2266 concentration of measure and careful analysis to justify the correctness and
2267 sample complexity of the procedure, our adaptive algorithm is learned via
2268 adversarial training over equivalence classes of problems derived from
2269 information theoretic lower bounds. In particular, a single adaptive learning
2270 algorithm is learned that competes with the best adaptive algorithm learned for
2271 each equivalence class. Our procedure takes as input just the available
2272 queries, set of hypotheses, loss function, and total query budget. This is in
2273 contrast to existing meta-learning work that learns an adaptive algorithm
2274 relative to an explicit, user-defined subset or prior distribution over
2275 problems which can be challenging to define and be mismatched to the instance
2276 encountered at test time. This work is particularly focused on the regime when
2277 the total query budget is very small, such as a few dozen, which is much
2278 smaller than those budgets typically considered by theoretically derived
2279 algorithms. We perform synthetic experiments to justify the stability and
2280 effectiveness of the training procedure, and then evaluate the method on tasks
2281 derived from real data including a noisy 20 Questions game and a joke
2282 recommendation task.
2283 </p>
2284 </description>
2285 </item>
2286 <item>
2287 <title>Prediction-Based Power Oversubscription in Cloud Platforms. (arXiv:2010.15388v1 [cs.DC])</title>
2288 <link>http://fr.arxiv.org/abs/2010.15388</link>
2289 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kumbhare_A/0/1/0/all/0/1">Alok Kumbhare</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Azimi_R/0/1/0/all/0/1">Reza Azimi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Manousakis_I/0/1/0/all/0/1">Ioannis Manousakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bonde_A/0/1/0/all/0/1">Anand Bonde</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frujeri_F/0/1/0/all/0/1">Felipe Frujeri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mahalingam_N/0/1/0/all/0/1">Nithish Mahalingam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Misra_P/0/1/0/all/0/1">Pulkit Misra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Javadi_S/0/1/0/all/0/1">Seyyed Ahmad Javadi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schroeder_B/0/1/0/all/0/1">Bianca Schroeder</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fontoura_M/0/1/0/all/0/1">Marcus Fontoura</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bianchini_R/0/1/0/all/0/1">Ricardo Bianchini</a></p>
2290
2291 <p>Datacenter designers rely on conservative estimates of IT equipment power
2292 draw to provision resources. This leaves resources underutilized and requires
2293 more datacenters to be built. Prior work has used power capping to shave the
2294 rare power peaks and add more servers to the datacenter, thereby
2295 oversubscribing its resources and lowering capital costs. This works well when
2296 the workloads and their server placements are known. Unfortunately, these
2297 factors are unknown in public clouds, forcing providers to limit the
2298 oversubscription so that performance is never impacted.
2299 </p>
2300 <p>In this paper, we argue that providers can use predictions of workload
2301 performance criticality and virtual machine (VM) resource utilization to
2302 increase oversubscription. This poses many challenges, such as identifying the
2303 performance-critical workloads from black-box VMs, creating support for
2304 criticality-aware power management, and increasing oversubscription while
2305 limiting the impact of capping. We address these challenges for the hardware
2306 and software infrastructures of Microsoft Azure. The results show that we
2307 enable a 2x increase in oversubscription with minimum impact to critical
2308 workloads.
2309 </p>
2310 </description>
2311 </item>
2312 <item>
2313 <title>Learning Audio Embeddings with User Listening Data for Content-based Music Recommendation. (arXiv:2010.15389v1 [cs.SD])</title>
2314 <link>http://fr.arxiv.org/abs/2010.15389</link>
2315 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_K/0/1/0/all/0/1">Ke Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liang_B/0/1/0/all/0/1">Beici Liang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_X/0/1/0/all/0/1">Xiaoshuan Ma</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gu_M/0/1/0/all/0/1">Minwei Gu</a></p>
2316
2317 <p>Personalized recommendation on new track releases has always been a
2318 challenging problem in the music industry. To combat this problem, we first
2319 explore user listening history and demographics to construct a user embedding
2320 representing the user's music preference. With the user embedding and audio
2321 data from user's liked and disliked tracks, an audio embedding can be obtained
2322 for each track using metric learning with Siamese networks. For a new track, we
2323 can decide the best group of users to recommend by computing the similarity
2324 between the track's audio embedding and different user embeddings,
2325 respectively. The proposed system yields state-of-the-art performance on
2326 content-based music recommendation tested with millions of users and tracks.
2327 Also, we extract audio embeddings as features for music genre classification
2328 tasks. The results show the generalization ability of our audio embeddings.
2329 </p>
2330 </description>
2331 </item>
2332 <item>
2333 <title>Multitask Bandit Learning through Heterogeneous Feedback Aggregation. (arXiv:2010.15390v1 [cs.LG])</title>
2334 <link>http://fr.arxiv.org/abs/2010.15390</link>
2335 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zhi Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1">Chicheng Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_M/0/1/0/all/0/1">Manish Kumar Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Riek_L/0/1/0/all/0/1">Laurel D. Riek</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chaudhuri_K/0/1/0/all/0/1">Kamalika Chaudhuri</a></p>
2336
2337 <p>In many real-world applications, multiple agents seek to learn how to perform
2338 highly related yet slightly different tasks in an online bandit learning
2339 protocol. We formulate this problem as the $\epsilon$-multi-player multi-armed
2340 bandit problem, in which a set of players concurrently interact with a set of
2341 arms, and for each arm, the reward distributions for all players are similar
2342 but not necessarily identical. We develop an upper confidence bound-based
2343 algorithm, RobustAgg$(\epsilon)$, that adaptively aggregates rewards collected
2344 by different players. In the setting where an upper bound on the pairwise
2345 similarities of reward distributions between players is known, we achieve
2346 instance-dependent regret guarantees that depend on the amenability of
2347 information sharing across players. We complement these upper bounds with
2348 nearly matching lower bounds. In the setting where pairwise similarities are
2349 unknown, we provide a lower bound, as well as an algorithm that trades off
2350 minimax regret guarantees for adaptivity to unknown similarity structure.
2351 </p>
2352 </description>
2353 </item>
2354 <item>
2355 <title>Robustifying Binary Classification to Adversarial Perturbation. (arXiv:2010.15391v1 [cs.LG])</title>
2356 <link>http://fr.arxiv.org/abs/2010.15391</link>
2357 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Salehi_F/0/1/0/all/0/1">Fariborz Salehi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hassibi_B/0/1/0/all/0/1">Babak Hassibi</a></p>
2358
2359 <p>Despite the enormous success of machine learning models in various
2360 applications, most of these models lack resilience to (even small)
2361 perturbations in their input data. Hence, new methods to robustify machine
2362 learning models seem very essential. To this end, in this paper we consider the
2363 problem of binary classification with adversarial perturbations. Investigating
2364 the solution to a min-max optimization (which considers the worst-case loss in
2365 the presence of adversarial perturbations) we introduce a generalization to the
2366 max-margin classifier which takes into account the power of the adversary in
2367 manipulating the data. We refer to this classifier as the "Robust Max-margin"
2368 (RM) classifier. Under some mild assumptions on the loss function, we
2369 theoretically show that the gradient descent iterates (with sufficiently small
2370 step size) converge to the RM classifier in its direction. Therefore, the RM
2371 classifier can be studied to compute various performance measures (e.g.
2372 generalization error) of binary classification with adversarial perturbations.
2373 </p>
2374 </description>
2375 </item>
2376 <item>
2377 <title>Off-Policy Interval Estimation with Lipschitz Value Iteration. (arXiv:2010.15392v1 [cs.LG])</title>
2378 <link>http://fr.arxiv.org/abs/2010.15392</link>
2379 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_Z/0/1/0/all/0/1">Ziyang Tang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_Y/0/1/0/all/0/1">Yihao Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_N/0/1/0/all/0/1">Na Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_J/0/1/0/all/0/1">Jian Peng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Q/0/1/0/all/0/1">Qiang Liu</a></p>
2380
2381 <p>Off-policy evaluation provides an essential tool for evaluating the effects
2382 of different policies or treatments using only observed data. When applied to
2383 high-stakes scenarios such as medical diagnosis or financial decision-making,
2384 it is crucial to provide provably correct upper and lower bounds of the
2385 expected reward, not just a classical single point estimate, to the end-users,
2386 as executing a poor policy can be very costly. In this work, we propose a
2387 provably correct method for obtaining interval bounds for off-policy evaluation
2388 in a general continuous setting. The idea is to search for the maximum and
2389 minimum values of the expected reward among all the Lipschitz Q-functions that
2390 are consistent with the observations, which amounts to solving a constrained
2391 optimization problem on a Lipschitz function space. We go on to introduce a
2392 Lipschitz value iteration method to monotonically tighten the interval, which
2393 is simple yet efficient and provably convergent. We demonstrate the practical
2394 efficiency of our method on a range of benchmarks.
2395 </p>
2396 </description>
2397 </item>
2398 <item>
2399 <title>Discovery and classification of Twitter bots. (arXiv:2010.15393v1 [cs.SI])</title>
2400 <link>http://fr.arxiv.org/abs/2010.15393</link>
2401 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shevtsov_A/0/1/0/all/0/1">Alexander Shevtsov Alexander Shevtsov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Oikonomidou_M/0/1/0/all/0/1">Maria Oikonomidou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Antonakaki_D/0/1/0/all/0/1">Despoina Antonakaki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pratikakis_P/0/1/0/all/0/1">Polyvios Pratikakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kanterakis_A/0/1/0/all/0/1">Alexandros Kanterakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ioannidis_S/0/1/0/all/0/1">Sotiris Ioannidis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fragopoulou_P/0/1/0/all/0/1">Paraskevi Fragopoulou</a></p>
2402
2403 <p>A very large number of people use Online Social Networks daily. Such
2404 platforms thus become attractive targets for agents that seek to gain access to
2405 the attention of large audiences, and influence perceptions or opinions.
2406 Botnets, collections of automated accounts controlled by a single agent, are a
2407 common mechanism for exerting maximum influence. Botnets may be used to better
2408 infiltrate the social graph over time and to create an illusion of community
2409 behavior, amplifying their message and increasing persuasion.
2410 </p>
2411 <p>This paper investigates Twitter botnets, their behavior, their interaction
2412 with user communities and their evolution over time. We analyzed a dense crawl
2413 of a subset of Twitter traffic, amounting to nearly all interactions by
2414 Greek-speaking Twitter users for a period of 36 months. We detected over a
2415 million events where seemingly unrelated accounts tweeted nearly identical
2416 content at nearly the same time. We filtered these concurrent content injection
2417 events and detected a set of 1,850 accounts that repeatedly exhibit this
2418 pattern of behavior, suggesting that they are fully or in part controlled and
2419 orchestrated by the same software. We found botnets that appear for brief
2420 intervals and disappear, as well as botnets that evolve and grow, spanning the
2421 duration of our dataset. We analyze statistical differences between bot
2422 accounts and human users, as well as botnet interaction with user communities
2423 and Twitter trending topics.
2424 </p>
2425 </description>
2426 </item>
2427 <item>
2428 <title>Smart Homes: Security Challenges and Privacy Concerns. (arXiv:2010.15394v1 [cs.CR])</title>
2429 <link>http://fr.arxiv.org/abs/2010.15394</link>
2430 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hall_F/0/1/0/all/0/1">Fraser Hall</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maglaras_L/0/1/0/all/0/1">Leandros Maglaras</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Aivaliotis_T/0/1/0/all/0/1">Theodoros Aivaliotis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xagoraris_L/0/1/0/all/0/1">Loukas Xagoraris</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kantzavelou_I/0/1/0/all/0/1">Ioanna Kantzavelou</a></p>
2431
2432 <p>Development and growth of Internet of Things (IoT) technology has
2433 exponentially increased over the course of the last 10 years since its
2434 inception, and as a result has directly influenced the popularity and size of
2435 smart homes. In this article we present the main technologies and applications
2436 that constitute a smart home, we identify the main security and privacy
2437 challenges that smart home face and we provide good practices to mitigate those
2438 threats.
2439 </p>
2440 </description>
2441 </item>
2442 <item>
2443 <title>Channel Estimation and Equalization for CP-OFDM-based OTFS in Fractional Doppler Channels. (arXiv:2010.15396v1 [cs.IT])</title>
2444 <link>http://fr.arxiv.org/abs/2010.15396</link>
2445 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hashimoto_N/0/1/0/all/0/1">Noriyuki Hashimoto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Osawa_N/0/1/0/all/0/1">Noboru Osawa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yamazaki_K/0/1/0/all/0/1">Kosuke Yamazaki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ibi_S/0/1/0/all/0/1">Shinsuke Ibi</a></p>
2446
2447 <p>Orthogonal time frequency and space (OTFS) modulation is a promising
2448 technology that satisfies high Doppler requirements for future mobile systems.
2449 OTFS modulation encodes information symbols and pilot symbols into the
2450 two-dimensional (2D) delay-Doppler (DD) domain. The received symbols suffer
2451 from inter-Doppler interference (IDI) in the fading channels with fractional
2452 Doppler shifts that are sampled at noninteger indices in the DD domain. IDI has
2453 been treated as an unavoidable effect because the fractional Doppler shifts
2454 cannot be obtained directly from the received pilot symbols. In this paper, we
2455 provide a solution to channel estimation for fractional Doppler channels. The
2456 proposed estimation provides new insight into the OTFS input-output relation in
2457 the DD domain as a 2D circular convolution with a small approximation.
2458 According to the input-output relation, we also provide a low-complexity
2459 channel equalization method using the estimated channel information. We
2460 demonstrate the error performance of the proposed channel estimation and
2461 equalization in several channels by simulations. The simulation results show
2462 that in high-mobility environments, the total system utilizing the proposed
2463 methods outperforms orthogonal frequency division multiplexing (OFDM) with
2464 ideal channel estimation and a conventional channel estimation method using a
2465 pseudo sequence.
2466 </p>
2467 </description>
2468 </item>
2469 <item>
2470 <title>Free-boundary conformal parameterization of point clouds. (arXiv:2010.15399v1 [cs.CG])</title>
2471 <link>http://fr.arxiv.org/abs/2010.15399</link>
2472 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1">Yechen Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Choi_G/0/1/0/all/0/1">Gary P. T. Choi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lui_L/0/1/0/all/0/1">Lok Ming Lui</a></p>
2473
2474 <p>With the advancement in 3D scanning technology, there has been a surge of
2475 interest in the use of point clouds in science and engineering. To facilitate
2476 the computations and analyses of point clouds, prior works have considered
2477 parameterizing them onto some simple planar domains with a fixed boundary shape
2478 such as a unit circle or a rectangle. However, the geometry of the fixed shape
2479 may lead to some undesirable distortion in the parameterization. It is
2480 therefore more natural to consider free-boundary conformal parameterizations of
2481 point clouds, which minimize the local geometric distortion of the mapping
2482 without constraining the overall shape. In this work, we propose a novel
2483 approximation scheme of the Laplace--Beltrami operator on point clouds and
2484 utilize it for developing a free-boundary conformal parameterization method for
2485 disk-type point clouds. With the aid of the free-boundary conformal
2486 parameterization, high-quality point cloud meshing can be easily achieved.
2487 Furthermore, we show that using the idea of conformal welding in complex
2488 analysis, the point cloud conformal parameterization can be computed in a
2489 divide-and-conquer manner. Experimental results are presented to demonstrate
2490 the effectiveness of the proposed method.
2491 </p>
2492 </description>
2493 </item>
2494 <item>
2495 <title>On Efficient and Scalable Time-Continuous Spatial Crowdsourcing -- Full Version. (arXiv:2010.15404v1 [cs.DB])</title>
2496 <link>http://fr.arxiv.org/abs/2010.15404</link>
2497 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_T/0/1/0/all/0/1">Ting Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_X/0/1/0/all/0/1">Xike Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_X/0/1/0/all/0/1">Xin Cao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pedersen_T/0/1/0/all/0/1">Torben Bach Pedersen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xiao_M/0/1/0/all/0/1">Mingjun Xiao</a></p>
2498
2499 <p>The proliferation of advanced mobile terminals opened up a new crowdsourcing
2500 avenue, spatial crowdsourcing, to utilize the crowd potential to perform
2501 real-world tasks. In this work, we study a new type of spatial crowdsourcing,
2502 called time-continuous spatial crowdsourcing (TCSC in short). It supports broad
2503 applications for long-term continuous spatial data acquisition, ranging from
2504 environmental monitoring to traffic surveillance in citizen science and
2505 crowdsourcing projects. However, due to limited budgets and limited
2506 availability of workers in practice, the data collected is often incomplete,
2507 incurring data deficiency problem. To tackle that, in this work, we first
2508 propose an entropy-based quality metric, which captures the joint effects of
2509 incompletion in data acquisition and the imprecision in data interpolation.
2510 Based on that, we investigate quality-aware task assignment methods for both
2511 single- and multi-task scenarios. We show the NP-hardness of the single-task
2512 case, and design polynomial-time algorithms with guaranteed approximation
2513 ratios. We study novel indexing and pruning techniques for further enhancing
2514 the performance in practice. Then, we extend the solution to multi-task
2515 scenarios and devise a parallel framework for speeding up the process of
2516 optimization. We conduct extensive experiments on both real and synthetic
2517 datasets to show the effectiveness of our proposals.
2518 </p>
2519 </description>
2520 </item>
2521 <item>
2522 <title>Conversation Graph: Data Augmentation, Training and Evaluation for Non-Deterministic Dialogue Management. (arXiv:2010.15411v1 [cs.CL])</title>
2523 <link>http://fr.arxiv.org/abs/2010.15411</link>
2524 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gritta_M/0/1/0/all/0/1">Milan Gritta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lampouras_G/0/1/0/all/0/1">Gerasimos Lampouras</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Iacobacci_I/0/1/0/all/0/1">Ignacio Iacobacci</a></p>
2525
2526 <p>Task-oriented dialogue systems typically rely on large amounts of
2527 high-quality training data or require complex handcrafted rules. However,
2528 existing datasets are often limited in size considering the complexity of the
2529 dialogues. Additionally, conventional training signal inference is not suitable
2530 for non-deterministic agent behaviour, i.e. considering multiple actions as
2531 valid in identical dialogue states. We propose the Conversation Graph
2532 (ConvGraph), a graph-based representation of dialogues that can be exploited
2533 for data augmentation, multi-reference training and evaluation of
2534 non-deterministic agents. ConvGraph generates novel dialogue paths to augment
2535 data volume and diversity. Intrinsic and extrinsic evaluation across three
2536 datasets shows that data augmentation and/or multi-reference training with
2537 ConvGraph can improve dialogue success rates by up to 6.4%.
2538 </p>
2539 </description>
2540 </item>
2541 <item>
2542 <title>Measuring and Harnessing Transference in Multi-Task Learning. (arXiv:2010.15413v1 [cs.LG])</title>
2543 <link>http://fr.arxiv.org/abs/2010.15413</link>
2544 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fifty_C/0/1/0/all/0/1">Christopher Fifty</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Amid_E/0/1/0/all/0/1">Ehsan Amid</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_Z/0/1/0/all/0/1">Zhe Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_T/0/1/0/all/0/1">Tianhe Yu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Anil_R/0/1/0/all/0/1">Rohan Anil</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Finn_C/0/1/0/all/0/1">Chelsea Finn</a></p>
2545
2546 <p>Multi-task learning can leverage information learned by one task to benefit
2547 the training of other tasks. Despite this capacity, na\"ive formulations often
2548 degrade performance and in particular, identifying the tasks that would benefit
2549 from co-training remains a challenging design question. In this paper, we
2550 analyze the dynamics of information transfer, or transference, across tasks
2551 throughout training. Specifically, we develop a similarity measure that can
2552 quantify transference among tasks and use this quantity to both better
2553 understand the optimization dynamics of multi-task learning as well as improve
2554 overall learning performance. In the latter case, we propose two methods to
2555 leverage our transference metric. The first operates at a macro-level by
2556 selecting which tasks should train together while the second functions at a
2557 micro-level by determining how to combine task gradients at each training step.
2558 We find these methods can lead to significant improvement over prior work on
2559 three supervised multi-task learning benchmarks and one multi-task
2560 reinforcement learning paradigm.
2561 </p>
2562 </description>
2563 </item>
2564 <item>
2565 <title>A Novel Anomaly Detection Algorithm for Hybrid Production Systems based on Deep Learning and Timed Automata. (arXiv:2010.15415v1 [cs.LG])</title>
2566 <link>http://fr.arxiv.org/abs/2010.15415</link>
2567 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hranisavljevic_N/0/1/0/all/0/1">Nemanja Hranisavljevic</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Niggemann_O/0/1/0/all/0/1">Oliver Niggemann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maier_A/0/1/0/all/0/1">Alexander Maier</a></p>
2568
2569 <p>Performing anomaly detection in hybrid systems is a challenging task since it
2570 requires analysis of timing behavior and mutual dependencies of both discrete
2571 and continuous signals. Typically, it requires modeling system behavior, which
2572 is often accomplished manually by human engineers. Using machine learning for
2573 creating a behavioral model from observations has advantages, such as lower
2574 development costs and fewer requirements for specific knowledge about the
2575 system. The paper presents DAD:DeepAnomalyDetection, a new approach for
2576 automatic model learning and anomaly detection in hybrid production systems. It
2577 combines deep learning and timed automata for creating behavioral model from
2578 observations. The ability of deep belief nets to extract binary features from
2579 real-valued inputs is used for transformation of continuous to discrete
2580 signals. These signals, together with the original discrete signals are than
2581 handled in an identical way. Anomaly detection is performed by the comparison
2582 of actual and predicted system behavior. The algorithm has been applied to few
2583 data sets including two from real systems and has shown promising results.
2584 </p>
2585 </description>
2586 </item>
2587 <item>
2588 <title>ProCAN: Progressive Growing Channel Attentive Non-Local Network for Lung Nodule Classification. (arXiv:2010.15417v1 [eess.IV])</title>
2589 <link>http://fr.arxiv.org/abs/2010.15417</link>
2590 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Al_Shabi_M/0/1/0/all/0/1">Mundher Al-Shabi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shak_K/0/1/0/all/0/1">Kelvin Shak</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Tan_M/0/1/0/all/0/1">Maxine Tan</a></p>
2591
2592 <p>Lung cancer classification in screening computed tomography (CT) scans is one
2593 of the most crucial tasks for early detection of this disease. Many lives can
2594 be saved if we are able to accurately classify malignant/ cancerous lung
2595 nodules. Consequently, several deep learning based models have been proposed
2596 recently to classify lung nodules as malignant or benign. Nevertheless, the
2597 large variation in the size and heterogeneous appearance of the nodules makes
2598 this task an extremely challenging one. We propose a new Progressive Growing
2599 Channel Attentive Non-Local (ProCAN) network for lung nodule classification.
2600 The proposed method addresses this challenge from three different aspects.
2601 First, we enrich the Non-Local network by adding channel-wise attention
2602 capability to it. Second, we apply Curriculum Learning principles, whereby we
2603 first train our model on easy examples before hard/ difficult ones. Third, as
2604 the classification task gets harder during the Curriculum learning, our model
2605 is progressively grown to increase its capability of handling the task at hand.
2606 We examined our proposed method on two different public datasets and compared
2607 its performance with state-of-the-art methods in the literature. The results
2608 show that the ProCAN model outperforms state-of-the-art methods and achieves an
2609 AUC of 98.05% and accuracy of 95.28% on the LIDC-IDRI dataset. Moreover, we
2610 conducted extensive ablation studies to analyze the contribution and effects of
2611 each new component of our proposed method.
2612 </p>
2613 </description>
2614 </item>
2615 <item>
2616 <title>Scalable Graph Neural Networks via Bidirectional Propagation. (arXiv:2010.15421v1 [cs.LG])</title>
2617 <link>http://fr.arxiv.org/abs/2010.15421</link>
2618 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_M/0/1/0/all/0/1">Ming Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_Z/0/1/0/all/0/1">Zhewei Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_B/0/1/0/all/0/1">Bolin Ding</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yaliang Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yuan_Y/0/1/0/all/0/1">Ye Yuan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Du_X/0/1/0/all/0/1">Xiaoyong Du</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_J/0/1/0/all/0/1">Ji-Rong Wen</a></p>
2619
2620 <p>Graph Neural Networks (GNN) is an emerging field for learning on
2621 non-Euclidean data. Recently, there has been increased interest in designing
2622 GNN that scales to large graphs. Most existing methods use "graph sampling" or
2623 "layer-wise sampling" techniques to reduce training time. However, these
2624 methods still suffer from degrading performance and scalability problems when
2625 applying to graphs with billions of edges. This paper presents GBP, a scalable
2626 GNN that utilizes a localized bidirectional propagation process from both the
2627 feature vectors and the training/testing nodes. Theoretical analysis shows that
2628 GBP is the first method that achieves sub-linear time complexity for both the
2629 precomputation and the training phases. An extensive empirical study
2630 demonstrates that GBP achieves state-of-the-art performance with significantly
2631 less training/testing time. Most notably, GBP can deliver superior performance
2632 on a graph with over 60 million nodes and 1.8 billion edges in less than half
2633 an hour on a single machine.
2634 </p>
2635 </description>
2636 </item>
2637 <item>
2638 <title>Tilde at WMT 2020: News Task Systems. (arXiv:2010.15423v1 [cs.CL])</title>
2639 <link>http://fr.arxiv.org/abs/2010.15423</link>
2640 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Krislauks_R/0/1/0/all/0/1">Rihards Kri&#x161;lauks</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pinnis_M/0/1/0/all/0/1">M&#x101;rcis Pinnis</a></p>
2641
2642 <p>This paper describes Tilde's submission to the WMT2020 shared task on news
2643 translation for both directions of the English-Polish language pair in both the
2644 constrained and the unconstrained tracks. We follow our submissions from the
2645 previous years and build our baseline systems to be morphologically motivated
2646 sub-word unit-based Transformer base models that we train using the Marian
2647 machine translation toolkit. Additionally, we experiment with different
2648 parallel and monolingual data selection schemes, as well as sampled
2649 back-translation. Our final models are ensembles of Transformer base and
2650 Transformer big models that feature right-to-left re-ranking.
2651 </p>
2652 </description>
2653 </item>
2654 <item>
2655 <title>Detection of asteroid trails in Hubble Space Telescope images using Deep Learning. (arXiv:2010.15425v1 [astro-ph.IM])</title>
2656 <link>http://fr.arxiv.org/abs/2010.15425</link>
2657 <description><p>Authors: <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Parfeni_A/0/1/0/all/0/1">Andrei A. Parfeni</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Caramete_L/0/1/0/all/0/1">Laurentiu I. Caramete</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Dobre_A/0/1/0/all/0/1">Andreea M. Dobre</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Bach_N/0/1/0/all/0/1">Nguyen Tran Bach</a></p>
2658
2659 <p>We present an application of Deep Learning for the image recognition of
2660 asteroid trails in single-exposure photos taken by the Hubble Space Telescope.
2661 Using algorithms based on multi-layered deep Convolutional Neural Networks, we
2662 report accuracies of above 80% on the validation set. Our project was motivated
2663 by the Hubble Asteroid Hunter project on Zooniverse, which focused on
2664 identifying these objects in order to localize and better characterize them. We
2665 aim to demonstrate that Machine Learning techniques can be very useful in
2666 trying to solve problems that are closely related to Astronomy and
2667 Astrophysics, but that they are still not developed enough for very specific
2668 tasks.
2669 </p>
2670 </description>
2671 </item>
2672 <item>
2673 <title>Physics-informed deep learning for flow and deformation in poroelastic media. (arXiv:2010.15426v1 [cs.CE])</title>
2674 <link>http://fr.arxiv.org/abs/2010.15426</link>
2675 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bekele_Y/0/1/0/all/0/1">Yared W. Bekele</a></p>
2676
2677 <p>A physics-informed neural network is presented for poroelastic problems with
2678 coupled flow and deformation processes. The governing equilibrium and mass
2679 balance equations are discussed and specific derivations for two-dimensional
2680 cases are presented. A fully-connected deep neural network is used for
2681 training. Barry and Mercer's source problem with time-dependent fluid
2682 injection/extraction in an idealized poroelastic medium, which has an exact
2683 analytical solution, is used as a numerical example. A random sample from the
2684 analytical solution is used as training data and the performance of the model
2685 is tested by predicting the solution on the entire domain after training. The
2686 deep learning model predicts the horizontal and vertical deformations well
2687 while the error in the predicted pore pressure predictions is slightly higher
2688 because of the sparsity of the pore pressure values.
2689 </p>
2690 </description>
2691 </item>
2692 <item>
2693 <title>Sparse Signal Reconstruction for Nonlinear Models via Piecewise Rational Optimization. (arXiv:2010.15427v1 [math.OC])</title>
2694 <link>http://fr.arxiv.org/abs/2010.15427</link>
2695 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Marmin_A/0/1/0/all/0/1">Arthur Marmin</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Castella_M/0/1/0/all/0/1">Marc Castella</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Pesquet_J/0/1/0/all/0/1">Jean-Christophe Pesquet</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Duval_L/0/1/0/all/0/1">Laurent Duval</a></p>
2696
2697 <p>We propose a method to reconstruct sparse signals degraded by a nonlinear
2698 distortion and acquired at a limited sampling rate. Our method formulates the
2699 reconstruction problem as a nonconvex minimization of the sum of a data fitting
2700 term and a penalization term. In contrast with most previous works which settle
2701 for approximated local solutions, we seek for a global solution to the obtained
2702 challenging nonconvex problem. Our global approach relies on the so-called
2703 Lasserre relaxation of polynomial optimization. We here specifically include in
2704 our approach the case of piecewise rational functions, which makes it possible
2705 to address a wide class of nonconvex exact and continuous relaxations of the
2706 $\ell_0$ penalization function. Additionally, we study the complexity of the
2707 optimization problem. It is shown how to use the structure of the problem to
2708 lighten the computational burden efficiently. Finally, numerical simulations
2709 illustrate the benefits of our method in terms of both global optimality and
2710 signal reconstruction.
2711 </p>
2712 </description>
2713 </item>
2714 <item>
2715 <title>Self-paced Data Augmentation for Training Neural Networks. (arXiv:2010.15434v1 [cs.LG])</title>
2716 <link>http://fr.arxiv.org/abs/2010.15434</link>
2717 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Takase_T/0/1/0/all/0/1">Tomoumi Takase</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Karakida_R/0/1/0/all/0/1">Ryo Karakida</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Asoh_H/0/1/0/all/0/1">Hideki Asoh</a></p>
2718
2719 <p>Data augmentation is widely used for machine learning; however, an effective
2720 method to apply data augmentation has not been established even though it
2721 includes several factors that should be tuned carefully. One such factor is
2722 sample suitability, which involves selecting samples that are suitable for data
2723 augmentation. A typical method that applies data augmentation to all training
2724 samples disregards sample suitability, which may reduce classifier performance.
2725 To address this problem, we propose the self-paced augmentation (SPA) to
2726 automatically and dynamically select suitable samples for data augmentation
2727 when training a neural network. The proposed method mitigates the deterioration
2728 of generalization performance caused by ineffective data augmentation. We
2729 discuss two reasons the proposed SPA works relative to curriculum learning and
2730 desirable changes to loss function instability. Experimental results
2731 demonstrate that the proposed SPA can improve the generalization performance,
2732 particularly when the number of training samples is small. In addition, the
2733 proposed SPA outperforms the state-of-the-art RandAugment method.
2734 </p>
2735 </description>
2736 </item>
2737 <item>
2738 <title>Group-Harmonic and Group-Closeness Maximization -- Approximation and Engineering. (arXiv:2010.15435v1 [cs.DS])</title>
2739 <link>http://fr.arxiv.org/abs/2010.15435</link>
2740 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Angriman_E/0/1/0/all/0/1">Eugenio Angriman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Becker_R/0/1/0/all/0/1">Ruben Becker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+DAngelo_G/0/1/0/all/0/1">Gianlorenzo D&#x27;Angelo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gilbert_H/0/1/0/all/0/1">Hugo Gilbert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Grinten_A/0/1/0/all/0/1">Alexander van der Grinten</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Meyerhenke_H/0/1/0/all/0/1">Henning Meyerhenke</a></p>
2741
2742 <p>Centrality measures characterize important nodes in networks. Efficiently
2743 computing such nodes has received a lot of attention. When considering the
2744 generalization of computing central groups of nodes, challenging optimization
2745 problems occur. In this work, we study two such problems, group-harmonic
2746 maximization and group-closeness maximization both from a theoretical and from
2747 an algorithm engineering perspective.
2748 </p>
2749 <p>On the theoretical side, we obtain the following results. For group-harmonic
2750 maximization, unless $P=NP$, there is no polynomial-time algorithm that
2751 achieves an approximation factor better than $1-1/e$ (directed) and $1-1/(4e)$
2752 (undirected), even for unweighted graphs. On the positive side, we show that a
2753 greedy algorithm achieves an approximation factor of $\lambda(1-2/e)$
2754 (directed) and $\lambda(1-1/e)/2$ (undirected), where $\lambda$ is the ratio of
2755 minimal and maximal edge weights. For group-closeness maximization, the
2756 undirected case is $NP$-hard to be approximated to within a factor better than
2757 $1-1/(e+1)$ and a constant approximation factor is achieved by a local-search
2758 algorithm. For the directed case, however, we show that, for any
2759 $\epsilon&lt;1/2$, the problem is $NP$-hard to be approximated within a factor of
2760 $4|V|^{-\epsilon}$.
2761 </p>
2762 <p>From the algorithm engineering perspective, we provide efficient
2763 implementations of the above greedy and local search algorithms. In our
2764 experimental study we show that, on small instances where an optimum solution
2765 can be computed in reasonable time, the quality of both the greedy and the
2766 local search algorithms come very close to the optimum. On larger instances,
2767 our local search algorithms yield results with superior quality compared to
2768 existing greedy and local search solutions, at the cost of additional running
2769 time. We thus advocate local search for scenarios where solution quality is of
2770 highest concern.
2771 </p>
2772 </description>
2773 </item>
2774 <item>
2775 <title>Affordance-Aware Handovers with Human Arm Mobility Constraints. (arXiv:2010.15436v1 [cs.RO])</title>
2776 <link>http://fr.arxiv.org/abs/2010.15436</link>
2777 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ardon_P/0/1/0/all/0/1">Paola Ard&#xf3;n</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cabrera_M/0/1/0/all/0/1">Maria E. Cabrera</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pairet_E/0/1/0/all/0/1">&#xc8;ric Pairet</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Petrick_R/0/1/0/all/0/1">Ronald P. A. Petrick</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramamoorthy_S/0/1/0/all/0/1">Subramanian Ramamoorthy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lohan_K/0/1/0/all/0/1">Katrin S. Lohan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cakmak_M/0/1/0/all/0/1">Maya Cakmak</a></p>
2778
2779 <p>Reasoning about object handover configurations allows an assistive agent to
2780 estimate the appropriateness of handover for a receiver with different arm
2781 mobility capacities. While there are existing approaches to estimating the
2782 effectiveness of handovers, their findings are limited to users without arm
2783 mobility impairments and to specific objects. Therefore, current
2784 state-of-the-art approaches are unable to hand over novel objects to receivers
2785 with different arm mobility capacities. We propose a method that generalises
2786 handover behaviours to previously unseen objects, subject to the constraint of
2787 a user's arm mobility levels and the task context. We propose a
2788 heuristic-guided hierarchically optimised cost whose optimisation adapts object
2789 configurations for receivers with low arm mobility. This also ensures that the
2790 robot grasps consider the context of the user's upcoming task, i.e., the usage
2791 of the object. To understand preferences over handover configurations, we
2792 report on the findings of an online study, wherein we presented different
2793 handover methods, including ours, to $259$ users with different levels of arm
2794 mobility. We encapsulate these preferences in a SRL that is able to reason
2795 about the most suitable handover configuration given a receiver's arm mobility
2796 and upcoming task. We find that people's preferences over handover methods are
2797 correlated to their arm mobility capacities. In experiments with a PR2 robotic
2798 platform, we obtained an average handover accuracy of $90.8\%$ when
2799 generalising handovers to novel objects.
2800 </p>
2801 </description>
2802 </item>
2803 <item>
2804 <title>Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model. (arXiv:2010.15437v1 [cs.CL])</title>
2805 <link>http://fr.arxiv.org/abs/2010.15437</link>
2806 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ihori_M/0/1/0/all/0/1">Mana Ihori</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Masumura_R/0/1/0/all/0/1">Ryo Masumura</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Makishima_N/0/1/0/all/0/1">Naoki Makishima</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tanaka_T/0/1/0/all/0/1">Tomohiro Tanaka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Takashima_A/0/1/0/all/0/1">Akihiko Takashima</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Orihashi_S/0/1/0/all/0/1">Shota Orihashi</a></p>
2807
2808 <p>This paper presents a novel fusion method for integrating an external
2809 language model (LM) into the Transformer based sequence-to-sequence (seq2seq)
2810 model. While paired data are basically required to train the seq2seq model, the
2811 external LM can be trained with only unpaired data. Thus, it is important to
2812 leverage memorized knowledge in the external LM for building the seq2seq model,
2813 since it is hard to prepare a large amount of paired data. However, the
2814 existing fusion methods assume that the LM is integrated with recurrent neural
2815 network-based seq2seq models instead of the Transformer. Therefore, this paper
2816 proposes a fusion method that can explicitly utilize network structures in the
2817 Transformer. The proposed method, called {\bf memory attentive fusion},
2818 leverages the Transformer-style attention mechanism that repeats source-target
2819 attention in a multi-hop manner for reading the memorized knowledge in the LM.
2820 Our experiments on two text-style conversion tasks demonstrate that the
2821 proposed method performs better than conventional fusion methods.
2822 </p>
2823 </description>
2824 </item>
2825 <item>
2826 <title>Modeling and Control of COVID-19 Epidemic through Testing Policies. (arXiv:2010.15438v1 [math.OC])</title>
2827 <link>http://fr.arxiv.org/abs/2010.15438</link>
2828 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Niazi_M/0/1/0/all/0/1">Muhammad Umar B. Niazi</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Kibangou_A/0/1/0/all/0/1">Alain Kibangou</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Canudas_de_Wit_C/0/1/0/all/0/1">Carlos Canudas-de-Wit</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Nikitin_D/0/1/0/all/0/1">Denis Nikitin</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Tumash_L/0/1/0/all/0/1">Liudmila Tumash</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Bliman_P/0/1/0/all/0/1">Pierre-Alexandre Bliman</a></p>
2829
2830 <p>Testing for the infected cases is one of the most important mechanisms to
2831 control an epidemic. It enables to isolate the detected infected individuals,
2832 thereby limiting the disease transmission to the susceptible population.
2833 However, despite the significance of testing policies, the recent literature on
2834 the subject lacks a control-theoretic perspective. In this work, an epidemic
2835 model that incorporates the testing rate as a control input is presented. The
2836 proposed model differentiates the undetected infected from the detected
2837 infected cases, who are assumed to be removed from the disease spreading
2838 process in the population. First, the model is estimated and validated for
2839 COVID-19 data in France. Then, two testing policies are proposed, the so-called
2840 best-effort strategy for testing (BEST) and constant optimal strategy for
2841 testing (COST). The BEST policy is a suppression strategy that provides a lower
2842 bound on the testing rate such that the epidemic switches from a spreading to a
2843 non-spreading state. The COST policy is a mitigation strategy that provides an
2844 optimal value of testing rate that minimizes the peak value of the infected
2845 population when the total stockpile of tests is limited. Both testing policies
2846 are evaluated by predicting the number of active intensive care unit (ICU)
2847 cases and the cumulative number of deaths due to COVID-19.
2848 </p>
2849 </description>
2850 </item>
2851 <item>
2852 <title>FlatNet: Towards Photorealistic Scene Reconstruction from Lensless Measurements. (arXiv:2010.15440v1 [eess.IV])</title>
2853 <link>http://fr.arxiv.org/abs/2010.15440</link>
2854 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Khan_S/0/1/0/all/0/1">Salman S. Khan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sundar_V/0/1/0/all/0/1">Varun Sundar</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Boominathan_V/0/1/0/all/0/1">Vivek Boominathan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Veeraraghavan_A/0/1/0/all/0/1">Ashok Veeraraghavan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Mitra_K/0/1/0/all/0/1">Kaushik Mitra</a></p>
2855
2856 <p>Lensless imaging has emerged as a potential solution towards realizing
2857 ultra-miniature cameras by eschewing the bulky lens in a traditional camera.
2858 Without a focusing lens, the lensless cameras rely on computational algorithms
2859 to recover the scenes from multiplexed measurements. However, the current
2860 iterative-optimization-based reconstruction algorithms produce noisier and
2861 perceptually poorer images. In this work, we propose a non-iterative deep
2862 learning based reconstruction approach that results in orders of magnitude
2863 improvement in image quality for lensless reconstructions. Our approach, called
2864 $\textit{FlatNet}$, lays down a framework for reconstructing high-quality
2865 photorealistic images from mask-based lensless cameras, where the camera's
2866 forward model formulation is known. FlatNet consists of two stages: (1) an
2867 inversion stage that maps the measurement into a space of intermediate
2868 reconstruction by learning parameters within the forward model formulation, and
2869 (2) a perceptual enhancement stage that improves the perceptual quality of this
2870 intermediate reconstruction. These stages are trained together in an end-to-end
2871 manner. We show high-quality reconstructions by performing extensive
2872 experiments on real and challenging scenes using two different types of
2873 lensless prototypes: one which uses a separable forward model and another,
2874 which uses a more general non-separable cropped-convolution model. Our
2875 end-to-end approach is fast, produces photorealistic reconstructions, and is
2876 easy to adopt for other mask-based lensless cameras.
2877 </p>
2878 </description>
2879 </item>
2880 <item>
2881 <title>Self-awareness in intelligent vehicles: Feature based dynamic Bayesian models for abnormality detection. (arXiv:2010.15441v1 [cs.LG])</title>
2882 <link>http://fr.arxiv.org/abs/2010.15441</link>
2883 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kanapram_D/0/1/0/all/0/1">Divya Thekke Kanapram</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marin_Plaza_P/0/1/0/all/0/1">Pablo Marin-Plaza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marcenaro_L/0/1/0/all/0/1">Lucio Marcenaro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Martin_D/0/1/0/all/0/1">David Martin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Escalera_A/0/1/0/all/0/1">Arturo de la Escalera</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Regazzoni_C/0/1/0/all/0/1">Carlo Regazzoni</a></p>
2884
2885 <p>The evolution of Intelligent Transportation Systems in recent times
2886 necessitates the development of self-awareness in agents. Before the intensive
2887 use of Machine Learning, the detection of abnormalities was manually programmed
2888 by checking every variable and creating huge nested conditions that are very
2889 difficult to track. This paper aims to introduce a novel method to develop
2890 self-awareness in autonomous vehicles that mainly focuses on detecting abnormal
2891 situations around the considered agents. Multi-sensory time-series data from
2892 the vehicles are used to develop the data-driven Dynamic Bayesian Network (DBN)
2893 models used for future state prediction and the detection of dynamic
2894 abnormalities. Moreover, an initial level collective awareness model that can
2895 perform joint anomaly detection in co-operative tasks is proposed. The GNG
2896 algorithm learns the DBN models' discrete node variables; probabilistic
2897 transition links connect the node variables. A Markov Jump Particle Filter
2898 (MJPF) is applied to predict future states and detect when the vehicle is
2899 potentially misbehaving using learned DBNs as filter parameters. In this paper,
2900 datasets from real experiments of autonomous vehicles performing various tasks
2901 used to learn and test a set of switching DBN models.
2902 </p>
2903 </description>
2904 </item>
2905 <item>
2906 <title>Advanced Python Performance Monitoring with Score-P. (arXiv:2010.15444v1 [cs.DC])</title>
2907 <link>http://fr.arxiv.org/abs/2010.15444</link>
2908 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gocht_A/0/1/0/all/0/1">Andreas Gocht</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schone_R/0/1/0/all/0/1">Robert Sch&#xf6;ne</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frenzel_J/0/1/0/all/0/1">Jan Frenzel</a></p>
2909
2910 <p>Within the last years, Python became more prominent in the scientific
2911 community and is now used for simulations, machine learning, and data analysis.
2912 All these tasks profit from additional compute power offered by parallelism and
2913 offloading. In the domain of High Performance Computing (HPC), we can look back
2914 to decades of experience exploiting different levels of parallelism on the
2915 core, node or inter-node level, as well as utilising accelerators. By using
2916 performance analysis tools to investigate all these levels of parallelism, we
2917 can tune applications for unprecedented performance. Unfortunately, standard
2918 Python performance analysis tools cannot cope with highly parallel programs.
2919 Since the development of such software is complex and error-prone, we
2920 demonstrate an easy-to-use solution based on an existing tool infrastructure
2921 for performance analysis. In this paper, we describe how to apply the
2922 established instrumentation framework \scorep to trace Python applications. We
2923 finish with a study of the overhead that users can expect for instrumenting
2924 their applications.
2925 </p>
2926 </description>
2927 </item>
2928 <item>
2929 <title>Progressive Voice Trigger Detection: Accuracy vs Latency. (arXiv:2010.15446v1 [eess.AS])</title>
2930 <link>http://fr.arxiv.org/abs/2010.15446</link>
2931 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Sigtia_S/0/1/0/all/0/1">Siddharth Sigtia</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bridle_J/0/1/0/all/0/1">John Bridle</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Richards_H/0/1/0/all/0/1">Hywel Richards</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Clark_P/0/1/0/all/0/1">Pascal Clark</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Marchi_E/0/1/0/all/0/1">Erik Marchi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Garg_V/0/1/0/all/0/1">Vineet Garg</a></p>
2932
2933 <p>We present an architecture for voice trigger detection for virtual
2934 assistants. The main idea in this work is to exploit information in words that
2935 immediately follow the trigger phrase. We first demonstrate that by including
2936 more audio context after a detected trigger phrase, we can indeed get a more
2937 accurate decision. However, waiting to listen to more audio each time incurs a
2938 latency increase. Progressive Voice Trigger Detection allows us to trade-off
2939 latency and accuracy by accepting clear trigger candidates quickly, but waiting
2940 for more context to decide whether to accept more marginal examples. Using a
2941 two-stage architecture, we show that by delaying the decision for just 3% of
2942 detected true triggers in the test set, we are able to obtain a relative
2943 improvement of 66% in false rejection rate, while incurring only a negligible
2944 increase in latency.
2945 </p>
2946 </description>
2947 </item>
2948 <item>
2949 <title>Capacity-achieving codes: a review on double transitivity. (arXiv:2010.15453v1 [cs.IT])</title>
2950 <link>http://fr.arxiv.org/abs/2010.15453</link>
2951 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ivanov_K/0/1/0/all/0/1">Kirill Ivanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Urbanke_R/0/1/0/all/0/1">R&#xfc;diger L. Urbanke</a></p>
2952
2953 <p>Recently it was proved that if a linear code is invariant under the action of
2954 a doubly transitive permutation group, it achieves the capacity of erasure
2955 channel. Therefore, it is of sufficient interest to classify all codes,
2956 invariant under such permutation groups. We take a step in this direction and
2957 give a review of all suitable groups and the known results on codes invariant
2958 under these groups. It turns out that there are capacity-achieving families of
2959 algebraic geometric codes.
2960 </p>
2961 </description>
2962 </item>
2963 <item>
2964 <title>Scalable Federated Learning over Passive Optical Networks. (arXiv:2010.15454v1 [cs.NI])</title>
2965 <link>http://fr.arxiv.org/abs/2010.15454</link>
2966 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_J/0/1/0/all/0/1">Jun Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_L/0/1/0/all/0/1">Lei Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_J/0/1/0/all/0/1">Jiajia Chen</a></p>
2967
2968 <p>Two-step aggregation is introduced to facilitate scalable federated learning
2969 (SFL) over passive optical networks (PONs). Results reveal that the SFL keeps
2970 the required PON upstream bandwidth constant regardless of the number of
2971 involved clients, while bringing ~10% learning accuracy improvement.
2972 </p>
2973 </description>
2974 </item>
2975 <item>
2976 <title>Optimal Sharing and and Fair Cost Allocation of Community Energy Storage. (arXiv:2010.15455v1 [cs.GT])</title>
2977 <link>http://fr.arxiv.org/abs/2010.15455</link>
2978 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_Y/0/1/0/all/0/1">Yu Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_G/0/1/0/all/0/1">Guoqiang Hu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Spanos_C/0/1/0/all/0/1">Costas J. Spanos</a></p>
2979
2980 <p>This paper studies an ES sharing model where multiple buildings cooperatively
2981 invest and share a community ES (CES) to harness economic benefits from on-site
2982 renewable integration and utility price arbitrage. Particularly, we formulate
2983 the problem that integrates the optimal ES sizing, operation and cost
2984 allocation as a coalition game, which are generally addressed separately in the
2985 literature. Particularly, we address the fair ex-post cost allocation which has
2986 not been well studied. To overcome the computational challenge of computing the
2987 entire information of explicit characteristic functions that takes exponential
2988 time, we propose a fair cost allocation based on nucleolus by employing a
2989 constraints generation technique. We study the fairness and computational
2990 efficiency of the method through a number of case studies. The numeric results
2991 imply that the proposed method outperforms the Shapley approach and
2992 proportional method either in computational efficiency or fairness. Notably,
2993 for the proposed method, only a small fraction of characteristic functions
2994 (2.54%) is computed to achieve the cost allocation versus the entire
2995 information required by Shapley approach. With the proposed cost allocation, we
2996 investigate the enhanced economic benefits of the CES model for individual
2997 buildings over individual ES (IES) installation. We see the CES model provides
2998 higher cost reduction to each committed buildings. Moreover, the value of
2999 storage is obviously improved (about 1.83 times) with the CES model over the
3000 IES model.
3001 </p>
3002 </description>
3003 </item>
3004 <item>
3005 <title>Multilayer Clustered Graph Learning. (arXiv:2010.15456v1 [cs.LG])</title>
3006 <link>http://fr.arxiv.org/abs/2010.15456</link>
3007 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gheche_M/0/1/0/all/0/1">Mireille El Gheche</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frossard_P/0/1/0/all/0/1">Pascal Frossard</a></p>
3008
3009 <p>Multilayer graphs are appealing mathematical tools for modeling multiple
3010 types of relationship in the data. In this paper, we aim at analyzing
3011 multilayer graphs by properly combining the information provided by individual
3012 layers, while preserving the specific structure that allows us to eventually
3013 identify communities or clusters that are crucial in the analysis of graph
3014 data. To do so, we learn a clustered representative graph by solving an
3015 optimization problem that involves a data fidelity term to the observed layers,
3016 and a regularization pushing for a sparse and community-aware graph. We use the
3017 contrastive loss as a data fidelity term, in order to properly aggregate the
3018 observed layers into a representative graph. The regularization is based on a
3019 measure of graph sparsification called "effective resistance", coupled with a
3020 penalization of the first few eigenvalues of the representative graph Laplacian
3021 matrix to favor the formation of communities. The proposed optimization problem
3022 is nonconvex but fully differentiable, and thus can be solved via the projected
3023 gradient method. Experiments show that our method leads to a significant
3024 improvement w.r.t. state-of-the-art multilayer graph learning algorithms for
3025 solving clustering problems.
3026 </p>
3027 </description>
3028 </item>
3029 <item>
3030 <title>FiGLearn: Filter and Graph Learning using Optimal Transport. (arXiv:2010.15457v1 [cs.LG])</title>
3031 <link>http://fr.arxiv.org/abs/2010.15457</link>
3032 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Minder_M/0/1/0/all/0/1">Matthias Minder</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farsijani_Z/0/1/0/all/0/1">Zahra Farsijani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shah_D/0/1/0/all/0/1">Dhruti Shah</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gheche_M/0/1/0/all/0/1">Mireille El Gheche</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frossard_P/0/1/0/all/0/1">Pascal Frossard</a></p>
3033
3034 <p>In many applications, a dataset can be considered as a set of observed
3035 signals that live on an unknown underlying graph structure. Some of these
3036 signals may be seen as white noise that has been filtered on the graph topology
3037 by a graph filter. Hence, the knowledge of the filter and the graph provides
3038 valuable information about the underlying data generation process and the
3039 complex interactions that arise in the dataset. We hence introduce a novel
3040 graph signal processing framework for jointly learning the graph and its
3041 generating filter from signal observations. We cast a new optimisation problem
3042 that minimises the Wasserstein distance between the distribution of the signal
3043 observations and the filtered signal distribution model. Our proposed method
3044 outperforms state-of-the-art graph learning frameworks on synthetic data. We
3045 then apply our method to a temperature anomaly dataset, and further show how
3046 this framework can be used to infer missing values if only very little
3047 information is available.
3048 </p>
3049 </description>
3050 </item>
3051 <item>
3052 <title>Named Entity Recognition for Social Media Texts with Semantic Augmentation. (arXiv:2010.15458v1 [cs.CL])</title>
3053 <link>http://fr.arxiv.org/abs/2010.15458</link>
3054 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nie_Y/0/1/0/all/0/1">Yuyang Nie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_Y/0/1/0/all/0/1">Yuanhe Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wan_X/0/1/0/all/0/1">Xiang Wan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Song_Y/0/1/0/all/0/1">Yan Song</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dai_B/0/1/0/all/0/1">Bo Dai</a></p>
3055
3056 <p>Existing approaches for named entity recognition suffer from data sparsity
3057 problems when conducted on short and informal texts, especially user-generated
3058 social media content. Semantic augmentation is a potential way to alleviate
3059 this problem. Given that rich semantic information is implicitly preserved in
3060 pre-trained word embeddings, they are potential ideal resources for semantic
3061 augmentation. In this paper, we propose a neural-based approach to NER for
3062 social media texts where both local (from running text) and augmented semantics
3063 are taken into account. In particular, we obtain the augmented semantic
3064 information from a large-scale corpus, and propose an attentive semantic
3065 augmentation module and a gate module to encode and aggregate such information,
3066 respectively. Extensive experiments are performed on three benchmark datasets
3067 collected from English and Chinese social media platforms, where the results
3068 demonstrate the superiority of our approach to previous studies across all
3069 three datasets.
3070 </p>
3071 </description>
3072 </item>
3073 <item>
3074 <title>Concatenated Codes for Recovery From Multiple Reads of DNA Sequences. (arXiv:2010.15461v1 [cs.IT])</title>
3075 <link>http://fr.arxiv.org/abs/2010.15461</link>
3076 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lenz_A/0/1/0/all/0/1">Andreas Lenz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maarouf_I/0/1/0/all/0/1">Issam Maarouf</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Welter_L/0/1/0/all/0/1">Lorenz Welter</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wachter_Zeh_A/0/1/0/all/0/1">Antonia Wachter-Zeh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rosnes_E/0/1/0/all/0/1">Eirik Rosnes</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Amat_A/0/1/0/all/0/1">Alexandre Graell i Amat</a></p>
3077
3078 <p>Decoding sequences that stem from multiple transmissions of a codeword over
3079 an insertion, deletion, and substitution channel is a critical component of
3080 efficient deoxyribonucleic acid (DNA) data storage systems. In this paper, we
3081 consider a concatenated coding scheme with an outer low-density parity-check
3082 code and either an inner convolutional code or a block code. We propose two new
3083 decoding algorithms for inference from multiple received sequences, both
3084 combining the inner code and channel to a joint hidden Markov model to infer
3085 symbolwise a posteriori probabilities (APPs). The first decoder computes the
3086 exact APPs by jointly decoding the received sequences, whereas the second
3087 decoder approximates the APPs by combining the results of separately decoded
3088 received sequences. Using the proposed algorithms, we evaluate the performance
3089 of decoding multiple received sequences by means of achievable information
3090 rates and Monte-Carlo simulations. We show significant performance gains
3091 compared to a single received sequence.
3092 </p>
3093 </description>
3094 </item>
3095 <item>
3096 <title>Self-Supervised Video Representation Using Pretext-Contrastive Learning. (arXiv:2010.15464v1 [cs.CV])</title>
3097 <link>http://fr.arxiv.org/abs/2010.15464</link>
3098 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tao_L/0/1/0/all/0/1">Li Tao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_X/0/1/0/all/0/1">Xueting Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yamasaki_T/0/1/0/all/0/1">Toshihiko Yamasaki</a></p>
3099
3100 <p>Pretext tasks and contrastive learning have been successful in
3101 self-supervised learning for video retrieval and recognition. In this study, we
3102 analyze their optimization targets and utilize the hyper-sphere feature space
3103 to explore the connections between them, indicating the compatibility and
3104 consistency of these two different learning methods. Based on the analysis, we
3105 propose a self-supervised training method, referred as Pretext-Contrastive
3106 Learning (PCL), to learn video representations. Extensive experiments based on
3107 different combinations of pretext task baselines and contrastive losses confirm
3108 the strong agreement with their self-supervised learning targets, demonstrating
3109 the effectiveness and the generality of PCL. The combination of pretext tasks
3110 and contrastive losses showed significant improvements in both video retrieval
3111 and recognition over the corresponding baselines. And we can also outperform
3112 current state-of-the-art methods in the same manner. Further, our PCL is
3113 flexible and can be applied to almost all existing pretext task methods.
3114 </p>
3115 </description>
3116 </item>
3117 <item>
3118 <title>Improving Named Entity Recognition with Attentive Ensemble of Syntactic Information. (arXiv:2010.15466v1 [cs.CL])</title>
3119 <link>http://fr.arxiv.org/abs/2010.15466</link>
3120 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nie_Y/0/1/0/all/0/1">Yuyang Nie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_Y/0/1/0/all/0/1">Yuanhe Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Song_Y/0/1/0/all/0/1">Yan Song</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ao_X/0/1/0/all/0/1">Xiang Ao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wan_X/0/1/0/all/0/1">Xiang Wan</a></p>
3121
3122 <p>Named entity recognition (NER) is highly sensitive to sentential syntactic
3123 and semantic properties where entities may be extracted according to how they
3124 are used and placed in the running text. To model such properties, one could
3125 rely on existing resources to providing helpful knowledge to the NER task; some
3126 existing studies proved the effectiveness of doing so, and yet are limited in
3127 appropriately leveraging the knowledge such as distinguishing the important
3128 ones for particular context. In this paper, we improve NER by leveraging
3129 different types of syntactic information through attentive ensemble, which
3130 functionalizes by the proposed key-value memory networks, syntax attention, and
3131 the gate mechanism for encoding, weighting and aggregating such syntactic
3132 information, respectively. Experimental results on six English and Chinese
3133 benchmark datasets suggest the effectiveness of the proposed model and show
3134 that it outperforms previous studies on all experiment datasets.
3135 </p>
3136 </description>
3137 </item>
3138 <item>
3139 <title>Emergence of Spatial Coordinates via Exploration. (arXiv:2010.15469v1 [cs.LG])</title>
3140 <link>http://fr.arxiv.org/abs/2010.15469</link>
3141 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Laflaquiere_A/0/1/0/all/0/1">Alban Laflaqui&#xe8;re</a></p>
3142
3143 <p>Spatial knowledge is a fundamental building block for the development of
3144 advanced perceptive and cognitive abilities. Traditionally, in robotics, the
3145 Euclidean (x,y,z) coordinate system and the agent's forward model are defined a
3146 priori. We show that a naive agent can autonomously build an internal
3147 coordinate system, with the same dimension and metric regularity as the
3148 external space, simply by learning to predict the outcome of sensorimotor
3149 transitions in a self-supervised way.
3150 </p>
3151 </description>
3152 </item>
3153 <item>
3154 <title>Hybrid mimetic finite-difference and virtual element formulation for coupled poromechanics. (arXiv:2010.15470v1 [math.NA])</title>
3155 <link>http://fr.arxiv.org/abs/2010.15470</link>
3156 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Borio_A/0/1/0/all/0/1">Andrea Borio</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Hamon_F/0/1/0/all/0/1">Fran&#xe7;ois Hamon</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Castelletto_N/0/1/0/all/0/1">Nicola Castelletto</a>, <a href="http://fr.arxiv.org/find/math/1/au:+White_J/0/1/0/all/0/1">Joshua A. White</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Settgast_R/0/1/0/all/0/1">Randolph R. Settgast</a></p>
3157
3158 <p>We present a hybrid mimetic finite-difference and virtual element formulation
3159 for coupled single-phase poromechanics on unstructured meshes. The key
3160 advantage of the scheme is that it is convergent on complex meshes containing
3161 highly distorted cells with arbitrary shapes. We use a local pressure-jump
3162 stabilization method based on unstructured macro-elements to prevent the
3163 development of spurious pressure modes in incompressible problems approaching
3164 undrained conditions. A scalable linear solution strategy is obtained using a
3165 block-triangular preconditioner designed specifically for the saddle-point
3166 systems arising from the proposed discretization. The accuracy and efficiency
3167 of our approach are demonstrated numerically on two-dimensional benchmark
3168 problems.
3169 </p>
3170 </description>
3171 </item>
3172 <item>
3173 <title>Iteratively reweighted greedy set cover. (arXiv:2010.15476v1 [cs.DS])</title>
3174 <link>http://fr.arxiv.org/abs/2010.15476</link>
3175 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Alexa_M/0/1/0/all/0/1">Marc Alexa</a></p>
3176
3177 <p>We empirically analyze a simple heuristic for large sparse set cover
3178 problems. It uses the weighted greedy algorithm as a basic building block. By
3179 multiplicative updates of the weights attached to the elements, the greedy
3180 solution is iteratively improved. The implementation of this algorithm is
3181 trivial and the algorithm is essentially free of parameters that would require
3182 tuning. More iterations can only improve the solution. This set of features
3183 makes the approach attractive for practical problems.
3184 </p>
3185 </description>
3186 </item>
3187 <item>
3188 <title>Learned infinite elements. (arXiv:2010.15479v1 [math.NA])</title>
3189 <link>http://fr.arxiv.org/abs/2010.15479</link>
3190 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Hohage_T/0/1/0/all/0/1">Thorsten Hohage</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lehrenfeld_C/0/1/0/all/0/1">Christoph Lehrenfeld</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Preuss_J/0/1/0/all/0/1">Janosch Preuss</a></p>
3191
3192 <p>We study the numerical solution of scalar time-harmonic wave equations on
3193 unbounded domains which can be split into a bounded interior domain of primary
3194 interest and an exterior domain with separable geometry. To compute the
3195 solution in the interior domain, approximations to the Dirichlet-to-Neumann
3196 (DtN) map of the exterior domain have to be imposed as transparent boundary
3197 conditions on the artificial coupling boundary. Although the DtN map can be
3198 computed by separation of variables, it is a nonlocal operator with dense
3199 matrix representations, and hence computationally inefficient. Therefore,
3200 approximations of DtN maps by sparse matrices, usually involving additional
3201 degrees of freedom, have been studied intensively in the literature using a
3202 variety of approaches including different types of infinite elements, local
3203 non-reflecting boundary conditions, and perfectly matched layers. The entries
3204 of these sparse matrices are derived analytically, e.g. from transformations or
3205 asymptotic expansions of solutions to the differential equation in the exterior
3206 domain. In contrast, in this paper we propose to `learn' the matrix entries
3207 from the DtN map in its separated form by solving an optimization problem as a
3208 preprocessing step. Theoretical considerations suggest that the approximation
3209 quality of learned infinite elements improves exponentially with increasing
3210 number of infinite element degrees of freedom, which is confirmed in numerical
3211 experiments. These numerical studies also show that learned infinite elements
3212 outperform state-of-the-art methods for the Helmholtz equation. At the same
3213 time, learned infinite elements are much more flexible than traditional methods
3214 as they, e.g., work similarly well for exterior domains involving strong
3215 reflections, for example, for the atmosphere of the Sun, which is strongly
3216 inhomogeneous and exhibits reflections at the corona.
3217 </p>
3218 </description>
3219 </item>
3220 <item>
3221 <title>Convergence of Constrained Anderson Acceleration. (arXiv:2010.15482v1 [math.NA])</title>
3222 <link>http://fr.arxiv.org/abs/2010.15482</link>
3223 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Barre_M/0/1/0/all/0/1">Mathieu Barr&#xe9;</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Taylor_A/0/1/0/all/0/1">Adrien Taylor</a>, <a href="http://fr.arxiv.org/find/math/1/au:+dAspremont_A/0/1/0/all/0/1">Alexandre d&#x27;Aspremont</a></p>
3224
3225 <p>We prove non asymptotic linear convergence rates for the constrained Anderson
3226 acceleration extrapolation scheme. These guarantees come from new upper bounds
3227 on the constrained Chebyshev problem, which consists in minimizing the maximum
3228 absolute value of a polynomial on a bounded real interval with $l_1$
3229 constraints on its coefficients vector. Constrained Anderson Acceleration has a
3230 numerical cost comparable to that of the original scheme.
3231 </p>
3232 </description>
3233 </item>
3234 <item>
3235 <title>Beyond cross-entropy: learning highly separable feature distributions for robust and accurate classification. (arXiv:2010.15487v1 [cs.CV])</title>
3236 <link>http://fr.arxiv.org/abs/2010.15487</link>
3237 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ali_A/0/1/0/all/0/1">Arslan Ali</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Migliorati_A/0/1/0/all/0/1">Andrea Migliorati</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bianchi_T/0/1/0/all/0/1">Tiziano Bianchi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Magli_E/0/1/0/all/0/1">Enrico Magli</a></p>
3238
3239 <p>Deep learning has shown outstanding performance in several applications
3240 including image classification. However, deep classifiers are known to be
3241 highly vulnerable to adversarial attacks, in that a minor perturbation of the
3242 input can easily lead to an error. Providing robustness to adversarial attacks
3243 is a very challenging task especially in problems involving a large number of
3244 classes, as it typically comes at the expense of an accuracy decrease. In this
3245 work, we propose the Gaussian class-conditional simplex (GCCS) loss: a novel
3246 approach for training deep robust multiclass classifiers that provides
3247 adversarial robustness while at the same time achieving or even surpassing the
3248 classification accuracy of state-of-the-art methods. Differently from other
3249 frameworks, the proposed method learns a mapping of the input classes onto
3250 target distributions in a latent space such that the classes are linearly
3251 separable. Instead of maximizing the likelihood of target labels for individual
3252 samples, our objective function pushes the network to produce feature
3253 distributions yielding high inter-class separation. The mean values of the
3254 distributions are centered on the vertices of a simplex such that each class is
3255 at the same distance from every other class. We show that the regularization of
3256 the latent space based on our approach yields excellent classification accuracy
3257 and inherently provides robustness to multiple adversarial attacks, both
3258 targeted and untargeted, outperforming state-of-the-art approaches over
3259 challenging datasets.
3260 </p>
3261 </description>
3262 </item>
3263 <item>
3264 <title>Linearizing Combinators. (arXiv:2010.15490v1 [math.CT])</title>
3265 <link>http://fr.arxiv.org/abs/2010.15490</link>
3266 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Cockett_R/0/1/0/all/0/1">Robin Cockett</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lemay_J/0/1/0/all/0/1">Jean-Simon Pacaud Lemay</a></p>
3267
3268 <p>In 2017, Bauer, Johnson, Osborne, Riehl, and Tebbe (BJORT) showed that the
3269 Abelian functor calculus provides an example of a Cartesian differential
3270 category. The definition of a Cartesian differential category is based on a
3271 differential combinator which directly formalizes the total derivative from
3272 multivariable calculus. However, in the aforementioned work the authors used
3273 techniques from Goodwillie's functor calculus to establish a linearization
3274 process from which they then derived a differential combinator. This raised the
3275 question of what the precise relationship between linearization and having a
3276 differential combinator might be.
3277 </p>
3278 <p>In this paper, we introduce the notion of a linearizing combinator which
3279 abstracts linearization in the Abelian functor calculus. We then use it to
3280 provide an alternative axiomatization of a Cartesian differential category.
3281 Every Cartesian differential category comes equipped with a canonical
3282 linearizing combinator obtained by differentiation at zero. Conversely, a
3283 differential combinator can be constructed \`a la BJORT when one has a system
3284 of partial linearizing combinators in each context. Thus, while linearizing
3285 combinators do provide an alternative axiomatization of Cartesian differential
3286 categories, an explicit notion of partial linearization is required. This is in
3287 contrast to the situation for differential combinators where partial
3288 differentiation is automatic in the presence of total differentiation. The
3289 ability to form a system of partial linearizing combinators from a total
3290 linearizing combinator, while not being possible in general, is possible when
3291 the setting is Cartesian closed.
3292 </p>
3293 </description>
3294 </item>
3295 <item>
3296 <title>A Novel Fast 3D Single Image Super-Resolution Algorithm. (arXiv:2010.15491v1 [eess.IV])</title>
3297 <link>http://fr.arxiv.org/abs/2010.15491</link>
3298 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Tuador_N/0/1/0/all/0/1">Nwigbo Kenule Tuador</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pham_D/0/1/0/all/0/1">Duong Hung Pham</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Michetti_J/0/1/0/all/0/1">J&#xe9;r&#xf4;me Michetti</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Basarab_A/0/1/0/all/0/1">Adrian Basarab</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kouame_D/0/1/0/all/0/1">Denis Kouam&#xe9;</a></p>
3299
3300 <p>This paper introduces a novel computationally efficient method of solving the
3301 3D single image super-resolution (SR) problem, i.e., reconstruction of a
3302 high-resolution volume from its low-resolution counterpart. The main
3303 contribution lies in the original way of handling simultaneously the associated
3304 decimation and blurring operators, based on their underlying properties in the
3305 frequency domain. In particular, the proposed decomposition technique of the 3D
3306 decimation operator allows a straightforward implementation for Tikhonov
3307 regularization, and can be further used to take into consideration other
3308 regularization functions such as the total variation, enabling the
3309 computational cost of state-of-the-art algorithms to be considerably decreased.
3310 Numerical experiments carried out showed that the proposed approach outperforms
3311 existing 3D SR methods.
3312 </p>
3313 </description>
3314 </item>
3315 <item>
3316 <title>"What, not how" -- Solving an under-actuated insertion task from scratch. (arXiv:2010.15492v1 [cs.RO])</title>
3317 <link>http://fr.arxiv.org/abs/2010.15492</link>
3318 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Vezzani_G/0/1/0/all/0/1">Giulia Vezzani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Neunert_M/0/1/0/all/0/1">Michael Neunert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wulfmeier_M/0/1/0/all/0/1">Markus Wulfmeier</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jeong_R/0/1/0/all/0/1">Rae Jeong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lampe_T/0/1/0/all/0/1">Thomas Lampe</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Siegel_N/0/1/0/all/0/1">Noah Siegel</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hafner_R/0/1/0/all/0/1">Roland Hafner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Abdolmaleki_A/0/1/0/all/0/1">Abbas Abdolmaleki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Riedmiller_M/0/1/0/all/0/1">Martin Riedmiller</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nori_F/0/1/0/all/0/1">Francesco Nori</a></p>
3319
3320 <p>Robot manipulation requires a complex set of skills that need to be carefully
3321 combined and coordinated to solve a task. Yet, most ReinforcementLearning (RL)
3322 approaches in robotics study tasks which actually consist only of a single
3323 manipulation skill, such as grasping an object or inserting a pre-grasped
3324 object. As a result the skill ('how' to solve the task) but not the actual goal
3325 of a complete manipulation ('what' to solve) is specified. In contrast, we
3326 study a complex manipulation goal that requires an agent to learn and combine
3327 diverse manipulation skills. We propose a challenging, highly under-actuated
3328 peg-in-hole task with a free, rotational asymmetrical peg, requiring a broad
3329 range of manipulation skills. While correct peg (re-)orientation is a
3330 requirement for successful insertion, there is no reward associated with it.
3331 Hence an agent needs to understand this pre-condition and learn the skill to
3332 fulfil it. The final insertion reward is sparse, allowing freedom in the
3333 solution and leading to complex emerging behaviour not envisioned during the
3334 task design. We tackle the problem in a multi-task RL framework using Scheduled
3335 Auxiliary Control (SAC-X) combined with Regularized Hierarchical Policy
3336 Optimization (RHPO) which successfully solves the task in simulation and from
3337 scratch on a single robot where data is severely limited.
3338 </p>
3339 </description>
3340 </item>
3341 <item>
3342 <title>Enhancing Vulnerable Road User Safety: A Survey of Existing Practices and Consideration for Using Mobile Devices for V2X Connections. (arXiv:2010.15502v1 [cs.NI])</title>
3343 <link>http://fr.arxiv.org/abs/2010.15502</link>
3344 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dasanayaka_N/0/1/0/all/0/1">Nishanthi Dasanayaka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hasan_K/0/1/0/all/0/1">Khondokar Fida Hasan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_C/0/1/0/all/0/1">Charles Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_Y/0/1/0/all/0/1">Yanming Feng</a></p>
3345
3346 <p>Vulnerable road users (VRUs) such as pedestrians, cyclists and motorcyclists
3347 are at the highest risk in the road traffic environment. Globally, over half of
3348 road traffic deaths are vulnerable road users. Although substantial efforts are
3349 being made to improve VRU safety from engineering solutions to law enforcement,
3350 the death toll of VRUs' continues to rise. The emerging technology, Cooperative
3351 Intelligent Transportation System (C-ITS), has the proven potential to enhance
3352 road safety by enabling wireless communication to exchange information among
3353 road users. Such exchanged information is utilized for creating situational
3354 awareness and detecting any potential collisions in advance to take necessary
3355 measures to avoid any possible road casualties. The current state-of-the-art
3356 solutions of C-ITS for VRU safety, however, are limited to unidirectional
3357 communication where VRUs are only responsible for alerting their presence to
3358 drivers with the intention of avoiding collisions. This one-way interaction is
3359 substantially limiting the enormous potential of C-ITS which otherwise can be
3360 employed to devise a more effective solution for the VRU safety where VRU can
3361 be equipped with bidirectional communication with full C-ITS functionalities.
3362 To address such problems and to explore better C-ITS solution suggestions for
3363 VRU, this paper reviewed and evaluated the current technologies and safety
3364 methods proposed for VRU safety over the period 2007-2020. Later, it presents
3365 the design considerations for a cellular-based Vehicle-to-VRU (V2VRU)
3366 communication system along with potential challenges of a cellular-based
3367 approach to provide necessary recommendations.
3368 </p>
3369 </description>
3370 </item>
3371 <item>
3372 <title>A stochastic $\theta$-SEIHRD model: adding randomness to the COVID-19 spread. (arXiv:2010.15504v1 [math.NA])</title>
3373 <link>http://fr.arxiv.org/abs/2010.15504</link>
3374 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Leitao_A/0/1/0/all/0/1">&#xc1;lvaro Leitao</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Vazquez_C/0/1/0/all/0/1">Carlos V&#xe1;zquez</a></p>
3375
3376 <p>In this article we mainly extend the deterministic model developed in [10] to
3377 a stochastic setting. More precisely, we incorporated randomness in some
3378 coefficients by assuming that they follow a prescribed stochastic dynamics. In
3379 this way, the model variables are now represented by stochastic process, that
3380 can be simulated by appropriately solve the system of stochastic differential
3381 equations. Thus, the model becomes more complete and flexible than the
3382 deterministic analogous, as it incorporates additional uncertainties which are
3383 present in more realistic situations. In particular, confidence intervals for
3384 the main variables and worst case scenarios can be computed.
3385 </p>
3386 </description>
3387 </item>
3388 <item>
3389 <title>Dynamic Formation Reshaping Based on Point Set Registration in a Swarm of Drones. (arXiv:2010.15506v1 [cs.RO])</title>
3390 <link>http://fr.arxiv.org/abs/2010.15506</link>
3391 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-Hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_M/0/1/0/all/0/1">Muhammad Mehboob Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3392
3393 <p>This work focuses on the formation reshaping in an optimized manner in
3394 autonomous swarm of drones. Here, the two main problems are: 1) how to break
3395 and reshape the initial formation in an optimal manner, and 2) how to do such
3396 reformation while minimizing the overall deviation of the drones and the
3397 overall time, i.e., without slowing down. To address the first problem, we
3398 introduce a set of routines for the drones/agents to follow while reshaping to
3399 a secondary formation shape. And the second problem is resolved by utilizing
3400 the temperature function reduction technique, originally used in the point set
3401 registration process. The goal is to be able to dynamically reform the shape of
3402 multi-agent based swarm in near-optimal manner while going through narrow
3403 openings between, for instance obstacles, and then bringing the agents back to
3404 their original shape after passing through the narrow passage using point set
3405 registration technique.
3406 </p>
3407 </description>
3408 </item>
3409 <item>
3410 <title>Dynamic Resource-aware Corner Detection for Bio-inspired Vision Sensors. (arXiv:2010.15507v1 [cs.CV])</title>
3411 <link>http://fr.arxiv.org/abs/2010.15507</link>
3412 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miele_A/0/1/0/all/0/1">Antonio Miele</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3413
3414 <p>Event-based cameras are vision devices that transmit only brightness changes
3415 with low latency and ultra-low power consumption. Such characteristics make
3416 event-based cameras attractive in the field of localization and object tracking
3417 in resource-constrained systems. Since the number of generated events in such
3418 cameras is huge, the selection and filtering of the incoming events are
3419 beneficial from both increasing the accuracy of the features and reducing the
3420 computational load. In this paper, we present an algorithm to detect
3421 asynchronous corners from a stream of events in real-time on embedded systems.
3422 The algorithm is called the Three Layer Filtering-Harris or TLF-Harris
3423 algorithm. The algorithm is based on an events' filtering strategy whose
3424 purpose is 1) to increase the accuracy by deliberately eliminating some
3425 incoming events, i.e., noise, and 2) to improve the real-time performance of
3426 the system, i.e., preserving a constant throughput in terms of input events per
3427 second, by discarding unnecessary events with a limited accuracy loss. An
3428 approximation of the Harris algorithm, in turn, is used to exploit its
3429 high-quality detection capability with a low-complexity implementation to
3430 enable seamless real-time performance on embedded computing platforms. The
3431 proposed algorithm is capable of selecting the best corner candidate among
3432 neighbors and achieves an average execution time savings of 59 % compared with
3433 the conventional Harris score. Moreover, our approach outperforms the competing
3434 methods, such as eFAST, eHarris, and FA-Harris, in terms of real-time
3435 performance, and surpasses Arc* in terms of accuracy.
3436 </p>
3437 </description>
3438 </item>
3439 <item>
3440 <title>FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement. (arXiv:2010.15508v1 [eess.AS])</title>
3441 <link>http://fr.arxiv.org/abs/2010.15508</link>
3442 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Horaud_R/0/1/0/all/0/1">Radu Horaud</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_X/0/1/0/all/0/1">Xiaofei Li</a></p>
3443
3444 <p>This paper proposes a full-band and sub-band fusion model, named as
3445 FullSubNet, for single-channel real-time speech enhancement. Full-band and
3446 sub-band refer to the models that input full-band and sub-band noisy spectral
3447 feature, output full-band and sub-band speech target, respectively. The
3448 sub-band model processes each frequency independently. Its input consists of
3449 one frequency and several context frequencies. The output is the prediction of
3450 the clean speech target for the corresponding frequency. These two types of
3451 models have distinct characteristics. The full-band model can capture the
3452 global spectral context and the long-distance cross-band dependencies. However,
3453 it lacks the ability to modeling signal stationarity and attending the local
3454 spectral pattern. The sub-band model is just the opposite. In our proposed
3455 FullSubNet, we connect a pure full-band model and a pure sub-band model
3456 sequentially and use practical joint training to integrate these two types of
3457 models' advantages. We conducted experiments on the DNS challenge (INTERSPEECH
3458 2020) dataset to evaluate the proposed method. Experimental results show that
3459 full-band and sub-band information are complementary, and the FullSubNet can
3460 effectively integrate them. Besides, the performance of the FullSubNet also
3461 exceeds that of the top-ranked methods in the DNS Challenge (INTERSPEECH 2020).
3462 </p>
3463 </description>
3464 </item>
3465 <item>
3466 <title>Night vision obstacle detection and avoidance based on Bio-Inspired Vision Sensors. (arXiv:2010.15509v1 [cs.CV])</title>
3467 <link>http://fr.arxiv.org/abs/2010.15509</link>
3468 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_M/0/1/0/all/0/1">Muhammad Mehboob Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3469
3470 <p>Moving towards autonomy, unmanned vehicles rely heavily on state-of-the-art
3471 collision avoidance systems (CAS). However, the detection of obstacles
3472 especially during night-time is still a challenging task since the lighting
3473 conditions are not sufficient for traditional cameras to function properly.
3474 Therefore, we exploit the powerful attributes of event-based cameras to perform
3475 obstacle detection in low lighting conditions. Event cameras trigger events
3476 asynchronously at high output temporal rate with high dynamic range of up to
3477 120 $dB$. The algorithm filters background activity noise and extracts objects
3478 using robust Hough transform technique. The depth of each detected object is
3479 computed by triangulating 2D features extracted utilising LC-Harris. Finally,
3480 asynchronous adaptive collision avoidance (AACA) algorithm is applied for
3481 effective avoidance. Qualitative evaluation is compared using event-camera and
3482 traditional camera.
3483 </p>
3484 </description>
3485 </item>
3486 <item>
3487 <title>Asynchronous Corner Tracking Algorithm based on Lifetime of Events for DAVIS Cameras. (arXiv:2010.15510v1 [cs.CV])</title>
3488 <link>http://fr.arxiv.org/abs/2010.15510</link>
3489 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-Hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miele_A/0/1/0/all/0/1">Antonio Miele</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3490
3491 <p>Event cameras, i.e., the Dynamic and Active-pixel Vision Sensor (DAVIS) ones,
3492 capture the intensity changes in the scene and generates a stream of events in
3493 an asynchronous fashion. The output rate of such cameras can reach up to 10
3494 million events per second in high dynamic environments. DAVIS cameras use novel
3495 vision sensors that mimic human eyes. Their attractive attributes, such as high
3496 output rate, High Dynamic Range (HDR), and high pixel bandwidth, make them an
3497 ideal solution for applications that require high-frequency tracking. Moreover,
3498 applications that operate in challenging lighting scenarios can exploit the
3499 high HDR of event cameras, i.e., 140 dB compared to 60 dB of traditional
3500 cameras. In this paper, a novel asynchronous corner tracking method is proposed
3501 that uses both events and intensity images captured by a DAVIS camera. The
3502 Harris algorithm is used to extract features, i.e., frame-corners from
3503 keyframes, i.e., intensity images. Afterward, a matching algorithm is used to
3504 extract event-corners from the stream of events. Events are solely used to
3505 perform asynchronous tracking until the next keyframe is captured. Neighboring
3506 events, within a window size of 5x5 pixels around the event-corner, are used to
3507 calculate the velocity and direction of extracted event-corners by fitting the
3508 2D planar using a randomized Hough transform algorithm. Experimental evaluation
3509 showed that our approach is able to update the location of the extracted
3510 corners up to 100 times during the blind time of traditional cameras, i.e.,
3511 between two consecutive intensity images.
3512 </p>
3513 </description>
3514 </item>
3515 <item>
3516 <title>An Exact Solution Path Algorithm for SLOPE and Quasi-Spherical OSCAR. (arXiv:2010.15511v1 [stat.ME])</title>
3517 <link>http://fr.arxiv.org/abs/2010.15511</link>
3518 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Nomura_S/0/1/0/all/0/1">Shunichi Nomura</a></p>
3519
3520 <p>Sorted $L_1$ penalization estimator (SLOPE) is a regularization technique for
3521 sorted absolute coefficients in high-dimensional regression. By arbitrarily
3522 setting its regularization weights $\lambda$ under the monotonicity constraint,
3523 SLOPE can have various feature selection and clustering properties. On weight
3524 tuning, the selected features and their clusters are very sensitive to the
3525 tuning parameters. Moreover, the exhaustive tracking of their changes is
3526 difficult using grid search methods. This study presents a solution path
3527 algorithm that provides the complete and exact path of solutions for SLOPE in
3528 fine-tuning regularization weights. A simple optimality condition for SLOPE is
3529 derived and used to specify the next splitting point of the solution path. This
3530 study also proposes a new design of a regularization sequence $\lambda$ for
3531 feature clustering, which is called the quasi-spherical and octagonal shrinkage
3532 and clustering algorithm for regression (QS-OSCAR). QS-OSCAR is designed with a
3533 contour surface of the regularization terms most similar to a sphere. Among
3534 several regularization sequence designs, sparsity and clustering performance
3535 are compared through simulation studies. The numerical observations show that
3536 QS-OSCAR performs feature clustering more efficiently than other designs.
3537 </p>
3538 </description>
3539 </item>
3540 <item>
3541 <title>UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition. (arXiv:2010.15521v1 [eess.AS])</title>
3542 <link>http://fr.arxiv.org/abs/2010.15521</link>
3543 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_Z/0/1/0/all/0/1">Zhiyu Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_H/0/1/0/all/0/1">Hui Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Batushiren/0/1/0/all/0/1">Batushiren</a></p>
3544
3545 <p>Speech enhancement at extremely low signal-to-noise ratio (SNR) condition is
3546 a very challenging problem and rarely investigated in previous works. This
3547 paper proposes a robust speech enhancement approach (UNetGAN) based on U-Net
3548 and generative adversarial learning to deal with this problem. This approach
3549 consists of a generator network and a discriminator network, which operate
3550 directly in the time domain. The generator network adopts a U-Net like
3551 structure and employs dilated convolution in the bottleneck of it. We evaluate
3552 the performance of the UNetGAN at low SNR conditions (up to -20dB) on the
3553 public benchmark. The result demonstrates that it significantly improves the
3554 speech quality and substantially outperforms the representative deep learning
3555 models, including SEGAN, cGAN fo SE, Bidirectional LSTM using phase-sensitive
3556 spectrum approximation cost function (PSA-BLSTM) and Wave-U-Net regarding
3557 Short-Time Objective Intelligibility (STOI) and Perceptual evaluation of speech
3558 quality (PESQ).
3559 </p>
3560 </description>
3561 </item>
3562 <item>
3563 <title>A brief overview of swarm intelligence-based algorithms for numerical association rule mining. (arXiv:2010.15524v1 [cs.NE])</title>
3564 <link>http://fr.arxiv.org/abs/2010.15524</link>
3565 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fister_I/0/1/0/all/0/1">Iztok Fister Jr.</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fister_I/0/1/0/all/0/1">Iztok Fister</a></p>
3566
3567 <p>Numerical Association Rule Mining is a popular variant of Association Rule
3568 Mining, where numerical attributes are handled without discretization. This
3569 means that the algorithms for dealing with this problem can operate directly,
3570 not only with categorical, but also with numerical attributes. Until recently,
3571 a big portion of these algorithms were based on a stochastic nature-inspired
3572 population-based paradigm. As a result, evolutionary and swarm
3573 intelligence-based algorithms showed big efficiency for dealing with the
3574 problem. In line with this, the main mission of this chapter is to make a
3575 historical overview of swarm intelligence-based algorithms for Numerical
3576 Association Rule Mining, as well as to present the main features of these
3577 algorithms for the observed problem. A taxonomy of the algorithms was proposed
3578 on the basis of the applied features found in this overview. Challenges,
3579 waiting in the future, finish this paper.
3580 </p>
3581 </description>
3582 </item>
3583 <item>
3584 <title>Self-Learning Threshold-Based Load Balancing. (arXiv:2010.15525v1 [cs.PF])</title>
3585 <link>http://fr.arxiv.org/abs/2010.15525</link>
3586 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goldsztajn_D/0/1/0/all/0/1">Diego Goldsztajn</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Borst_S/0/1/0/all/0/1">Sem C. Borst</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Leeuwaarden_J/0/1/0/all/0/1">Johan S. H. van Leeuwaarden</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Mukherjee_D/0/1/0/all/0/1">Debankur Mukherjee</a> (3), <a href="http://fr.arxiv.org/find/cs/1/au:+Whiting_P/0/1/0/all/0/1">Philip A. Whiting</a> (4) ((1) Eindhoven University of Technology, (2) Tilburg University, (3) Georgia Institute of Technology, (4) Macquarie University)</p>
3587
3588 <p>We consider a large-scale service system where incoming tasks have to be
3589 instantaneously dispatched to one out of many parallel server pools. The
3590 dispatcher uses a threshold for balancing the load and keeping the maximum
3591 number of concurrent tasks across server pools low. We demonstrate that such a
3592 policy is optimal on the fluid and diffusion scales for a suitable threshold
3593 value, while only involving a small communication overhead. In order to set the
3594 threshold optimally, it is important, however, to learn the load of the system,
3595 which may be uncertain or even time-varying. For that purpose, we design a
3596 control rule for tuning the threshold in an online manner. We provide
3597 conditions which guarantee that this adaptive threshold settles at the optimal
3598 value, along with estimates for the time until this happens.
3599 </p>
3600 </description>
3601 </item>
3602 <item>
3603 <title>A comparison of automatic multi-tissue segmentation methods of the human fetal brain using the FeTA Dataset. (arXiv:2010.15526v1 [eess.IV])</title>
3604 <link>http://fr.arxiv.org/abs/2010.15526</link>
3605 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Payette_K/0/1/0/all/0/1">Kelly Payette</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Dumast_P/0/1/0/all/0/1">Priscille de Dumast</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kebiri_H/0/1/0/all/0/1">Hamza Kebiri</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ezhov_I/0/1/0/all/0/1">Ivan Ezhov</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Paetzold_J/0/1/0/all/0/1">Johannes C. Paetzold</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shit_S/0/1/0/all/0/1">Suprosanna Shit</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Iqbal_A/0/1/0/all/0/1">Asim Iqbal</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Khan_R/0/1/0/all/0/1">Romesa Khan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kottke_R/0/1/0/all/0/1">Raimund Kottke</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Grehten_P/0/1/0/all/0/1">Patrice Grehten</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ji_H/0/1/0/all/0/1">Hui Ji</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lanczi_L/0/1/0/all/0/1">Levente Lanczi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Nagy_M/0/1/0/all/0/1">Marianna Nagy</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Beresova_M/0/1/0/all/0/1">Monika Beresova</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Nguyen_T/0/1/0/all/0/1">Thi Dao Nguyen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Natalucci_G/0/1/0/all/0/1">Giancarlo Natalucci</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Karayannis_T/0/1/0/all/0/1">Theofanis Karayannis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Menze_B/0/1/0/all/0/1">Bjoern Menze</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Cuadra_M/0/1/0/all/0/1">Meritxell Bach Cuadra</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jakab_A/0/1/0/all/0/1">Andras Jakab</a></p>
3606
3607 <p>It is critical to quantitatively analyse the developing human fetal brain in
3608 order to fully understand neurodevelopment in both normal fetuses and those
3609 with congenital disorders. To facilitate this analysis, automatic multi-tissue
3610 fetal brain segmentation algorithms are needed, which in turn requires open
3611 databases of segmented fetal brains. Here we introduce a publicly available
3612 database of 50 manually segmented pathological and non-pathological fetal
3613 magnetic resonance brain volume reconstructions across a range of gestational
3614 ages (20 to 33 weeks) into 7 different tissue categories (external
3615 cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, deep
3616 grey matter, brainstem/spinal cord). In addition, we quantitatively evaluate
3617 the accuracy of several automatic multi-tissue segmentation algorithms of the
3618 developing human fetal brain. Four research groups participated, submitting a
3619 total of 10 algorithms, demonstrating the benefits the database for the
3620 development of automatic algorithms.
3621 </p>
3622 </description>
3623 </item>
3624 <item>
3625 <title>On the robustness of kernel-based pairwise learning. (arXiv:2010.15527v1 [stat.ML])</title>
3626 <link>http://fr.arxiv.org/abs/2010.15527</link>
3627 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Gensler_P/0/1/0/all/0/1">Patrick Gensler</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Christmann_A/0/1/0/all/0/1">Andreas Christmann</a></p>
3628
3629 <p>It is shown that many results on the statistical robustness of kernel-based
3630 pairwise learning can be derived under basically no assumptions on the input
3631 and output spaces. In particular neither moment conditions on the conditional
3632 distribution of Y given X = x nor the boundedness of the output space is
3633 needed. We obtain results on the existence and boundedness of the influence
3634 function and show qualitative robustness of the kernel-based estimator. The
3635 present paper generalizes results by Christmann and Zhou (2016) by allowing the
3636 prediction function to take two arguments and can thus be applied in a variety
3637 of situations such as ranking.
3638 </p>
3639 </description>
3640 </item>
3641 <item>
3642 <title>An End to End Network Architecture for Fundamental Matrix Estimation. (arXiv:2010.15528v1 [cs.CV])</title>
3643 <link>http://fr.arxiv.org/abs/2010.15528</link>
3644 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yesheng Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_X/0/1/0/all/0/1">Xu Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_D/0/1/0/all/0/1">Dahong Qian</a></p>
3645
3646 <p>In this paper, we present a novel end-to-end network architecture to estimate
3647 fundamental matrix directly from stereo images. To establish a complete working
3648 pipeline, different deep neural networks in charge of finding correspondences
3649 in images, performing outlier rejection and calculating fundamental matrix, are
3650 integrated into an end-to-end network architecture.
3651 </p>
3652 <p>To well train the network and preserve geometry properties of fundamental
3653 matrix, a new loss function is introduced. To evaluate the accuracy of
3654 estimated fundamental matrix more reasonably, we design a new evaluation metric
3655 which is highly consistent with visualization result. Experiments conducted on
3656 both outdoor and indoor data-sets show that this network outperforms
3657 traditional methods as well as previous deep learning based methods on various
3658 metrics and achieves significant performance improvements.
3659 </p>
3660 </description>
3661 </item>
3662 <item>
3663 <title>Probabilistic interval predictor based on dissimilarity functions. (arXiv:2010.15530v1 [eess.SY])</title>
3664 <link>http://fr.arxiv.org/abs/2010.15530</link>
3665 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Carnerero_A/0/1/0/all/0/1">A. Daniel Carnerero</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ramirez_D/0/1/0/all/0/1">Daniel R. Ramirez</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Alamo_T/0/1/0/all/0/1">Teodoro Alamo</a></p>
3666
3667 <p>This work presents a new method to obtain probabilistic interval predictions
3668 of a dynamical system. The method uses stored past system measurements to
3669 estimate the future evolution of the system. The proposed method relies on the
3670 use of dissimilarity functions to estimate the conditional probability density
3671 function of the outputs. A family of empirical probability density functions,
3672 parameterized by means of two parameters, is introduced. It is shown that the
3673 the proposed family encompasses the multivariable normal probability density
3674 function as a particular case. We show that the proposed method constitutes a
3675 generalization of classical estimation methods. A cross-validation scheme is
3676 used to tune the two parameters on which the methodology relies. In order to
3677 prove the effectiveness of the methodology presented, some numerical examples
3678 and comparisons are provided.
3679 </p>
3680 </description>
3681 </item>
3682 <item>
3683 <title>Coordinated Formation Control for Intelligent and Connected Vehicles in Multiple Traffic Scenarios. (arXiv:2010.15531v1 [eess.SY])</title>
3684 <link>http://fr.arxiv.org/abs/2010.15531</link>
3685 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_Q/0/1/0/all/0/1">Qing Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Cai_M/0/1/0/all/0/1">Mengchi Cai</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_K/0/1/0/all/0/1">Keqiang Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_B/0/1/0/all/0/1">Biao Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_J/0/1/0/all/0/1">Jianqiang Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wu_X/0/1/0/all/0/1">Xiangbin Wu</a></p>
3686
3687 <p>In this paper, a unified multi-vehicle formation control framework for
3688 Intelligent and Connected Vehicles (ICVs) that can apply to multiple traffic
3689 scenarios is proposed. In the one-dimensional scenario, different formation
3690 geometries are analyzed and the interlaced structure is mathematically
3691 modelized to improve driving safety while making full use of the lane capacity.
3692 The assignment problem for vehicles and target positions is solved using
3693 Hungarian Algorithm to improve the flexibility of the method in multiple
3694 scenarios. In the two-dimensional scenario, an improved virtual platoon method
3695 is proposed to transfer the complex two-dimensional passing problem to the
3696 one-dimensional formation control problem based on the idea of rotation
3697 projection. Besides, the vehicle regrouping method is proposed to connect the
3698 two scenarios. Simulation results prove that the proposed multi-vehicle
3699 formation control framework can apply to multiple typical scenarios and have
3700 better performance than existing methods.
3701 </p>
3702 </description>
3703 </item>
3704 <item>
3705 <title>How do Offline Measures for Exploration in Reinforcement Learning behave?. (arXiv:2010.15533v1 [cs.LG])</title>
3706 <link>http://fr.arxiv.org/abs/2010.15533</link>
3707 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hollenstein_J/0/1/0/all/0/1">Jakob J. Hollenstein</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Auddy_S/0/1/0/all/0/1">Sayantan Auddy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Saveriano_M/0/1/0/all/0/1">Matteo Saveriano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Renaudo_E/0/1/0/all/0/1">Erwan Renaudo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Piater_J/0/1/0/all/0/1">Justus Piater</a></p>
3708
3709 <p>Sufficient exploration is paramount for the success of a reinforcement
3710 learning agent. Yet, exploration is rarely assessed in an algorithm-independent
3711 way. We compare the behavior of three data-based, offline exploration metrics
3712 described in the literature on intuitive simple distributions and highlight
3713 problems to be aware of when using them. We propose a fourth metric,uniform
3714 relative entropy, and implement it using either a k-nearest-neighbor or a
3715 nearest-neighbor-ratio estimator, highlighting that the implementation choices
3716 have a profound impact on these measures.
3717 </p>
3718 </description>
3719 </item>
3720 <item>
3721 <title>Poster: Benchmarking Financial Data Feed Systems. (arXiv:2010.15534v1 [cs.PF])</title>
3722 <link>http://fr.arxiv.org/abs/2010.15534</link>
3723 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Coenen_M/0/1/0/all/0/1">Manuel Coenen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wagner_C/0/1/0/all/0/1">Christoph Wagner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Echler_A/0/1/0/all/0/1">Alexander Echler</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frischbier_S/0/1/0/all/0/1">Sebastian Frischbier</a></p>
3724
3725 <p>Data-driven solutions for the investment industry require event-based backend
3726 systems to process high-volume financial data feeds with low latency, high
3727 throughput, and guaranteed delivery modes.
3728 </p>
3729 <p>At vwd we process an average of 18 billion incoming event notifications from
3730 500+ data sources for 30 million symbols per day and peak rates of 1+ million
3731 notifications per second using custom-built platforms that keep audit logs of
3732 every event.
3733 </p>
3734 <p>We currently assess modern open source event-processing platforms such as
3735 Kafka, NATS, Redis, Flink or Storm for the use in our ticker plant to reduce
3736 the maintenance effort for cross-cutting concerns and leverage hybrid
3737 deployment models. For comparability and repeatability we benchmark candidates
3738 with a standardized workload we derived from our real data feeds.
3739 </p>
3740 <p>We have enhanced an existing light-weight open source benchmarking tool in
3741 its processing, logging, and reporting capabilities to cope with our workloads.
3742 The resulting tool wrench can simulate workloads or replay snapshots in volume
3743 and dynamics like those we process in our ticker plant. We provide the tool as
3744 open source.
3745 </p>
3746 <p>As part of ongoing work we contribute details on (a) our workload and
3747 requirements for benchmarking candidate platforms for financial feed
3748 processing; (b) the current state of the tool wrench.
3749 </p>
3750 </description>
3751 </item>
3752 <item>
3753 <title>Unbabel's Participation in the WMT20 Metrics Shared Task. (arXiv:2010.15535v1 [cs.CL])</title>
3754 <link>http://fr.arxiv.org/abs/2010.15535</link>
3755 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rei_R/0/1/0/all/0/1">Ricardo Rei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stewart_C/0/1/0/all/0/1">Craig Stewart</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farinha_C/0/1/0/all/0/1">Catarina Farinha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lavie_A/0/1/0/all/0/1">Alon Lavie</a></p>
3756
3757 <p>We present the contribution of the Unbabel team to the WMT 2020 Shared Task
3758 on Metrics. We intend to participate on the segment-level, document-level and
3759 system-level tracks on all language pairs, as well as the 'QE as a Metric'
3760 track. Accordingly, we illustrate results of our models in these tracks with
3761 reference to test sets from the previous year. Our submissions build upon the
3762 recently proposed COMET framework: We train several estimator models to regress
3763 on different human-generated quality scores and a novel ranking model trained
3764 on relative ranks obtained from Direct Assessments. We also propose a simple
3765 technique for converting segment-level predictions into a document-level score.
3766 Overall, our systems achieve strong results for all language pairs on previous
3767 test sets and in many cases set a new state-of-the-art.
3768 </p>
3769 </description>
3770 </item>
3771 <item>
3772 <title>Matern Gaussian Processes on Graphs. (arXiv:2010.15538v1 [stat.ML])</title>
3773 <link>http://fr.arxiv.org/abs/2010.15538</link>
3774 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Borovitskiy_V/0/1/0/all/0/1">Viacheslav Borovitskiy</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Azangulov_I/0/1/0/all/0/1">Iskander Azangulov</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Terenin_A/0/1/0/all/0/1">Alexander Terenin</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Mostowsky_P/0/1/0/all/0/1">Peter Mostowsky</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Deisenroth_M/0/1/0/all/0/1">Marc Peter Deisenroth</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Durrande_N/0/1/0/all/0/1">Nicolas Durrande</a></p>
3775
3776 <p>Gaussian processes are a versatile framework for learning unknown functions
3777 in a manner that permits one to utilize prior information about their
3778 properties. Although many different Gaussian process models are readily
3779 available when the input space is Euclidean, the choice is much more limited
3780 for Gaussian processes whose input space is an undirected graph. In this work,
3781 we leverage the stochastic partial differential equation characterization of
3782 Mat\'{e}rn Gaussian processes - a widely-used model class in the Euclidean
3783 setting - to study their analog for undirected graphs. We show that the
3784 resulting Gaussian processes inherit various attractive properties of their
3785 Euclidean and Riemannian analogs and provide techniques that allow them to be
3786 trained using standard methods, such as inducing points. This enables graph
3787 Mat\'{e}rn Gaussian processes to be employed in mini-batch and non-conjugate
3788 settings, thereby making them more accessible to practitioners and easier to
3789 deploy within larger learning frameworks.
3790 </p>
3791 </description>
3792 </item>
3793 <item>
3794 <title>Micromagnetics of thin films in the presence of Dzyaloshinskii-Moriya interaction. (arXiv:2010.15541v1 [math.AP])</title>
3795 <link>http://fr.arxiv.org/abs/2010.15541</link>
3796 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Davoli_E/0/1/0/all/0/1">Elisa Davoli</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Fratta_G/0/1/0/all/0/1">Giovanni Di Fratta</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Praetorius_D/0/1/0/all/0/1">Dirk Praetorius</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ruggeri_M/0/1/0/all/0/1">Michele Ruggeri</a></p>
3797
3798 <p>In this paper, we study the thin-film limit of the micromagnetic energy
3799 functional in the presence of bulk Dzyaloshinskii-Moriya interaction (DMI). Our
3800 analysis includes both a stationary $\Gamma$-convergence result for the
3801 micromagnetic energy, as well as the identification of the asymptotic behavior
3802 of the associated Landau-Lifshitz-Gilbert equation. In particular, we prove
3803 that, in the limiting model, part of the DMI term behaves like the projection
3804 of the magnetic moment onto the normal to the film, contributing this way to an
3805 increase in the shape anisotropy arising from the magnetostatic self-energy.
3806 Finally, we discuss a convergent finite element approach for the approximation
3807 of the time-dependent case and use it to numerically compare the original
3808 three-dimensional model with the two-dimensional thin-film limit.
3809 </p>
3810 </description>
3811 </item>
3812 <item>
3813 <title>Systematic literature review protocol Identification and classification of feature modeling errors. (arXiv:2010.15545v1 [cs.SE])</title>
3814 <link>http://fr.arxiv.org/abs/2010.15545</link>
3815 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sepulveda_S/0/1/0/all/0/1">Samuel Sep&#xfa;lveda</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Diaz_J/0/1/0/all/0/1">Jaime D&#xed;az</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Esperguel_M/0/1/0/all/0/1">Marcelo Esperguel</a></p>
3816
3817 <p>Context: The importance of feature modeling languages for software product
3818 lines and the planning stage for a systematic literature review. Objective: A
3819 protocol for carrying out a systematic literature review about the evidence for
3820 identifying and classifying the errors in feature modeling languages. Method:
3821 The definition of a protocol to conduct a systematic literature review
3822 according to the guidelines of B. Kitchenham. Results: A validated protocol to
3823 conduct a systematic literature review. Conclusions: A proposal for the
3824 protocol definition of a systematic literature review about the identification
3825 and classification of errors in feature modeling was built. Initial results
3826 show that the effects and results for solving these errors should be carried
3827 out.
3828 </p>
3829 </description>
3830 </item>
3831 <item>
3832 <title>Multi-Constitutive Neural Network for Large Deformation Poromechanics Problem. (arXiv:2010.15549v1 [cs.LG])</title>
3833 <link>http://fr.arxiv.org/abs/2010.15549</link>
3834 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Q/0/1/0/all/0/1">Qi Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yilin Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_Z/0/1/0/all/0/1">Ziyi Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Darve_E/0/1/0/all/0/1">Eric Darve</a></p>
3835
3836 <p>In this paper, we study the problem of large-strain consolidation in
3837 poromechanics with deep neural networks. Given different material properties
3838 and different loading conditions, the goal is to predict pore pressure and
3839 settlement. We propose a novel method "multi-constitutive neural network"
3840 (MCNN) such that one model can solve several different constitutive laws. We
3841 introduce a one-hot encoding vector as an additional input vector, which is
3842 used to label the constitutive law we wish to solve. Then we build a DNN which
3843 takes as input (X, t) along with a constitutive model label and outputs the
3844 corresponding solution. It is the first time, to our knowledge, that we can
3845 evaluate multi-constitutive laws through only one training process while still
3846 obtaining good accuracies. We found that MCNN trained to solve multiple PDEs
3847 outperforms individual neural network solvers trained with PDE.
3848 </p>
3849 </description>
3850 </item>
3851 <item>
3852 <title>ADABOOK & MULTIBOOK: Adaptive Boosting with Chance Correction. (arXiv:2010.15550v1 [cs.LG])</title>
3853 <link>http://fr.arxiv.org/abs/2010.15550</link>
3854 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Powers_D/0/1/0/all/0/1">David M. W. Powers</a></p>
3855
3856 <p>There has been considerable interest in boosting and bagging, including the
3857 combination of the adaptive techniques of AdaBoost with the random selection
3858 with replacement techniques of Bagging. At the same time there has been a
3859 revisiting of the way we evaluate, with chance-corrected measures like Kappa,
3860 Informedness, Correlation or ROC AUC being advocated. This leads to the
3861 question of whether learning algorithms can do better by optimizing an
3862 appropriate chance corrected measure. Indeed, it is possible for a weak learner
3863 to optimize Accuracy to the detriment of the more reaslistic chance-corrected
3864 measures, and when this happens the booster can give up too early. This
3865 phenomenon is known to occur with conventional Accuracy-based AdaBoost, and the
3866 MultiBoost algorithm has been developed to overcome such problems using restart
3867 techniques based on bagging. This paper thus complements the theoretical work
3868 showing the necessity of using chance-corrected measures for evaluation, with
3869 empirical work showing how use of a chance-corrected measure can improve
3870 boosting. We show that the early surrender problem occurs in MultiBoost too, in
3871 multiclass situations, so that chance-corrected AdaBook and Multibook can beat
3872 standard Multiboost or AdaBoost, and we further identify which chance-corrected
3873 measures to use when.
3874 </p>
3875 </description>
3876 </item>
3877 <item>
3878 <title>Investigating the Robustness of Artificial Intelligent Algorithms with Mixture Experiments. (arXiv:2010.15551v1 [stat.ML])</title>
3879 <link>http://fr.arxiv.org/abs/2010.15551</link>
3880 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Lian_J/0/1/0/all/0/1">Jiayi Lian</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Freeman_L/0/1/0/all/0/1">Laura Freeman</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Hong_Y/0/1/0/all/0/1">Yili Hong</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Deng_X/0/1/0/all/0/1">Xinwei Deng</a></p>
3881
3882 <p>Artificial intelligent (AI) algorithms, such as deep learning and XGboost,
3883 are used in numerous applications including computer vision, autonomous
3884 driving, and medical diagnostics. The robustness of these AI algorithms is of
3885 great interest as inaccurate prediction could result in safety concerns and
3886 limit the adoption of AI systems. In this paper, we propose a framework based
3887 on design of experiments to systematically investigate the robustness of AI
3888 classification algorithms. A robust classification algorithm is expected to
3889 have high accuracy and low variability under different application scenarios.
3890 The robustness can be affected by a wide range of factors such as the imbalance
3891 of class labels in the training dataset, the chosen prediction algorithm, the
3892 chosen dataset of the application, and a change of distribution in the training
3893 and test datasets. To investigate the robustness of AI classification
3894 algorithms, we conduct a comprehensive set of mixture experiments to collect
3895 prediction performance results. Then statistical analyses are conducted to
3896 understand how various factors affect the robustness of AI classification
3897 algorithms. We summarize our findings and provide suggestions to practitioners
3898 in AI applications.
3899 </p>
3900 </description>
3901 </item>
3902 <item>
3903 <title>Successive Halving Top-k Operator. (arXiv:2010.15552v1 [cs.LG])</title>
3904 <link>http://fr.arxiv.org/abs/2010.15552</link>
3905 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Pietruszka_M/0/1/0/all/0/1">Micha&#x142; Pietruszka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Borchmann_L/0/1/0/all/0/1">&#x141;ukasz Borchmann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gralinski_F/0/1/0/all/0/1">Filip Grali&#x144;ski</a></p>
3906
3907 <p>We propose a differentiable successive halving method of relaxing the top-k
3908 operator, rendering gradient-based optimization possible. The need to perform
3909 softmax iteratively on the entire vector of scores is avoided by using a
3910 tournament-style selection. As a result, a much better approximation of top-k
3911 with lower computational cost is achieved compared to the previous approach.
3912 </p>
3913 </description>
3914 </item>
3915 <item>
3916 <title>Modulation Pattern Detection Using Complex Convolutions in Deep Learning. (arXiv:2010.15556v1 [cs.LG])</title>
3917 <link>http://fr.arxiv.org/abs/2010.15556</link>
3918 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Krzyston_J/0/1/0/all/0/1">Jakob Krzyston</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bhattacharjea_R/0/1/0/all/0/1">Rajib Bhattacharjea</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stark_A/0/1/0/all/0/1">Andrew Stark</a></p>
3919
3920 <p>Transceivers used for telecommunications transmit and receive specific
3921 modulation patterns that are represented as sequences of complex numbers.
3922 Classifying modulation patterns is challenging because noise and channel
3923 impairments affect the signals in complicated ways such that the received
3924 signal bears little resemblance to the transmitted signal. Although deep
3925 learning approaches have shown great promise over statistical methods in this
3926 problem space, deep learning frameworks continue to lag in support for
3927 complex-valued data. To address this gap, we study the implementation and use
3928 of complex convolutions in a series of convolutional neural network
3929 architectures. Replacement of data structure and convolution operations by
3930 their complex generalization in an architecture improves performance, with
3931 statistical significance, at recognizing modulation patterns in complex-valued
3932 signals with high SNR after being trained on low SNR signals. This suggests
3933 complex-valued convolutions enables networks to learn more meaningful
3934 representations. We investigate this hypothesis by comparing the features
3935 learned in each experiment by visualizing the inputs that results in one-hot
3936 modulation pattern classification for each network.
3937 </p>
3938 </description>
3939 </item>
3940 <item>
3941 <title>Quantum Computing: A Taxonomy, Systematic Review and Future Directions. (arXiv:2010.15559v1 [cs.ET])</title>
3942 <link>http://fr.arxiv.org/abs/2010.15559</link>
3943 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gill_S/0/1/0/all/0/1">Sukhpal Singh Gill</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kumar_A/0/1/0/all/0/1">Adarsh Kumar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_H/0/1/0/all/0/1">Harvinder Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_M/0/1/0/all/0/1">Manmeet Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kaur_K/0/1/0/all/0/1">Kamalpreet Kaur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Usman_M/0/1/0/all/0/1">Muhammad Usman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Buyya_R/0/1/0/all/0/1">Rajkumar Buyya</a></p>
3944
3945 <p>Quantum computing is an emerging paradigm with the potential to offer
3946 significant computational advantage over conventional classical computing by
3947 exploiting quantum-mechanical principles such as entanglement and
3948 superposition. It is anticipated that this computational advantage of quantum
3949 computing will help to solve many complex and computationally intractable
3950 problems in several areas of research such as drug design, data science, clean
3951 energy, finance, industrial chemical development, secure communications, and
3952 quantum chemistry, among others. In recent years, tremendous progress in both
3953 quantum hardware development and quantum software/algorithm have brought
3954 quantum computing much closer to reality. As the quantum devices are expected
3955 to steadily scale up in the next few years, quantum decoherence and qubit
3956 interconnectivity are two of the major challenges to achieve quantum advantage
3957 in the NISQ era. Quantum computing is a highly topical and fast-moving field of
3958 research with significant ongoing progress in all facets. A systematic review
3959 of the existing literature on quantum computing will be invaluable to
3960 understand the current status of this emerging field and identify open
3961 challenges for the quantum computing community in the coming years. This review
3962 article presents a comprehensive review of quantum computing literature, and
3963 taxonomy of quantum computing. Further, the proposed taxonomy is used to map
3964 various related studies to identify the research gaps. A detailed overview of
3965 quantum software tools and technologies, post-quantum cryptography and quantum
3966 computer hardware development to document the current state-of-the-art in the
3967 respective areas. We finish the article by highlighting various open challenges
3968 and promising future directions for research.
3969 </p>
3970 </description>
3971 </item>
3972 <item>
3973 <title>Genetic U-Net: Automatically Designing Lightweight U-shaped CNN Architectures Using the Genetic Algorithm for Retinal Vessel Segmentation. (arXiv:2010.15560v1 [eess.IV])</title>
3974 <link>http://fr.arxiv.org/abs/2010.15560</link>
3975 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Wei_J/0/1/0/all/0/1">Jiahong Wei</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Fan_Z/0/1/0/all/0/1">Zhun Fan</a></p>
3976
3977 <p>Many previous works based on deep learning for retinal vessel segmentation
3978 have achieved promising performance by manually designing U-shaped
3979 convolutional neural networks (CNNs). However, the manual design of these CNNs
3980 is time-consuming and requires extensive empirical knowledge. To address this
3981 problem, we propose a novel method using genetic algorithms (GAs) to
3982 automatically design a lightweight U-shaped CNN for retinal vessel
3983 segmentation, called Genetic U-Net. Here we first design a special search space
3984 containing the structure of U-Net and its corresponding operations, and then
3985 use genetic algorithm to search for superior architectures in this search
3986 space. Experimental results show that the proposed method outperforms the
3987 existing methods on three public datasets, DRIVE, CHASE_DB1 and STARE. In
3988 addition, the architectures obtained by the proposed method are more
3989 lightweight but robust than the state-of-the-art models.
3990 </p>
3991 </description>
3992 </item>
3993 <item>
3994 <title>Federated Transfer Learning: concept and applications. (arXiv:2010.15561v1 [cs.LG])</title>
3995 <link>http://fr.arxiv.org/abs/2010.15561</link>
3996 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Saha_S/0/1/0/all/0/1">Sudipan Saha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ahmad_T/0/1/0/all/0/1">Tahir Ahmad</a></p>
3997
3998 <p>Development of Artificial Intelligence (AI) is inherently tied to the
3999 development of data. However, in most industries data exists in form of
4000 isolated islands, with limited scope of sharing between different
4001 organizations. This is an hindrance to the further development of AI. Federated
4002 learning has emerged as a possible solution to this problem in the last few
4003 years without compromising user privacy. Among different variants of the
4004 federated learning, noteworthy is federated transfer learning (FTL) that allows
4005 knowledge to be transferred across domains that do not have many overlapping
4006 features and users. In this work we provide a comprehensive survey of the
4007 existing works on this topic. In more details, we study the background of FTL
4008 and its different existing applications. We further analyze FTL from privacy
4009 and machine learning perspective.
4010 </p>
4011 </description>
4012 </item>
4013 <item>
4014 <title>Limitations of the recall capabilities in delay based reservoir computing systems. (arXiv:2010.15562v1 [cs.ET])</title>
4015 <link>http://fr.arxiv.org/abs/2010.15562</link>
4016 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Koster_F/0/1/0/all/0/1">Felix K&#xf6;ster</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ehlert_D/0/1/0/all/0/1">Dominik Ehlert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ludge_K/0/1/0/all/0/1">Kathy L&#xfc;dge</a></p>
4017
4018 <p>We analyze the memory capacity of a delay based reservoir computer with a
4019 Hopf normal form as nonlinearity and numerically compute the linear as well as
4020 the higher order recall capabilities. A possible physical realisation could be
4021 a laser with external cavity, for which the information is fed via electrical
4022 injection. A task independent quantification of the computational capability of
4023 the reservoir system is done via a complete orthonormal set of basis functions.
4024 Our results suggest that even for constant readout dimension the total memory
4025 capacity is dependent on the ratio between the information input period, also
4026 called the clock cycle, and the time delay in the system. Optimal performance
4027 is found for a time delay about 1.6 times the clock cycle
4028 </p>
4029 </description>
4030 </item>
4031 <item>
4032 <title>Overcoming The Limitations of Neural Networks in Composite-Pattern Learning with Architopes. (arXiv:2010.15571v1 [cs.NE])</title>
4033 <link>http://fr.arxiv.org/abs/2010.15571</link>
4034 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kratsios_A/0/1/0/all/0/1">Anastasis Kratsios</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zamanlooy_B/0/1/0/all/0/1">Behnoosh Zamanlooy</a></p>
4035
4036 <p>The effectiveness of neural networks in solving complex problems is well
4037 recognized; however, little is known about their limitations. We demonstrate
4038 that the feed-forward architecture, for most commonly used activation
4039 functions, is incapable of approximating functions comprised of multiple
4040 sub-patterns while simultaneously respecting their composite-pattern structure.
4041 We overcome this bottleneck with a simple architecture modification that
4042 reallocates the neurons of any single feed-forward network across several
4043 smaller sub-networks, each specialized on a distinct part of the input-space.
4044 The modified architecture, called an Architope, is more expressive on two
4045 fronts. First, it is dense in an associated space of piecewise continuous
4046 functions in which the feed-forward architecture is not dense. Second, it
4047 achieves the same approximation rate as the feed-forward networks while only
4048 requiring $\mathscr{O}(N^{-1})$ fewer parameters in its hidden layers.
4049 Moreover, the architecture achieves these approximation improvements while
4050 preserving the target's composite-pattern structure.
4051 </p>
4052 </description>
4053 </item>
4054 <item>
4055 <title>Experimental Analysis of Communication Relaying Delay in Low-Energy Ad-hoc Networks. (arXiv:2010.15572v1 [cs.NI])</title>
4056 <link>http://fr.arxiv.org/abs/2010.15572</link>
4057 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Miya_T/0/1/0/all/0/1">Taichi Miya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ohshima_K/0/1/0/all/0/1">Kohta Ohshima</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kitaguchi_Y/0/1/0/all/0/1">Yoshiaki Kitaguchi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yamaoka_K/0/1/0/all/0/1">Katsunori Yamaoka</a></p>
4058
4059 <p>In recent years, more and more applications use ad-hoc networks for local M2M
4060 communications, but in some cases such as when using WSNs, the software
4061 processing delay induced by packets relaying may not be negligible. In this
4062 paper, we planned and carried out a delay measurement experiment using
4063 Raspberry Pi Zero W. The results demonstrated that, in low-energy ad-hoc
4064 networks, processing delay of the application is always too large to ignore; it
4065 is at least ten times greater than the kernel routing and corresponds to 30% of
4066 the transmission delay. Furthermore, if the task is CPU-intensive, such as
4067 packet encryption, the processing delay can be greater than the transmission
4068 delay and its behavior is represented by a simple linear model. Our findings
4069 indicate that the key factor for achieving QoS in ad-hoc networks is an
4070 appropriate node-to-node load balancing that takes into account the CPU
4071 performance and the amount of traffic passing through each node.
4072 </p>
4073 </description>
4074 </item>
4075 <item>
4076 <title>Import test questions into Moodle LMS. (arXiv:2010.15577v1 [cs.CY])</title>
4077 <link>http://fr.arxiv.org/abs/2010.15577</link>
4078 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mintii_I/0/1/0/all/0/1">Iryna S. Mintii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shokaliuk_S/0/1/0/all/0/1">Svitlana V. Shokaliuk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vakaliuk_T/0/1/0/all/0/1">Tetiana A. Vakaliuk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mintii_M/0/1/0/all/0/1">Mykhailo M. Mintii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Soloviev_V/0/1/0/all/0/1">Vladimir N. Soloviev</a></p>
4079
4080 <p>The purpose of the study is to highlight the theoretical and methodological
4081 aspects of preparing the test questions of the most common types in the form of
4082 text files for further import into learning management system (LMS) Moodle. The
4083 subject of the research is the automated filling of the Moodle LMS test
4084 database. The objectives of the study: to analyze the import files of test
4085 questions, their advantages and disadvantages; to develop guidelines for the
4086 preparation of test questions of common types in the form of text files for
4087 further import into Moodle LMS. The action algorithms for importing questions
4088 and instructions for submitting question files in such formats as Aiken, GIFT,
4089 Moodle XML, "True/False" questions, "Multiple Choice" (one of many and many of
4090 many), "Matching", with an open answer - "Numerical" or "Short answer" and
4091 "Essay" are offered in this article. The formats for submitting questions,
4092 examples of its designing and developed questions were demonstrated in view
4093 mode in Moodle LMS.
4094 </p>
4095 </description>
4096 </item>
4097 <item>
4098 <title>Exploring the Nuances of Designing (with/for) Artificial Intelligence. (arXiv:2010.15578v1 [cs.CY])</title>
4099 <link>http://fr.arxiv.org/abs/2010.15578</link>
4100 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Stoimenova_N/0/1/0/all/0/1">Niya Stoimenova</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Price_R/0/1/0/all/0/1">Rebecca Price</a></p>
4101
4102 <p>Solutions relying on artificial intelligence are devised to predict data
4103 patterns and answer questions that are clearly defined, involve an enumerable
4104 set of solutions, clear rules, and inherently binary decision mechanisms. Yet,
4105 as they become exponentially implemented in our daily activities, they begin to
4106 transcend these initial boundaries and to affect the larger sociotechnical
4107 system in which they are situated. In this arrangement, a solution is under
4108 pressure to surpass true or false criteria and move to an ethical evaluation of
4109 right and wrong. Neither algorithmic solutions, nor purely humanistic ones will
4110 be enough to fully mitigate undesirable outcomes in the narrow state of AI or
4111 its future incarnations. We must take a holistic view. In this paper we explore
4112 the construct of infrastructure as a means to simultaneously address
4113 algorithmic and societal issues when designing AI.
4114 </p>
4115 </description>
4116 </item>
4117 <item>
4118 <title>Modeling biomedical breathing signals with convolutional deep probabilistic autoencoders. (arXiv:2010.15579v1 [cs.LG])</title>
4119 <link>http://fr.arxiv.org/abs/2010.15579</link>
4120 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Pastor_Serrano_O/0/1/0/all/0/1">Oscar Pastor-Serrano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lathouwers_D/0/1/0/all/0/1">Danny Lathouwers</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Perko_Z/0/1/0/all/0/1">Zolt&#xe1;n Perk&#xf3;</a></p>
4121
4122 <p>One of the main problems with biomedical signals is the limited amount of
4123 patient-specific data and the significant amount of time needed to record a
4124 sufficient number of samples for diagnostic and treatment purposes. We explore
4125 the use of Variational Autoencoder (VAE) and Adversarial Autoencoder (AAE)
4126 algorithms based on one-dimensional convolutional neural networks in order to
4127 build generative models able to capture and represent the variability of a set
4128 of unlabeled quasi-periodic signals using as few as 10 parameters. Furthermore,
4129 we introduce a modified AAE architecture that allows simultaneous
4130 semi-supervised classification and generation of different types of signals.
4131 Our study is based on physical breathing signals, i.e. time series describing
4132 the position of chest markers, generally used to describe respiratory motion.
4133 The time series are discretized into a vector of periods, with each period
4134 containing 6 time and position values. These vectors can be transformed back
4135 into time series through an additional reconstruction neural network and allow
4136 to generate extended signals while simplifying the modeling task. The obtained
4137 models can be used to generate realistic breathing realizations from patient or
4138 population data and to classify new recordings. We show that by incorporating
4139 the labels from around 10-15\% of the dataset during training, the model can be
4140 guided to group data according to the patient it belongs to, or based on the
4141 presence of different types of breathing irregularities such as baseline
4142 shifts. Our specific motivation is to model breathing motion during
4143 radiotherapy lung cancer treatments, for which the developed model serves as an
4144 efficient tool to robustify plans against breathing uncertainties. However, the
4145 same methodology can in principle be applied to any other kind of
4146 quasi-periodic biomedical signal, representing a generically applicable tool.
4147 </p>
4148 </description>
4149 </item>
4150 <item>
4151 <title>The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research. (arXiv:2010.15581v1 [cs.CY])</title>
4152 <link>http://fr.arxiv.org/abs/2010.15581</link>
4153 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ahmed_N/0/1/0/all/0/1">Nur Ahmed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wahed_M/0/1/0/all/0/1">Muntasir Wahed</a></p>
4154
4155 <p>Increasingly, modern Artificial Intelligence (AI) research has become more
4156 computationally intensive. However, a growing concern is that due to unequal
4157 access to computing power, only certain firms and elite universities have
4158 advantages in modern AI research. Using a novel dataset of 171394 papers from
4159 57 prestigious computer science conferences, we document that firms, in
4160 particular, large technology firms and elite universities have increased
4161 participation in major AI conferences since deep learning's unanticipated rise
4162 in 2012. The effect is concentrated among elite universities, which are ranked
4163 1-50 in the QS World University Rankings. Further, we find two strategies
4164 through which firms increased their presence in AI research: first, they have
4165 increased firm-only publications; and second, firms are collaborating primarily
4166 with elite universities. Consequently, this increased presence of firms and
4167 elite universities in AI research has crowded out mid-tier (QS ranked 201-300)
4168 and lower-tier (QS ranked 301-500) universities. To provide causal evidence
4169 that deep learning's unanticipated rise resulted in this divergence, we
4170 leverage the generalized synthetic control method, a data-driven counterfactual
4171 estimator. Using machine learning based text analysis methods, we provide
4172 additional evidence that the divergence between these two groups - large firms
4173 and non-elite universities - is driven by access to computing power or compute,
4174 which we term as the "compute divide". This compute divide between large firms
4175 and non-elite universities increases concerns around bias and fairness within
4176 AI technology, and presents an obstacle towards "democratizing" AI. These
4177 results suggest that a lack of access to specialized equipment such as compute
4178 can de-democratize knowledge production.
4179 </p>
4180 </description>
4181 </item>
4182 <item>
4183 <title>Improving Accuracy of Federated Learning in Non-IID Settings. (arXiv:2010.15582v1 [cs.LG])</title>
4184 <link>http://fr.arxiv.org/abs/2010.15582</link>
4185 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ozdayi_M/0/1/0/all/0/1">Mustafa Safa Ozdayi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kantarcioglu_M/0/1/0/all/0/1">Murat Kantarcioglu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Iyer_R/0/1/0/all/0/1">Rishabh Iyer</a></p>
4186
4187 <p>Federated Learning (FL) is a decentralized machine learning protocol that
4188 allows a set of participating agents to collaboratively train a model without
4189 sharing their data. This makes FL particularly suitable for settings where data
4190 privacy is desired. However, it has been observed that the performance of FL is
4191 closely tied with the local data distributions of agents. Particularly, in
4192 settings where local data distributions vastly differ among agents, FL performs
4193 rather poorly with respect to the centralized training. To address this
4194 problem, we hypothesize the reasons behind the performance degradation, and
4195 develop some techniques to address these reasons accordingly. In this work, we
4196 identify four simple techniques that can improve the performance of trained
4197 models without incurring any additional communication overhead to FL, but
4198 rather, some light computation overhead either on the client, or the
4199 server-side. In our experimental analysis, combination of our techniques
4200 improved the validation accuracy of a model trained via FL by more than 12%
4201 with respect to our baseline. This is about 5% less than the accuracy of the
4202 model trained on centralized data.
4203 </p>
4204 </description>
4205 </item>
4206 <item>
4207 <title>Probabilistic Transformers. (arXiv:2010.15583v1 [cs.LG])</title>
4208 <link>http://fr.arxiv.org/abs/2010.15583</link>
4209 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Movellan_J/0/1/0/all/0/1">Javier R. Movellan</a></p>
4210
4211 <p>We show that Transformers are Maximum Posterior Probability estimators for
4212 Mixtures of Gaussian Models. This brings a probabilistic point of view to
4213 Transformers and suggests extensions to other probabilistic cases.
4214 </p>
4215 </description>
4216 </item>
4217 <item>
4218 <title>Future Directions of the Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Program. (arXiv:2010.15584v1 [cs.CY])</title>
4219 <link>http://fr.arxiv.org/abs/2010.15584</link>
4220 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Arora_R/0/1/0/all/0/1">Ritu Arora</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Li_X/0/1/0/all/0/1">Xiaosong Li</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Hurwitz_B/0/1/0/all/0/1">Bonnie Hurwitz</a> (3), <a href="http://fr.arxiv.org/find/cs/1/au:+Fay_D/0/1/0/all/0/1">Daniel Fay</a> (4), <a href="http://fr.arxiv.org/find/cs/1/au:+Panda_D/0/1/0/all/0/1">Dhabaleswar K. Panda</a> (5), <a href="http://fr.arxiv.org/find/cs/1/au:+Valeev_E/0/1/0/all/0/1">Edward Valeev</a> (6), <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shaowen Wang</a> (7), <a href="http://fr.arxiv.org/find/cs/1/au:+Moore_S/0/1/0/all/0/1">Shirley Moore</a> (8), <a href="http://fr.arxiv.org/find/cs/1/au:+Chandrasekaran_S/0/1/0/all/0/1">Sunita Chandrasekaran</a> (9), <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_T/0/1/0/all/0/1">Ting Cao</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Bik_H/0/1/0/all/0/1">Holly Bik</a> (10), <a href="http://fr.arxiv.org/find/cs/1/au:+Curry_M/0/1/0/all/0/1">Matthew Curry</a> (11), <a href="http://fr.arxiv.org/find/cs/1/au:+Islam_T/0/1/0/all/0/1">Tanzima Islam</a> (12) ((1) Texas Advanced Computing Center, (2) University of Washington, (3) University of Arizona, (4) Microsoft, (5) The Ohio State University, (6) Virginia Tech University, (7) University of Illinois, (8) Oak Ridge National Lab, (9) University of Delaware, (10) University of California, Riverside, (11) Sandia National Lab, (12) Texas State University)</p>
4221
4222 <p>The CSSI 2019 workshop was held on October 28-29, 2019, in Austin, Texas. The
4223 main objectives of this workshop were to (1) understand the impact of the CSSI
4224 program on the community over the last 9 years, (2) engage workshop
4225 participants in identifying gaps and opportunities in the current CSSI
4226 landscape, (3) gather ideas on the cyberinfrastructure needs and expectations
4227 of the community with respect to the CSSI program, and (4) prepare a report
4228 summarizing the feedback gathered from the community that can inform the future
4229 solicitations of the CSSI program. The workshop brought together different
4230 stakeholders interested in provisioning sustainable cyberinfrastructure that
4231 can power discoveries impacting the various fields of science and technology
4232 and maintaining the nation's competitiveness in the areas such as scientific
4233 software, HPC, networking, cybersecurity, and data/information science. The
4234 workshop served as a venue for gathering the community-feedback on the current
4235 state of the CSSI program and its future directions.
4236 </p>
4237 </description>
4238 </item>
4239 <item>
4240 <title>Panel: Economic Policy and Governance during Pandemics using AI. (arXiv:2010.15585v1 [cs.CY])</title>
4241 <link>http://fr.arxiv.org/abs/2010.15585</link>
4242 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Batarseh_F/0/1/0/all/0/1">Feras A. Batarseh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gopinath_M/0/1/0/all/0/1">Munisamy Gopinath</a></p>
4243
4244 <p>The global food supply chain (starting at farms and ending with consumers)
4245 has been seriously disrupted by many outlier events such as trade wars, the
4246 China demand shock, natural disasters, and pandemics. Outlier events create
4247 uncertainty along the entire supply chain in addition to intervening policy
4248 responses to mitigate their adverse effects. Artificial Intelligence (AI)
4249 methods (i.e. machine/reinforcement/deep learning) provide an opportunity to
4250 better understand outcomes during outlier events by identifying regular,
4251 irregular and contextual components. Employing AI can provide guidance to
4252 decision making suppliers, farmers, processors, wholesalers, and retailers
4253 along the supply chain, and policy makers to facilitate welfare-improving
4254 outcomes. This panel discusses these issues.
4255 </p>
4256 </description>
4257 </item>
4258 <item>
4259 <title>Event-Driven Learning of Systematic Behaviours in Stock Markets. (arXiv:2010.15586v1 [q-fin.ST])</title>
4260 <link>http://fr.arxiv.org/abs/2010.15586</link>
4261 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-fin/1/au:+Wu_X/0/1/0/all/0/1">Xianchao Wu</a></p>
4262
4263 <p>It is reported that financial news, especially financial events expressed in
4264 news, provide information to investors' long/short decisions and influence the
4265 movements of stock markets. Motivated by this, we leverage financial event
4266 streams to train a classification neural network that detects latent
4267 event-stock linkages and stock markets' systematic behaviours in the U.S. stock
4268 market. Our proposed pipeline includes (1) a combined event extraction method
4269 that utilizes Open Information Extraction and neural co-reference resolution,
4270 (2) a BERT/ALBERT enhanced representation of events, and (3) an extended
4271 hierarchical attention network that includes attentions on event, news and
4272 temporal levels. Our pipeline achieves significantly better accuracies and
4273 higher simulated annualized returns than state-of-the-art models when being
4274 applied to predicting Standard\&amp;Poor 500, Dow Jones, Nasdaq indices and 10
4275 individual stocks.
4276 </p>
4277 </description>
4278 </item>
4279 <item>
4280 <title>Impact of (SARS-CoV-2) COVID 19 on the indigenous language-speaking population in Mexico. (arXiv:2010.15588v1 [cs.CY])</title>
4281 <link>http://fr.arxiv.org/abs/2010.15588</link>
4282 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Medel_Ramirez_C/0/1/0/all/0/1">Carlos Medel-Ramirez</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Medel_Lopez_H/0/1/0/all/0/1">Hilario Medel-Lopez</a></p>
4283
4284 <p>The importance of the working document is that it allows the analysis of the
4285 information and the status of cases associated with (SARS-CoV-2) COVID-19 as
4286 open data at the municipal, state and national level, with a daily record of
4287 patients, according to a age, sex, comorbidities, for the condition of
4288 (SARS-CoV-2) COVID-19 according to the following characteristics: a) Positive,
4289 b) Negative, c) Suspicious. Likewise, it presents information related to the
4290 identification of an outpatient and / or hospitalized patient, attending to
4291 their medical development, identifying: a) Recovered, b) Deaths and c) Active,
4292 in Phase 3 and Phase 4, in the five main population areas speaker of indigenous
4293 language in the State of Veracruz - Mexico. The data analysis is carried out
4294 through the application of a data mining algorithm, which provides the
4295 information, fast and timely, required for the estimation of Medical Care
4296 Scenarios of (SARS-CoV-2) COVID-19, as well as for know the impact on the
4297 indigenous language-speaking population in Mexico.
4298 </p>
4299 </description>
4300 </item>
4301 <item>
4302 <title>Enjeux \'ethiques de l'IA en sant\'e : une humanisation du parcours de soin par l'intelligence artificielle ?. (arXiv:2010.15590v1 [cs.CY])</title>
4303 <link>http://fr.arxiv.org/abs/2010.15590</link>
4304 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Muhlenbach_F/0/1/0/all/0/1">Fabrice Muhlenbach</a></p>
4305
4306 <p>Considering the use of artificial intelligence for greater personalization of
4307 patient care and better management of human and material resources may seem
4308 like an opportunity not to be missed. In order to offer a better humanization
4309 of the care pathway, artificial intelligence is a tool that decision-makers in
4310 the hospital sector must appropriate by taking care of the new ethical issues
4311 and conflicts of values that this technology generates.
4312 </p>
4313 <p>Envisager le recours \`a l'intelligence artificielle pour une plus grande
4314 personnalisation de la prise en charge du patient et une meilleure gestion des
4315 ressources humaines et mat\'erielles peut sembler une opportunit\'e \`a ne pas
4316 manquer. Afin de proposer une meilleure humanisation du parcours de soin,
4317 l'intelligence artificielle est un outil que les d\'ecideurs du milieu
4318 hospitalier doivent s'approprier en veillant aux nouveaux enjeux \'ethiques et
4319 conflits de valeurs que cette technologie engendre.
4320 </p>
4321 </description>
4322 </item>
4323 <item>
4324 <title>Shared Space Transfer Learning for analyzing multi-site fMRI data. (arXiv:2010.15594v1 [cs.LG])</title>
4325 <link>http://fr.arxiv.org/abs/2010.15594</link>
4326 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yousefnezhad_M/0/1/0/all/0/1">Muhammad Yousefnezhad</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Selvitella_A/0/1/0/all/0/1">Alessandro Selvitella</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_D/0/1/0/all/0/1">Daoqiang Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Greenshaw_A/0/1/0/all/0/1">Andrew J. Greenshaw</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Greiner_R/0/1/0/all/0/1">Russell Greiner</a></p>
4327
4328 <p>Multi-voxel pattern analysis (MVPA) learns predictive models from task-based
4329 functional magnetic resonance imaging (fMRI) data, for distinguishing when
4330 subjects are performing different cognitive tasks -- e.g., watching movies or
4331 making decisions. MVPA works best with a well-designed feature set and an
4332 adequate sample size. However, most fMRI datasets are noisy, high-dimensional,
4333 expensive to collect, and with small sample sizes. Further, training a robust,
4334 generalized predictive model that can analyze homogeneous cognitive tasks
4335 provided by multi-site fMRI datasets has additional challenges. This paper
4336 proposes the Shared Space Transfer Learning (SSTL) as a novel transfer learning
4337 (TL) approach that can functionally align homogeneous multi-site fMRI datasets,
4338 and so improve the prediction performance in every site. SSTL first extracts a
4339 set of common features for all subjects in each site. It then uses TL to map
4340 these site-specific features to a site-independent shared space in order to
4341 improve the performance of the MVPA. SSTL uses a scalable optimization
4342 procedure that works effectively for high-dimensional fMRI datasets. The
4343 optimization procedure extracts the common features for each site by using a
4344 single-iteration algorithm and maps these site-specific common features to the
4345 site-independent shared space. We evaluate the effectiveness of the proposed
4346 method for transferring between various cognitive tasks. Our comprehensive
4347 experiments validate that SSTL achieves superior performance to other
4348 state-of-the-art analysis techniques.
4349 </p>
4350 </description>
4351 </item>
4352 <item>
4353 <title>Verification of Patterns. (arXiv:2010.15596v1 [cs.LO])</title>
4354 <link>http://fr.arxiv.org/abs/2010.15596</link>
4355 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yong Wang</a></p>
4356
4357 <p>The software patterns provide building blocks to the design and
4358 implementation of a software system, and try to make the software engineering
4359 to progress from experience to science. The software patterns were made famous
4360 because of the introduction as the design patterns. After that, patterns have
4361 been researched and developed widely and rapidly. The series of books of
4362 pattern-oriented software architecture should be marked in the development of
4363 software patterns. As mentioned in these books, formalization of patterns and
4364 an intermediate pattern language are needed and should be developed in the
4365 future of patterns. So, in this book, we formalize software patterns according
4366 to the categories of the series of books of pattern-oriented software
4367 architecture, and verify the correctness of patterns based on truly concurrent
4368 process algebra. In one aspect, patterns are formalized and verified; in the
4369 other aspect, truly concurrent process algebra can play a role of an
4370 intermediate pattern language for its rigorous theory.
4371 </p>
4372 </description>
4373 </item>
4374 <item>
4375 <title>Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control. (arXiv:2010.15597v1 [cs.LG])</title>
4376 <link>http://fr.arxiv.org/abs/2010.15597</link>
4377 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rahmani_H/0/1/0/all/0/1">Hamid Radmard Rahmani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Koenke_C/0/1/0/all/0/1">Carsten Koenke</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wiering_M/0/1/0/all/0/1">Marco A. Wiering</a></p>
4378
4379 <p>In many reinforcement learning (RL) problems, it takes some time until a
4380 taken action by the agent reaches its maximum effect on the environment and
4381 consequently the agent receives the reward corresponding to that action by a
4382 delay called action-effect delay. Such delays reduce the performance of the
4383 learning algorithm and increase the computational costs, as the reinforcement
4384 learning agent values the immediate rewards more than the future reward that is
4385 more related to the taken action. This paper addresses this issue by
4386 introducing an applicable enhanced Q-learning method in which at the beginning
4387 of the learning phase, the agent takes a single action and builds a function
4388 that reflects the environments response to that action, called the reflexive
4389 $\gamma$ - function. During the training phase, the agent utilizes the created
4390 reflexive $\gamma$- function to update the Q-values. We have applied the
4391 developed method to a structural control problem in which the goal of the agent
4392 is to reduce the vibrations of a building subjected to earthquake excitations
4393 with a specified delay. Seismic control problems are considered as a complex
4394 task in structural engineering because of the stochastic and unpredictable
4395 nature of earthquakes and the complex behavior of the structure. Three
4396 scenarios are presented to study the effects of zero, medium, and long
4397 action-effect delays and the performance of the Enhanced method is compared to
4398 the standard Q-learning method. Both RL methods use neural network to learn to
4399 estimate the state-action value function that is used to control the structure.
4400 The results show that the enhanced method significantly outperforms the
4401 performance of the original method in all cases, and also improves the
4402 stability of the algorithm in dealing with action-effect delays.
4403 </p>
4404 </description>
4405 </item>
4406 <item>
4407 <title>May I Ask Who's Calling? Named Entity Recognition on Call Center Transcripts for Privacy Law Compliance. (arXiv:2010.15598v1 [cs.CL])</title>
4408 <link>http://fr.arxiv.org/abs/2010.15598</link>
4409 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kaplan_M/0/1/0/all/0/1">Micaela Kaplan</a></p>
4410
4411 <p>We investigate using Named Entity Recognition on a new type of user-generated
4412 text: a call center conversation. These conversations combine problems from
4413 spontaneous speech with problems novel to conversational Automated Speech
4414 Recognition, including incorrect recognition, alongside other common problems
4415 from noisy user-generated text. Using our own corpus with new annotations,
4416 training custom contextual string embeddings, and applying a BiLSTM-CRF, we
4417 match state-of-the-art results on our novel task.
4418 </p>
4419 </description>
4420 </item>
4421 <item>
4422 <title>Expert Selection in High-Dimensional Markov Decision Processes. (arXiv:2010.15599v1 [cs.LG])</title>
4423 <link>http://fr.arxiv.org/abs/2010.15599</link>
4424 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rubies_Royo_V/0/1/0/all/0/1">Vicenc Rubies-Royo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mazumdar_E/0/1/0/all/0/1">Eric Mazumdar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_R/0/1/0/all/0/1">Roy Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tomlin_C/0/1/0/all/0/1">Claire Tomlin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sastry_S/0/1/0/all/0/1">S. Shankar Sastry</a></p>
4425
4426 <p>In this work we present a multi-armed bandit framework for online expert
4427 selection in Markov decision processes and demonstrate its use in
4428 high-dimensional settings. Our method takes a set of candidate expert policies
4429 and switches between them to rapidly identify the best performing expert using
4430 a variant of the classical upper confidence bound algorithm, thus ensuring low
4431 regret in the overall performance of the system. This is useful in applications
4432 where several expert policies may be available, and one needs to be selected at
4433 run-time for the underlying environment.
4434 </p>
4435 </description>
4436 </item>
4437 <item>
4438 <title>Three computational models and its equivalence. (arXiv:2010.15600v1 [cs.LO])</title>
4439 <link>http://fr.arxiv.org/abs/2010.15600</link>
4440 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lopez_C/0/1/0/all/0/1">Ciro Ivan Garcia Lopez</a></p>
4441
4442 <p>The study of computability has its origin in Hilbert's conference of 1900,
4443 where an adjacent question, to the ones he asked, is to give a precise
4444 description of the notion of algorithm. In the search for a good definition
4445 arose three independent theories: Turing and the Turing machines, G\"odel and
4446 the recursive functions, Church and the Lambda Calculus.
4447 </p>
4448 <p>Later there were established by Kleene that the classic models of computation
4449 are equivalent. This fact is widely accepted by many textbooks and the proof is
4450 omitted since the proof is tedious and unreadable. We intend to fill this gap
4451 presenting the proof in a modern way, without forgetting the mathematical
4452 details.
4453 </p>
4454 </description>
4455 </item>
4456 <item>
4457 <title>Using a Binary Classification Model to Predict the Likelihood of Enrolment to the Undergraduate Program of a Philippine University. (arXiv:2010.15601v1 [cs.CY])</title>
4458 <link>http://fr.arxiv.org/abs/2010.15601</link>
4459 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Esquivel_D/0/1/0/all/0/1">Dr.Joseph A. Esquivel</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Esquivel_D/0/1/0/all/0/1">Dr. James A. Esquivel</a></p>
4460
4461 <p>With the recent implementation of the K to 12 Program, academic institutions,
4462 specifically, Colleges and Universities in the Philippines have been faced with
4463 difficulties in determining projected freshmen enrollees vis-a-vis
4464 decision-making factors for efficient resource management. Enrollment targets
4465 directly impacts success factors of Higher Education Institutions. This study
4466 covered an analysis of various characteristics of freshmen applicants affecting
4467 their admission status in a Philippine university. A predictive model was
4468 developed using Logistic Regression to evaluate the probability that an
4469 admitted student will pursue to enroll in the Institution or not. The dataset
4470 used was acquired from the University Admissions Office. The office designed an
4471 online application form to capture applicants' details. The online form was
4472 distributed to all student applicants, and most often, students, tend to
4473 provide incomplete information. Despite this fact, student characteristics, as
4474 well as geographic and demographic data based on the students' location are
4475 significant predictors of enrollment decision. The results of the study show
4476 that given limited information about prospective students, Higher Education
4477 Institutions can implement machine learning techniques to supplement management
4478 decisions and provide estimates of class sizes, in this way, it will allow the
4479 institution to optimize the allocation of resources and will have better
4480 control over net tuition revenue.
4481 </p>
4482 </description>
4483 </item>
4484 <item>
4485 <title>Designing learning experiences for online teaching and learning. (arXiv:2010.15602v1 [cs.CY])</title>
4486 <link>http://fr.arxiv.org/abs/2010.15602</link>
4487 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sockalingam_N/0/1/0/all/0/1">Nachamma Sockalingam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_J/0/1/0/all/0/1">Junhua Liu</a></p>
4488
4489 <p>Teaching is about constantly innovating strategies, ways and means to engage
4490 diverse students in active and meaningful learning. In line with this, SUTD
4491 adopts various student-centric teaching and learning teaching methods and
4492 approaches. This means that our graduate/undergraduate instructors have to be
4493 ready to teach using these student student-centric teaching and learning
4494 pedagogies. In this article, I share my experiences of redesigning this
4495 teaching course that is typically conducted face-to-face to a synchronous
4496 online course and also invite one of the participant in this course to reflect
4497 on his experience as a student.
4498 </p>
4499 </description>
4500 </item>
4501 <item>
4502 <title>Suppressing Mislabeled Data via Grouping and Self-Attention. (arXiv:2010.15603v1 [cs.CV])</title>
4503 <link>http://fr.arxiv.org/abs/2010.15603</link>
4504 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_X/0/1/0/all/0/1">Xiaojiang Peng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_K/0/1/0/all/0/1">Kai Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zeng_Z/0/1/0/all/0/1">Zhaoyang Zeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qing Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_J/0/1/0/all/0/1">Jianfei Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qiao_Y/0/1/0/all/0/1">Yu Qiao</a></p>
4505
4506 <p>Deep networks achieve excellent results on large-scale clean data but degrade
4507 significantly when learning from noisy labels. To suppressing the impact of
4508 mislabeled data, this paper proposes a conceptually simple yet efficient
4509 training block, termed as Attentive Feature Mixup (AFM), which allows paying
4510 more attention to clean samples and less to mislabeled ones via sample
4511 interactions in small groups. Specifically, this plug-and-play AFM first
4512 leverages a \textit{group-to-attend} module to construct groups and assign
4513 attention weights for group-wise samples, and then uses a \textit{mixup} module
4514 with the attention weights to interpolate massive noisy-suppressed samples. The
4515 AFM has several appealing benefits for noise-robust deep learning. (i) It does
4516 not rely on any assumptions and extra clean subset. (ii) With massive
4517 interpolations, the ratio of useless samples is reduced dramatically compared
4518 to the original noisy ratio. (iii) \pxj{It jointly optimizes the interpolation
4519 weights with classifiers, suppressing the influence of mislabeled data via low
4520 attention weights. (iv) It partially inherits the vicinal risk minimization of
4521 mixup to alleviate over-fitting while improves it by sampling fewer
4522 feature-target vectors around mislabeled data from the mixup vicinal
4523 distribution.} Extensive experiments demonstrate that AFM yields
4524 state-of-the-art results on two challenging real-world noisy datasets: Food101N
4525 and Clothing1M. The code will be available at
4526 https://github.com/kaiwang960112/AFM.
4527 </p>
4528 </description>
4529 </item>
4530 <item>
4531 <title>Autoregressive Asymmetric Linear Gaussian Hidden Markov Models. (arXiv:2010.15604v1 [cs.LG])</title>
4532 <link>http://fr.arxiv.org/abs/2010.15604</link>
4533 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Puerto_Santana_C/0/1/0/all/0/1">Carlos Puerto-Santana</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Larranaga_P/0/1/0/all/0/1">Pedro Larra&#xf1;aga</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bielza_C/0/1/0/all/0/1">Concha Bielza</a></p>
4534
4535 <p>In a real life process evolving over time, the relationship between its
4536 relevant variables may change. Therefore, it is advantageous to have different
4537 inference models for each state of the process. Asymmetric hidden Markov models
4538 fulfil this dynamical requirement and provide a framework where the trend of
4539 the process can be expressed as a latent variable. In this paper, we modify
4540 these recent asymmetric hidden Markov models to have an asymmetric
4541 autoregressive component, allowing the model to choose the order of
4542 autoregression that maximizes its penalized likelihood for a given training
4543 set. Additionally, we show how inference, hidden states decoding and parameter
4544 learning must be adapted to fit the proposed model. Finally, we run experiments
4545 with synthetic and real data to show the capabilities of this new model.
4546 </p>
4547 </description>
4548 </item>
4549 <item>
4550 <title>Manifold learning-based feature extraction for structural defect reconstruction. (arXiv:2010.15605v1 [cs.CE])</title>
4551 <link>http://fr.arxiv.org/abs/2010.15605</link>
4552 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qi Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_D/0/1/0/all/0/1">Dianzi Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_Z/0/1/0/all/0/1">Zhenghua Qian</a></p>
4553
4554 <p>Data-driven quantitative defect reconstructions using ultrasonic guided waves
4555 has recently demonstrated great potential in the area of non-destructive
4556 testing. In this paper, we develop an efficient deep learning-based defect
4557 reconstruction framework, called NetInv, which recasts the inverse guided wave
4558 scattering problem as a data-driven supervised learning progress that realizes
4559 a mapping between reflection coefficients in wavenumber domain and defect
4560 profiles in the spatial domain. The superiorities of the proposed NetInv over
4561 conventional reconstruction methods for defect reconstruction have been
4562 demonstrated by several examples. Results show that NetInv has the ability to
4563 achieve the higher quality of defect profiles with remarkable efficiency and
4564 provides valuable insight into the development of effective data driven
4565 structural health monitoring and defect reconstruction using machine learning.
4566 </p>
4567 </description>
4568 </item>
4569 <item>
4570 <title>Design and Evaluation of Electric Bus Systems for Metropolitan Cities. (arXiv:2010.15606v1 [cs.CY])</title>
4571 <link>http://fr.arxiv.org/abs/2010.15606</link>
4572 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Menon_U/0/1/0/all/0/1">Unnikrishnan Menon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Panda_D/0/1/0/all/0/1">Divyani Panda</a></p>
4573
4574 <p>Over the past decade, most of the metropolitan cities across the world have
4575 been witnessing a degrading trend in air quality index. Exhaust emission data
4576 observations show that promotion of public transport could be a potential way
4577 out of this gridlock. Due to environmental concerns, numerous public transport
4578 authorities harbor a great interest in introducing zero emission electric
4579 buses. A shift from conventional diesel buses to electric buses comes with
4580 several benefits in terms of reduction in local pollution, noise, and fuel
4581 consumption. This paper proposes the relevant vehicle technologies, powertrain,
4582 and charging systems, which, in combination, provides a comprehensive
4583 methodology to design an Electric Bus that can be deployed in metropolitan
4584 cities to mitigate emission concerns.
4585 </p>
4586 </description>
4587 </item>
4588 <item>
4589 <title>CRICTRS: Embeddings based Statistical and Semi Supervised Cricket Team Recommendation System. (arXiv:2010.15607v1 [cs.CY])</title>
4590 <link>http://fr.arxiv.org/abs/2010.15607</link>
4591 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chhabra_P/0/1/0/all/0/1">Prazwal Chhabra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ali_R/0/1/0/all/0/1">Rizwan Ali</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pudi_V/0/1/0/all/0/1">Vikram Pudi</a></p>
4592
4593 <p>Team Recommendation has always been a challenging aspect in team sports. Such
4594 systems aim to recommend a player combination best suited against the
4595 opposition players, resulting in an optimal outcome. In this paper, we propose
4596 a semi-supervised statistical approach to build a team recommendation system
4597 for cricket by modelling players into embeddings. To build these embeddings, we
4598 design a qualitative and quantitative rating system which considers the
4599 strength of opposition also for evaluating player performance. The embeddings
4600 obtained, describes the strengths and weaknesses of the players based on past
4601 performances of the player. We also embark on a critical aspect of team
4602 composition, which includes the number of batsmen and bowlers in the team. The
4603 team composition changes over time, depending on different factors which are
4604 tough to predict, so we take this input from the user and use the player
4605 embeddings to decide the best possible team combination with the given team
4606 composition.
4607 </p>
4608 </description>
4609 </item>
4610 <item>
4611 <title>An Overview Of 3D Object Detection. (arXiv:2010.15614v1 [cs.CV])</title>
4612 <link>http://fr.arxiv.org/abs/2010.15614</link>
4613 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yilin Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ye_J/0/1/0/all/0/1">Jiayi Ye</a></p>
4614
4615 <p>Point cloud 3D object detection has recently received major attention and
4616 becomes an active research topic in 3D computer vision community. However,
4617 recognizing 3D objects in LiDAR (Light Detection and Ranging) is still a
4618 challenge due to the complexity of point clouds. Objects such as pedestrians,
4619 cyclists, or traffic cones are usually represented by quite sparse points,
4620 which makes the detection quite complex using only point cloud. In this
4621 project, we propose a framework that uses both RGB and point cloud data to
4622 perform multiclass object recognition. We use existing 2D detection models to
4623 localize the region of interest (ROI) on the RGB image, followed by a pixel
4624 mapping strategy in the point cloud, and finally, lift the initial 2D bounding
4625 box to 3D space. We use the recently released nuScenes dataset---a large-scale
4626 dataset contains many data formats---to training and evaluate our proposed
4627 architecture.
4628 </p>
4629 </description>
4630 </item>
4631 <item>
4632 <title>Sampling and Reconstruction of Sparse Signals in Shift-Invariant Spaces: Generalized Shannon's Theorem Meets Compressive Sensing. (arXiv:2010.15618v1 [eess.SP])</title>
4633 <link>http://fr.arxiv.org/abs/2010.15618</link>
4634 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Vlasic_T/0/1/0/all/0/1">Tin Vla&#x161;i&#x107;</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sersic_D/0/1/0/all/0/1">Damir Ser&#x161;i&#x107;</a></p>
4635
4636 <p>This paper introduces a novel framework and corresponding methods for
4637 sampling and reconstruction of sparse signals in shift-invariant (SI) spaces.
4638 We reinterpret the random demodulator, a system that acquires sparse
4639 bandlimited signals, as a system for acquisition of linear combinations of the
4640 samples in the SI setting with the box function as the sampling kernel. The
4641 sparsity assumption is exploited by compressive sensing (CS) framework for
4642 recovery of the SI samples from a reduced set of measurements. The samples are
4643 subsequently filtered by a discrete-time correction filter in order to
4644 reconstruct expansion coefficients of an observed signal. Furthermore, we offer
4645 a generalization of the proposed framework to other sampling kernels that lie
4646 in arbitrary SI spaces. The generalized method embeds the correction filter in
4647 a CS optimization problem which directly reconstructs expansion coefficients of
4648 the signal. Both approaches recast an inherently infinite-dimensional inverse
4649 problem as a finite-dimensional CS problem in an exact way. Finally, we conduct
4650 numerical experiments on signals in B-spline spaces whose expansion
4651 coefficients are assumed to be sparse in a certain transform domain. The
4652 coefficients can be regarded as parametric models of an underlying continuous
4653 signal, obtained from a reduced set of measurements. Such continuous signal
4654 representations are particularly suitable for signal processing without
4655 converting them into samples.
4656 </p>
4657 </description>
4658 </item>
4659 <item>
4660 <title>CAFE: Coarse-to-Fine Neural Symbolic Reasoning for Explainable Recommendation. (arXiv:2010.15620v1 [cs.IR])</title>
4661 <link>http://fr.arxiv.org/abs/2010.15620</link>
4662 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Xian_Y/0/1/0/all/0/1">Yikun Xian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_Z/0/1/0/all/0/1">Zuohui Fu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_H/0/1/0/all/0/1">Handong Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ge_Y/0/1/0/all/0/1">Yingqiang Ge</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_X/0/1/0/all/0/1">Xu Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_Q/0/1/0/all/0/1">Qiaoying Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Geng_S/0/1/0/all/0/1">Shijie Geng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_Z/0/1/0/all/0/1">Zhou Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Melo_G/0/1/0/all/0/1">Gerard de Melo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Muthukrishnan_S/0/1/0/all/0/1">S. Muthukrishnan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yongfeng Zhang</a></p>
4663
4664 <p>Recent research explores incorporating knowledge graphs (KG) into e-commerce
4665 recommender systems, not only to achieve better recommendation performance, but
4666 more importantly to generate explanations of why particular decisions are made.
4667 This can be achieved by explicit KG reasoning, where a model starts from a user
4668 node, sequentially determines the next step, and walks towards an item node of
4669 potential interest to the user. However, this is challenging due to the huge
4670 search space, unknown destination, and sparse signals over the KG, so
4671 informative and effective guidance is needed to achieve a satisfactory
4672 recommendation quality. To this end, we propose a CoArse-to-FinE neural
4673 symbolic reasoning approach (CAFE). It first generates user profiles as coarse
4674 sketches of user behaviors, which subsequently guide a path-finding process to
4675 derive reasoning paths for recommendations as fine-grained predictions. User
4676 profiles can capture prominent user behaviors from the history, and provide
4677 valuable signals about which kinds of path patterns are more likely to lead to
4678 potential items of interest for the user. To better exploit the user profiles,
4679 an improved path-finding algorithm called Profile-guided Path Reasoning (PPR)
4680 is also developed, which leverages an inventory of neural symbolic reasoning
4681 modules to effectively and efficiently find a batch of paths over a large-scale
4682 KG. We extensively experiment on four real-world benchmarks and observe
4683 substantial gains in the recommendation performance compared with
4684 state-of-the-art methods.
4685 </p>
4686 </description>
4687 </item>
4688 <item>
4689 <title>Low-Variance Policy Gradient Estimation with World Models. (arXiv:2010.15622v1 [stat.ML])</title>
4690 <link>http://fr.arxiv.org/abs/2010.15622</link>
4691 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Nauman_M/0/1/0/all/0/1">Michal Nauman</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Hengst_F/0/1/0/all/0/1">Floris Den Hengst</a></p>
4692
4693 <p>In this paper, we propose World Model Policy Gradient (WMPG), an approach to
4694 reduce the variance of policy gradient estimates using learned world models
4695 (WM's). In WMPG, a WM is trained online and used to imagine trajectories. The
4696 imagined trajectories are used in two ways. Firstly, to calculate a
4697 without-replacement estimator of the policy gradient. Secondly, the return of
4698 the imagined trajectories is used as an informed baseline. We compare the
4699 proposed approach with AC and MAC on a set of environments of increasing
4700 complexity (CartPole, LunarLander and Pong) and find that WMPG has better
4701 sample efficiency. Based on these results, we conclude that WMPG can yield
4702 increased sample efficiency in cases where a robust latent representation of
4703 the environment can be learned.
4704 </p>
4705 </description>
4706 </item>
4707 <item>
4708 <title>Fast Minimal Presentations of Bi-graded Persistence Modules. (arXiv:2010.15623v1 [math.AT])</title>
4709 <link>http://fr.arxiv.org/abs/2010.15623</link>
4710 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Kerber_M/0/1/0/all/0/1">Michael Kerber</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Rolle_A/0/1/0/all/0/1">Alexander Rolle</a></p>
4711
4712 <p>Multi-parameter persistent homology is a recent branch of topological data
4713 analysis. In this area, data sets are investigated through the lens of homology
4714 with respect to two or more scale parameters. The high computational cost of
4715 many algorithms calls for a preprocessing step to reduce the input size. In
4716 general, a minimal presentation is the smallest possible representation of a
4717 persistence module. Lesnick and Wright proposed recently an algorithm (the
4718 LW-algorithm) for computing minimal presentations based on matrix reduction. In
4719 this work, we propose, implement and benchmark several improvements over the
4720 LW-algorithm. Most notably, we propose the use of priority queues to avoid
4721 extensive scanning of the matrix columns, which constitutes the computational
4722 bottleneck in the LW-algorithm, and we combine their algorithm with ideas from
4723 the multi-parameter chunk algorithm by Fugacci and Kerber. Our extensive
4724 experiments show that our algorithm outperforms the LW-algorithm and computes
4725 the minimal presentation for data sets with millions of simplices within a few
4726 seconds. Our software is publicly available.
4727 </p>
4728 </description>
4729 </item>
4730 <item>
4731 <title>Abstract Value Iteration for Hierarchical Reinforcement Learning. (arXiv:2010.15638v1 [cs.LG])</title>
4732 <link>http://fr.arxiv.org/abs/2010.15638</link>
4733 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jothimurugan_K/0/1/0/all/0/1">Kishor Jothimurugan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bastani_O/0/1/0/all/0/1">Osbert Bastani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Alur_R/0/1/0/all/0/1">Rajeev Alur</a></p>
4734
4735 <p>We propose a novel hierarchical reinforcement learning framework for control
4736 with continuous state and action spaces. In our framework, the user specifies
4737 subgoal regions which are subsets of states; then, we (i) learn options that
4738 serve as transitions between these subgoal regions, and (ii) construct a
4739 high-level plan in the resulting abstract decision process (ADP). A key
4740 challenge is that the ADP may not be Markov, which we address by proposing two
4741 algorithms for planning in the ADP. Our first algorithm is conservative,
4742 allowing us to prove theoretical guarantees on its performance, which help
4743 inform the design of subgoal regions. Our second algorithm is a practical one
4744 that interweaves planning at the abstract level and learning at the concrete
4745 level. In our experiments, we demonstrate that our approach outperforms
4746 state-of-the-art hierarchical reinforcement learning algorithms on several
4747 challenging benchmarks.
4748 </p>
4749 </description>
4750 </item>
4751 <item>
4752 <title>Teaching a GAN What Not to Learn. (arXiv:2010.15639v1 [stat.ML])</title>
4753 <link>http://fr.arxiv.org/abs/2010.15639</link>
4754 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Asokan_S/0/1/0/all/0/1">Siddarth Asokan</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Seelamantula_C/0/1/0/all/0/1">Chandra Sekhar Seelamantula</a></p>
4755
4756 <p>Generative adversarial networks (GANs) were originally envisioned as
4757 unsupervised generative models that learn to follow a target distribution.
4758 Variants such as conditional GANs, auxiliary-classifier GANs (ACGANs) project
4759 GANs on to supervised and semi-supervised learning frameworks by providing
4760 labelled data and using multi-class discriminators. In this paper, we approach
4761 the supervised GAN problem from a different perspective, one that is motivated
4762 by the philosophy of the famous Persian poet Rumi who said, "The art of knowing
4763 is knowing what to ignore." In the GAN framework, we not only provide the GAN
4764 positive data that it must learn to model, but also present it with so-called
4765 negative samples that it must learn to avoid - we call this "The Rumi
4766 Framework." This formulation allows the discriminator to represent the
4767 underlying target distribution better by learning to penalize generated samples
4768 that are undesirable - we show that this capability accelerates the learning
4769 process of the generator. We present a reformulation of the standard GAN (SGAN)
4770 and least-squares GAN (LSGAN) within the Rumi setting. The advantage of the
4771 reformulation is demonstrated by means of experiments conducted on MNIST,
4772 Fashion MNIST, CelebA, and CIFAR-10 datasets. Finally, we consider an
4773 application of the proposed formulation to address the important problem of
4774 learning an under-represented class in an unbalanced dataset. The Rumi approach
4775 results in substantially lower FID scores than the standard GAN frameworks
4776 while possessing better generalization capability.
4777 </p>
4778 </description>
4779 </item>
4780 <item>
4781 <title>Free-Form Image Inpainting via Contrastive Attention Network. (arXiv:2010.15643v1 [cs.CV])</title>
4782 <link>http://fr.arxiv.org/abs/2010.15643</link>
4783 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_X/0/1/0/all/0/1">Xin Ma</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_X/0/1/0/all/0/1">Xiaoqiang Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_H/0/1/0/all/0/1">Huaibo Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chai_Z/0/1/0/all/0/1">Zhenhua Chai</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_X/0/1/0/all/0/1">Xiaolin Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_R/0/1/0/all/0/1">Ran He</a></p>
4784
4785 <p>Most deep learning based image inpainting approaches adopt autoencoder or its
4786 variants to fill missing regions in images. Encoders are usually utilized to
4787 learn powerful representational spaces, which are important for dealing with
4788 sophisticated learning tasks. Specifically, in image inpainting tasks, masks
4789 with any shapes can appear anywhere in images (i.e., free-form masks) which
4790 form complex patterns. It is difficult for encoders to capture such powerful
4791 representations under this complex situation. To tackle this problem, we
4792 propose a self-supervised Siamese inference network to improve the robustness
4793 and generalization. It can encode contextual semantics from full resolution
4794 images and obtain more discriminative representations. we further propose a
4795 multi-scale decoder with a novel dual attention fusion module (DAF), which can
4796 combine both the restored and known regions in a smooth way. This multi-scale
4797 architecture is beneficial for decoding discriminative representations learned
4798 by encoders into images layer by layer. In this way, unknown regions will be
4799 filled naturally from outside to inside. Qualitative and quantitative
4800 experiments on multiple datasets, including facial and natural datasets (i.e.,
4801 Celeb-HQ, Pairs Street View, Places2 and ImageNet), demonstrate that our
4802 proposed method outperforms state-of-the-art methods in generating high-quality
4803 inpainting results.
4804 </p>
4805 </description>
4806 </item>
4807 <item>
4808 <title>Brain Tumor Segmentation Network Using Attention-based Fusion and Spatial Relationship Constraint. (arXiv:2010.15647v1 [eess.IV])</title>
4809 <link>http://fr.arxiv.org/abs/2010.15647</link>
4810 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_C/0/1/0/all/0/1">Chenyu Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ding_W/0/1/0/all/0/1">Wangbin Ding</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_L/0/1/0/all/0/1">Lei Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_Z/0/1/0/all/0/1">Zhen Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pei_C/0/1/0/all/0/1">Chenhao Pei</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Huang_L/0/1/0/all/0/1">Liqin Huang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhuang_X/0/1/0/all/0/1">Xiahai Zhuang</a></p>
4811
4812 <p>Delineating the brain tumor from magnetic resonance (MR) images is critical
4813 for the treatment of gliomas. However, automatic delineation is challenging due
4814 to the complex appearance and ambiguous outlines of tumors. Considering that
4815 multi-modal MR images can reflect different tumor biological properties, we
4816 develop a novel multi-modal tumor segmentation network (MMTSN) to robustly
4817 segment brain tumors based on multi-modal MR images. The MMTSN is composed of
4818 three sub-branches and a main branch. Specifically, the sub-branches are used
4819 to capture different tumor features from multi-modal images, while in the main
4820 branch, we design a spatial-channel fusion block (SCFB) to effectively
4821 aggregate multi-modal features. Additionally, inspired by the fact that the
4822 spatial relationship between sub-regions of tumor is relatively fixed, e.g.,
4823 the enhancing tumor is always in the tumor core, we propose a spatial loss to
4824 constrain the relationship between different sub-regions of tumor. We evaluate
4825 our method on the test set of multi-modal brain tumor segmentation challenge
4826 2020 (BraTs2020). The method achieves 0.8764, 0.8243 and 0.773 dice score for
4827 whole tumor, tumor core and enhancing tumor, respectively.
4828 </p>
4829 </description>
4830 </item>
4831 <item>
4832 <title>Reliable Graph Neural Networks via Robust Aggregation. (arXiv:2010.15651v1 [cs.LG])</title>
4833 <link>http://fr.arxiv.org/abs/2010.15651</link>
4834 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Geisler_S/0/1/0/all/0/1">Simon Geisler</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zugner_D/0/1/0/all/0/1">Daniel Z&#xfc;gner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gunnemann_S/0/1/0/all/0/1">Stephan G&#xfc;nnemann</a></p>
4835
4836 <p>Perturbations targeting the graph structure have proven to be extremely
4837 effective in reducing the performance of Graph Neural Networks (GNNs), and
4838 traditional defenses such as adversarial training do not seem to be able to
4839 improve robustness. This work is motivated by the observation that
4840 adversarially injected edges effectively can be viewed as additional samples to
4841 a node's neighborhood aggregation function, which results in distorted
4842 aggregations accumulating over the layers. Conventional GNN aggregation
4843 functions, such as a sum or mean, can be distorted arbitrarily by a single
4844 outlier. We propose a robust aggregation function motivated by the field of
4845 robust statistics. Our approach exhibits the largest possible breakdown point
4846 of 0.5, which means that the bias of the aggregation is bounded as long as the
4847 fraction of adversarial edges of a node is less than 50\%. Our novel
4848 aggregation function, Soft Medoid, is a fully differentiable generalization of
4849 the Medoid and therefore lends itself well for end-to-end deep learning.
4850 Equipping a GNN with our aggregation improves the robustness with respect to
4851 structure perturbations on Cora ML by a factor of 3 (and 5.5 on Citeseer) and
4852 by a factor of 8 for low-degree nodes.
4853 </p>
4854 </description>
4855 </item>
4856 <item>
4857 <title>Semi-Supervised Speech Recognition via Graph-based Temporal Classification. (arXiv:2010.15653v1 [cs.LG])</title>
4858 <link>http://fr.arxiv.org/abs/2010.15653</link>
4859 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Moritz_N/0/1/0/all/0/1">Niko Moritz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hori_T/0/1/0/all/0/1">Takaaki Hori</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Roux_J/0/1/0/all/0/1">Jonathan Le Roux</a></p>
4860
4861 <p>Semi-supervised learning has demonstrated promising results in automatic
4862 speech recognition (ASR) by self-training using a seed ASR model with
4863 pseudo-labels generated for unlabeled data. The effectiveness of this approach
4864 largely relies on the pseudo-label accuracy, for which typically only the
4865 1-best ASR hypothesis is used. However, alternative ASR hypotheses of an N-best
4866 list can provide more accurate labels for an unlabeled speech utterance and
4867 also reflect uncertainties of the seed ASR model. In this paper, we propose a
4868 generalized form of the connectionist temporal classification (CTC) objective
4869 that accepts a graph representation of the training targets. The newly proposed
4870 graph-based temporal classification (GTC) objective is applied for
4871 self-training with WFST-based supervision, which is generated from an N-best
4872 list of pseudo-labels. In this setup, GTC is used to learn not only a temporal
4873 alignment, similarly to CTC, but also a label alignment to obtain the optimal
4874 pseudo-label sequence from the weighted graph. Results show that this approach
4875 can effectively exploit an N-best list of pseudo-labels with associated scores,
4876 outperforming standard pseudo-labeling by a large margin, with ASR results
4877 close to an oracle experiment in which the best hypotheses of the N-best lists
4878 are selected manually.
4879 </p>
4880 </description>
4881 </item>
4882 <item>
4883 <title>Identification of complex mixtures for Raman spectroscopy using a novel scheme based on a new multi-label deep neural network. (arXiv:2010.15654v1 [eess.SP])</title>
4884 <link>http://fr.arxiv.org/abs/2010.15654</link>
4885 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Pan_L/0/1/0/all/0/1">Liangrui Pan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pipitsunthonsan_P/0/1/0/all/0/1">Pronthep Pipitsunthonsan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Daengngam_C/0/1/0/all/0/1">Chalongrat Daengngam</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chongcheawchamnan_M/0/1/0/all/0/1">Mitchai Chongcheawchamnan</a></p>
4886
4887 <p>With noisy environment caused by fluoresence and additive white noise as well
4888 as complicated spectrum fingerprints, the identification of complex mixture
4889 materials remains a major challenge in Raman spectroscopy application. In this
4890 paper, we propose a new scheme based on a constant wavelet transform (CWT) and
4891 a deep network for classifying complex mixture. The scheme first transforms the
4892 noisy Raman spectrum to a two-dimensional scale map using CWT. A multi-label
4893 deep neural network model (MDNN) is then applied for classifying material. The
4894 proposed model accelerates the feature extraction and expands the feature graph
4895 using the global averaging pooling layer. The Sigmoid function is implemented
4896 in the last layer of the model. The MDNN model was trained, validated and
4897 tested with data collected from the samples prepared from substances in palm
4898 oil. During training and validating process, data augmentation is applied to
4899 overcome the imbalance of data and enrich the diversity of Raman spectra. From
4900 the test results, it is found that the MDNN model outperforms previously
4901 proposed deep neural network models in terms of Hamming loss, one error,
4902 coverage, ranking loss, average precision, F1 macro averaging and F1 micro
4903 averaging, respectively. The average detection time obtained from our model is
4904 5.31 s, which is much faster than the detection time of the previously proposed
4905 models.
4906 </p>
4907 </description>
4908 </item>
4909 <item>
4910 <title>Generalization bounds for deep thresholding networks. (arXiv:2010.15658v1 [math.ST])</title>
4911 <link>http://fr.arxiv.org/abs/2010.15658</link>
4912 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Behboodi_A/0/1/0/all/0/1">Arash Behboodi</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Rauhut_H/0/1/0/all/0/1">Holger Rauhut</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Schnoor_E/0/1/0/all/0/1">Ekkehard Schnoor</a></p>
4913
4914 <p>We consider compressive sensing in the scenario where the sparsity basis
4915 (dictionary) is not known in advance, but needs to be learned from examples.
4916 Motivated by the well-known iterative soft thresholding algorithm for the
4917 reconstruction, we define deep networks parametrized by the dictionary, which
4918 we call deep thresholding networks. Based on training samples, we aim at
4919 learning the optimal sparsifying dictionary and thereby the optimal network
4920 that reconstructs signals from their low-dimensional linear measurements. The
4921 dictionary learning is performed via minimizing the empirical risk. We derive
4922 generalization bounds by analyzing the Rademacher complexity of hypothesis
4923 classes consisting of such deep networks. We obtain estimates of the sample
4924 complexity that depend only linearly on the dimensions and on the depth.
4925 </p>
4926 </description>
4927 </item>
4928 <item>
4929 <title>Independence Tests Without Ground Truth for Noisy Learners. (arXiv:2010.15662v1 [stat.ML])</title>
4930 <link>http://fr.arxiv.org/abs/2010.15662</link>
4931 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Corrada_Emmanuel_A/0/1/0/all/0/1">Andr&#xe9;s Corrada-Emmanuel</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Pantridge_E/0/1/0/all/0/1">Edward Pantridge</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Zahrebelski_E/0/1/0/all/0/1">Eddie Zahrebelski</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Chaganti_A/0/1/0/all/0/1">Aditya Chaganti</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Simeonov_S/0/1/0/all/0/1">Simeon Simeonov</a></p>
4932
4933 <p>Exact ground truth invariant polynomial systems can be written for
4934 arbitrarily correlated binary classifiers. Their solutions give estimates for
4935 sample statistics that require knowledge of the ground truth of the correct
4936 labels in the sample. Of these polynomial systems, only a few have been solved
4937 in closed form. Here we discuss the exact solution for independent binary
4938 classifiers - resolving an outstanding problem that has been presented at this
4939 conference and others. Its practical applicability is hampered by its sole
4940 remaining assumption - the classifiers need to be independent in their sample
4941 errors. We discuss how to use the closed form solution to create a
4942 self-consistent test that can validate the independence assumption itself
4943 absent the correct labels ground truth. It can be cast as an algebraic geometry
4944 conjecture for binary classifiers that remains unsolved. A similar conjecture
4945 for the ground truth invariant algebraic system for scalar regressors is
4946 solvable, and we present the solution here. We also discuss experiments on the
4947 Penn ML Benchmark classification tasks that provide further evidence that the
4948 conjecture may be true for the polynomial system of binary classifiers.
4949 </p>
4950 </description>
4951 </item>
4952 <item>
4953 <title>Machine Ethics and Automated Vehicles. (arXiv:2010.15665v1 [cs.CY])</title>
4954 <link>http://fr.arxiv.org/abs/2010.15665</link>
4955 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goodall_N/0/1/0/all/0/1">Noah J. Goodall</a></p>
4956
4957 <p>Road vehicle travel at a reasonable speed involves some risk, even when using
4958 computer-controlled driving with failure-free hardware and perfect sensing. A
4959 fully-automated vehicle must continuously decide how to allocate this risk
4960 without a human driver's oversight. These are ethical decisions, particularly
4961 in instances where an automated vehicle cannot avoid crashing. In this chapter,
4962 I introduce the concept of moral behavior for an automated vehicle, argue the
4963 need for research in this area through responses to anticipated critiques, and
4964 discuss relevant applications from machine ethics and moral modeling research.
4965 </p>
4966 </description>
4967 </item>
4968 <item>
4969 <title>PeopleXploit -- A hybrid tool to collect public data. (arXiv:2010.15668v1 [cs.CY])</title>
4970 <link>http://fr.arxiv.org/abs/2010.15668</link>
4971 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+V_A/0/1/0/all/0/1">Arjun Anand V</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+K_B/0/1/0/all/0/1">Buvanasri A K</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+R_M/0/1/0/all/0/1">Meenakshi R</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+S_D/0/1/0/all/0/1">Dr. Karthika S</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohan_A/0/1/0/all/0/1">Ashok Kumar Mohan</a></p>
4972
4973 <p>This paper introduces the concept of Open Source Intelligence (OSINT) as an
4974 important application in intelligent profiling of individuals. With a variety
4975 of tools available, significant data shall be obtained on an individual as a
4976 consequence of analyzing his/her internet presence but all of this comes at the
4977 cost of low relevance. To increase the relevance score in profiling,
4978 PeopleXploit is being introduced. PeopleXploit is a hybrid tool which helps in
4979 collecting the publicly available information that is reliable and relevant to
4980 the given input. This tool is used to track and trace the given target with
4981 their digital footprints like Name, Email, Phone Number, User IDs etc. and the
4982 tool will scan &amp; search other associated data from public available records
4983 from the internet and create a summary report against the target. PeopleXploit
4984 profiles a person using authorship analysis and finds the best matching guess.
4985 Also, the type of analysis performed (professional/matrimonial/criminal entity)
4986 varies with the requirement of the user.
4987 </p>
4988 </description>
4989 </item>
4990 <item>
4991 <title>Using Twitter to Analyze Political Polarization During National Crises. (arXiv:2010.15669v1 [cs.CY])</title>
4992 <link>http://fr.arxiv.org/abs/2010.15669</link>
4993 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shisode_P/0/1/0/all/0/1">Parth Shisode</a></p>
4994
4995 <p>Democrats and Republicans have seemed to grow apart in the past three
4996 decades. Since the United States as we know it today is undeniably bipartisan,
4997 this phenomenon would not appear as a surprise to most. However, there are
4998 triggers which can cause spikes in disagreements between Democrats and
4999 Republicans at a higher rate than how the two parties have been growing apart
5000 gradually over time. This study has analyzed the idea that national events
5001 which generally are detrimental to all individuals can be one of those
5002 triggers. By testing polarization before and after three events (Hurricane
5003 Sandy [2012], N. Korea Missile Test Surge [2019], COVID-19 [2020]) using
5004 Twitter data, we show that a measurable spike in polarization occurs between
5005 the Democrat and Republican party. In order to measure polarization, sentiments
5006 of Twitter users aligned to the Democrat and Republican parties are compared on
5007 identical entities (events, people, locations, etc.). Using hundreds of
5008 thousands of data samples, a 2.8% increase in polarization was measured during
5009 times of crisis compared to times where no crises were occurring. Regardless of
5010 the reasoning that the gap between political parties can increase so much
5011 during times of suffering and stress, it is definitely alarming to see that
5012 among other aspects of life, the partisan gap worsens during detrimental
5013 national events.
5014 </p>
5015 </description>
5016 </item>
5017 <item>
5018 <title>Detecting Individuals with Depressive Disorder fromPersonal Google Search and YouTube History Logs. (arXiv:2010.15670v1 [cs.CY])</title>
5019 <link>http://fr.arxiv.org/abs/2010.15670</link>
5020 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_B/0/1/0/all/0/1">Boyu Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zaman_A/0/1/0/all/0/1">Anis Zaman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Acharyya_R/0/1/0/all/0/1">Rupam Acharyya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hoque_E/0/1/0/all/0/1">Ehsan Hoque</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Silenzio_V/0/1/0/all/0/1">Vincent Silenzio</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kautz_H/0/1/0/all/0/1">Henry Kautz</a></p>
5021
5022 <p>Depressive disorder is one of the most prevalent mental illnesses among the
5023 global population. However, traditional screening methods require exacting
5024 in-person interviews and may fail to provide immediate interventions. In this
5025 work, we leverage ubiquitous personal longitudinal Google Search and YouTube
5026 engagement logs to detect individuals with depressive disorder. We collected
5027 Google Search and YouTube history data and clinical depression evaluation
5028 results from $212$ participants ($99$ of them suffered from moderate to severe
5029 depressions). We then propose a personalized framework for classifying
5030 individuals with and without depression symptoms based on mutual-exciting point
5031 process that captures both the temporal and semantic aspects of online
5032 activities. Our best model achieved an average F1 score of $0.77 \pm 0.04$ and
5033 an AUC ROC of $0.81 \pm 0.02$.
5034 </p>
5035 </description>
5036 </item>
5037 <item>
5038 <title>Computing Crisp Bisimulations for Fuzzy Structures. (arXiv:2010.15671v1 [cs.DS])</title>
5039 <link>http://fr.arxiv.org/abs/2010.15671</link>
5040 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_L/0/1/0/all/0/1">Linh Anh Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_D/0/1/0/all/0/1">Dat Xuan Tran</a></p>
5041
5042 <p>Fuzzy structures such as fuzzy automata, fuzzy transition systems, weighted
5043 social networks and fuzzy interpretations in fuzzy description logics have been
5044 widely studied. For such structures, bisimulation is a natural notion for
5045 characterizing indiscernibility between states or individuals. There are two
5046 kinds of bisimulations for fuzzy structures: crisp bisimulations and fuzzy
5047 bisimulations. While the latter fits to the fuzzy paradigm, the former has also
5048 attracted attention due to the application of crisp equivalence relations, for
5049 example, in minimizing structures. Bisimulations can be formulated for fuzzy
5050 labeled graphs and then adapted to other fuzzy structures. In this article, we
5051 present an efficient algorithm for computing the partition corresponding to the
5052 largest crisp bisimulation of a given finite fuzzy labeled graph. Its
5053 complexity is of order $O((m\log{l} + n)\log{n})$, where $n$, $m$ and $l$ are
5054 the number of vertices, the number of nonzero edges and the number of different
5055 fuzzy degrees of edges of the input graph, respectively. We also study a
5056 similar problem for the setting with counting successors, which corresponds to
5057 the case with qualified number restrictions in description logics and graded
5058 modalities in modal logics. In particular, we provide an efficient algorithm
5059 with the complexity $O((m\log{m} + n)\log{n})$ for the considered problem in
5060 that setting.
5061 </p>
5062 </description>
5063 </item>
5064 <item>
5065 <title>FD Cell-Free mMIMO: Analysis and Optimization. (arXiv:2010.15672v1 [eess.SP])</title>
5066 <link>http://fr.arxiv.org/abs/2010.15672</link>
5067 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Datta_S/0/1/0/all/0/1">Soumyadeep Datta</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sharma_E/0/1/0/all/0/1">Ekant Sharma</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Amudala_D/0/1/0/all/0/1">Dheeraj Naidu Amudala</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Budhiraja_R/0/1/0/all/0/1">Rohit Budhiraja</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Panwar_S/0/1/0/all/0/1">Shivendra S. Panwar</a></p>
5068
5069 <p>We consider a full-duplex cell-free massive multiple-input-multiple-output
5070 system with limited capacity fronthaul links. We derive its downlink/uplink
5071 closed-form spectral efficiency (SE) lower bounds with maximum-ratio
5072 transmission/maximum-ratio combining and optimal uniform quantization. To
5073 reduce carbon footprint, this paper maximizes the non-convex weighted sum
5074 energy efficiency (WSEE) via downlink and uplink power control, and successive
5075 convex approximation framework. We show that with low fronthaul capacity, the
5076 system requires a higher number of fronthaul quantization bits to achieve high
5077 SE and WSEE. For high fronthaul capacity, higher number of bits, however,
5078 achieves high SE but a reduced WSEE.
5079 </p>
5080 </description>
5081 </item>
5082 <item>
5083 <title>Machine Learning Based Demand Modelling for On-Demand Transit Services: A Case Study of Belleville, Ontario. (arXiv:2010.15673v1 [cs.CY])</title>
5084 <link>http://fr.arxiv.org/abs/2010.15673</link>
5085 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Alsaleh_N/0/1/0/all/0/1">Nael Alsaleh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farooq_B/0/1/0/all/0/1">Bilal Farooq</a></p>
5086
5087 <p>The use of mobile applications apps and GPS service on smartphones for
5088 transportation management applications has enabled the new "on-demand mobility"
5089 service, where the transportation supply is following the users' schedule and
5090 routes. In September 2018, the City of Belleville in Canada and Pantonium
5091 operationalized the same idea, but for the public transit service in the city
5092 to develop an on-demand transit (ODT) service. An existing fixed route (RT 11)
5093 public transit service was converted into an on-demand service during the night
5094 as a pilot project to maintain a higher demand sensitivity and highest
5095 operation cost efficiency per trip. In this study, Random Forest (RF), Bagging,
5096 Artificial Neural Network (ANN), and Deep Neural Network (DNN) machine learning
5097 algorithms were adopted to develop a pickup demand model (trip generation) and
5098 a trip demand model (trip distribution model) for Belleville ODT service based
5099 on the dissemination areas' demographic characteristics and the existing trip
5100 characteristics. The developed models aim to explain the demand behavior,
5101 investigate the main factors affecting the trip pattern and their relative
5102 importance, and to predict the number of generated trips from any dissemination
5103 area as well as between any two dissemination areas. The results indicate that
5104 the developed models can predict 63% and 70% of the pickup and trip demand
5105 levels, respectively. Both models are most affected by the month of the year
5106 and the day of the week variables. In addition, the population density has a
5107 higher impact on the ODT service pickup demand levels than the other
5108 demographic characteristics followed by the working age percentages and median
5109 income characteristics. Whereas, the distribution of the trips depends on the
5110 demographic characteristics of the destination area more than the origin area.
5111 </p>
5112 </description>
5113 </item>
5114 <item>
5115 <title>Analyzing Societal Impact of COVID-19: A Study During the Early Days of the Pandemic. (arXiv:2010.15674v1 [cs.SI])</title>
5116 <link>http://fr.arxiv.org/abs/2010.15674</link>
5117 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shanthakumar_S/0/1/0/all/0/1">Swaroop Gowdra Shanthakumar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Seetharam_A/0/1/0/all/0/1">Anand Seetharam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramesh_A/0/1/0/all/0/1">Arti Ramesh</a></p>
5118
5119 <p>In this paper, we collect and study Twitter communications to understand the
5120 societal impact of COVID-19 in the United States during the early days of the
5121 pandemic. With infections soaring rapidly, users took to Twitter asking people
5122 to self isolate and quarantine themselves. Users also demanded closure of
5123 schools, bars, and restaurants as well as lockdown of cities and states. We
5124 methodically collect tweets by identifying and tracking trending COVID-related
5125 hashtags. We first manually group the hashtags into six main categories,
5126 namely, 1) General COVID, 2) Quarantine, 3) Panic Buying, 4) School Closures,
5127 5) Lockdowns, and 6) Frustration and Hope}, and study the temporal evolution of
5128 tweets in these hashtags. We conduct a linguistic analysis of words common to
5129 all hashtag groups and specific to each hashtag group and identify the chief
5130 concerns of people as the pandemic gripped the nation (e.g., exploring bidets
5131 as an alternative to toilet paper). We conduct sentiment analysis and our
5132 investigation reveals that people reacted positively to school closures and
5133 negatively to the lack of availability of essential goods due to panic buying.
5134 We adopt a state-of-the-art semantic role labeling approach to identify the
5135 action words and then leverage a LSTM-based dependency parsing model to analyze
5136 the context of action words (e.g., verb deal is accompanied by nouns such as
5137 anxiety, stress, and crisis). Finally, we develop a scalable seeded topic
5138 modeling approach to automatically categorize and isolate tweets into hashtag
5139 groups and experimentally validate that our topic model provides a grouping
5140 similar to our manual grouping. Our study presents a systematic way to
5141 construct an aggregated picture of peoples' response to the pandemic and lays
5142 the groundwork for future fine-grained linguistic and behavioral analysis.
5143 </p>
5144 </description>
5145 </item>
5146 <item>
5147 <title>Deep DA for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labeled Videos. (arXiv:2010.15675v1 [cs.CV])</title>
5148 <link>http://fr.arxiv.org/abs/2010.15675</link>
5149 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+R_G/0/1/0/all/0/1">Gnana Praveen R</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Granger_E/0/1/0/all/0/1">Eric Granger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cardinal_P/0/1/0/all/0/1">Patrick Cardinal</a></p>
5150
5151 <p>Automatic estimation of pain intensity from facial expressions in videos has
5152 an immense potential in health care applications. However, domain adaptation
5153 (DA) is needed to alleviate the problem of domain shifts that typically occurs
5154 between video data captured in source and target do-mains. Given the laborious
5155 task of collecting and annotating videos, and the subjective bias due to
5156 ambiguity among adjacent intensity levels, weakly-supervised learning (WSL)is
5157 gaining attention in such applications. Yet, most state-of-the-art WSL models
5158 are typically formulated as regression problems, and do not leverage the
5159 ordinal relation between intensity levels, nor the temporal coherence of
5160 multiple consecutive frames. This paper introduces a new deep learn-ing model
5161 for weakly-supervised DA with ordinal regression(WSDA-OR), where videos in
5162 target domain have coarse la-bels provided on a periodic basis. The WSDA-OR
5163 model enforces ordinal relationships among the intensity levels as-signed to
5164 the target sequences, and associates multiple relevant frames to sequence-level
5165 labels (instead of a single frame). In particular, it learns discriminant and
5166 domain-invariant feature representations by integrating multiple in-stance
5167 learning with deep adversarial DA, where soft Gaussian labels are used to
5168 efficiently represent the weak ordinal sequence-level labels from the target
5169 domain. The proposed approach was validated on the RECOLA video dataset as
5170 fully-labeled source domain, and UNBC-McMaster video data as weakly-labeled
5171 target domain. We have also validated WSDA-OR on BIOVID and Fatigue (private)
5172 datasets for sequence level estimation. Experimental results indicate that our
5173 approach can provide a significant improvement over the state-of-the-art
5174 models, allowing to achieve a greater localization accuracy.
5175 </p>
5176 </description>
5177 </item>
5178 <item>
5179 <title>Optimization Fabrics for Behavioral Design. (arXiv:2010.15676v1 [cs.RO])</title>
5180 <link>http://fr.arxiv.org/abs/2010.15676</link>
5181 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ratliff_N/0/1/0/all/0/1">Nathan D. Ratliff</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wyk_K/0/1/0/all/0/1">Karl Van Wyk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_M/0/1/0/all/0/1">Mandy Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_A/0/1/0/all/0/1">Anqi Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rana_A/0/1/0/all/0/1">Asif Muhammad Rana</a></p>
5182
5183 <p>Second-order differential equations define smooth system behavior. In
5184 general, there is no guarantee that a system will behave well when forced by a
5185 potential function, but in some cases they do and may exhibit smooth
5186 optimization properties such as convergence to a local minimum of the
5187 potential. Such a property is desirable in system design since it is inherently
5188 linked to asymptotic stability. This paper presents a comprehensive theory of
5189 optimization fabrics which are second-order differential equations that encode
5190 nominal behaviors on a space and are guaranteed to optimize when forced away
5191 from those nominal trajectories by a potential function. Optimization fabrics,
5192 or fabrics for short, can encode commonalities among optimization problems that
5193 reflect the structure of the space itself, enabling smooth optimization
5194 processes to intelligently navigate each problem even when the potential
5195 function is simple and relatively naive. Importantly, optimization over a
5196 fabric is asymptotically stable, so optimization fabrics constitute a building
5197 block for provably stable system design.
5198 </p>
5199 </description>
5200 </item>
5201 <item>
5202 <title>On the Failure of the Smart Approach of the GPT Cryptosystem. (arXiv:2010.15678v1 [cs.CR])</title>
5203 <link>http://fr.arxiv.org/abs/2010.15678</link>
5204 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kalachi_H/0/1/0/all/0/1">Herve Tale Kalachi</a></p>
5205
5206 <p>This paper describes a new algorithm for breaking the smart approach of the
5207 GPT cryptosystem. We show that by puncturing the public code several times on
5208 specific positions, we get a public code on which applying the Frobenius
5209 operator appropriately allows to build an alternative secret key.
5210 </p>
5211 </description>
5212 </item>
5213 <item>
5214 <title>Lie-Trotter Splitting for the Nonlinear Stochastic Manakov System. (arXiv:2010.15679v1 [math.AP])</title>
5215 <link>http://fr.arxiv.org/abs/2010.15679</link>
5216 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Berg_A/0/1/0/all/0/1">Andr&#xe9; Berg</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Cohen_D/0/1/0/all/0/1">David Cohen</a> (Chalmers), <a href="http://fr.arxiv.org/find/math/1/au:+Dujardin_G/0/1/0/all/0/1">Guillaume Dujardin</a> (LPP)</p>
5217
5218 <p>This article analyses the convergence of the Lie-Trotter splitting scheme for
5219 the stochastic Manakov equation, a system arising in the study of pulse
5220 propagation in randomly birefringent optical fibers. First, we prove that the
5221 strong order of the numerical approximation is 1/2 if the nonlinear term in the
5222 system is globally Lipschitz. Then, we show that the splitting scheme has
5223 convergence order 1/2 in probability and almost sure order 1/2- in the case of
5224 a cubic nonlinearity. We provide several numerical experiments illustrating the
5225 aforementioned results and the efficiency of the Lie-Trotter splitting scheme.
5226 Finally, we numerically investigate the possible blowup of solutions for some
5227 power-law nonlinearities.
5228 </p>
5229 </description>
5230 </item>
5231 <item>
5232 <title>LSTM for Model-Based Anomaly Detection in Cyber-Physical Systems. (arXiv:2010.15680v1 [cs.LG])</title>
5233 <link>http://fr.arxiv.org/abs/2010.15680</link>
5234 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Eiteneuer_B/0/1/0/all/0/1">Benedikt Eiteneuer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Niggemann_O/0/1/0/all/0/1">Oliver Niggemann</a></p>
5235
5236 <p>Anomaly detection is the task of detecting data which differs from the normal
5237 behaviour of a system in a given context. In order to approach this problem,
5238 data-driven models can be learned to predict current or future observations.
5239 Oftentimes, anomalous behaviour depends on the internal dynamics of the system
5240 and looks normal in a static context. To address this problem, the model should
5241 also operate depending on state. Long Short-Term Memory (LSTM) neural networks
5242 have been shown to be particularly useful to learn time sequences with varying
5243 length of temporal dependencies and are therefore an interesting general
5244 purpose approach to learn the behaviour of arbitrarily complex Cyber-Physical
5245 Systems. In order to perform anomaly detection, we slightly modify the standard
5246 norm 2 error to incorporate an estimate of model uncertainty. We analyse the
5247 approach on artificial and real data.
5248 </p>
5249 </description>
5250 </item>
5251 <item>
5252 <title>Maximum a posteriori signal recovery for optical coherence tomography angiography image generation and denoising. (arXiv:2010.15682v1 [eess.IV])</title>
5253 <link>http://fr.arxiv.org/abs/2010.15682</link>
5254 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Husvogt_L/0/1/0/all/0/1">Lennart Husvogt</a> (1 and 2), <a href="http://fr.arxiv.org/find/eess/1/au:+Ploner_S/0/1/0/all/0/1">Stefan B. Ploner</a> (1), <a href="http://fr.arxiv.org/find/eess/1/au:+Chen_S/0/1/0/all/0/1">Siyu Chen</a> (2), <a href="http://fr.arxiv.org/find/eess/1/au:+Stromer_D/0/1/0/all/0/1">Daniel Stromer</a> (1, 2), <a href="http://fr.arxiv.org/find/eess/1/au:+Schottenhamml_J/0/1/0/all/0/1">Julia Schottenhamml</a> (1), <a href="http://fr.arxiv.org/find/eess/1/au:+Alibhai_A/0/1/0/all/0/1">A. Yasin Alibhai</a> (3), <a href="http://fr.arxiv.org/find/eess/1/au:+Moult_E/0/1/0/all/0/1">Eric Moult</a> (2), <a href="http://fr.arxiv.org/find/eess/1/au:+Waheed_N/0/1/0/all/0/1">Nadia K. Waheed</a> (3), <a href="http://fr.arxiv.org/find/eess/1/au:+Fujimoto_J/0/1/0/all/0/1">James G. Fujimoto</a> (2), <a href="http://fr.arxiv.org/find/eess/1/au:+Maier_A/0/1/0/all/0/1">Andreas Maier</a> (1) ((1) Friedrich-Alexander-Universit&#xe4;t Erlangen-N&#xfc;rnberg Germany, (2) Massachusetts Institute of Technology USA, (3) Tufts School of Medicine USA)</p>
5255
5256 <p>Optical coherence tomography angiography (OCTA) is a novel and clinically
5257 promising imaging modality to image retinal and sub-retinal vasculature. Based
5258 on repeated optical coherence tomography (OCT) scans, intensity changes are
5259 observed over time and used to compute OCTA image data. OCTA data are prone to
5260 noise and artifacts caused by variations in flow speed and patient movement. We
5261 propose a novel iterative maximum a posteriori signal recovery algorithm in
5262 order to generate OCTA volumes with reduced noise and increased image quality.
5263 This algorithm is based on previous work on probabilistic OCTA signal models
5264 and maximum likelihood estimates. Reconstruction results using total variation
5265 minimization and wavelet shrinkage for regularization were compared against an
5266 OCTA ground truth volume, merged from six co-registered single OCTA volumes.
5267 The results show a significant improvement in peak signal-to-noise ratio and
5268 structural similarity. The presented algorithm brings together OCTA image
5269 generation and Bayesian statistics and can be developed into new OCTA image
5270 generation and denoising algorithms.
5271 </p>
5272 </description>
5273 </item>
5274 <item>
5275 <title>Resilient Energy Efficient Healthcare Monitoring Infrastructure with Server and Network Protection. (arXiv:2010.15683v1 [eess.SY])</title>
5276 <link>http://fr.arxiv.org/abs/2010.15683</link>
5277 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Isa_I/0/1/0/all/0/1">Ida Syafiza M. Isa</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+El_Gorashi_T/0/1/0/all/0/1">Taisir E.H. El-Gorashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Musa_M/0/1/0/all/0/1">Mohamed O.I. Musa</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Elmirghani_J/0/1/0/all/0/1">J.M.H. Elmirghani</a></p>
5278
5279 <p>In this paper, a 1+1 server protection scheme is considered where two
5280 servers, a primary and a secondary processing server are used to serve ECG
5281 monitoring applications concurrently. The infrastructure is designed to be
5282 resilient against server failure under two scenarios related to the geographic
5283 location of primary and secondary servers and resilient against both server and
5284 network failures. A Mixed Integer Linear Programming (MILP) model is used to
5285 optimise the number and locations of both primary and secondary processing
5286 servers so that the energy consumption of the networking equipment and
5287 processing are minimised. The results show that considering a scenario for
5288 server protection without geographical constraints compared to the
5289 non-resilient scenario has resulted in both network and processing energy
5290 penalty as the traffic is doubled. The results also reveal that increasing the
5291 level of resilience to consider geographical constraints compared to case
5292 without geographical constraints resulted in higher network energy penalty when
5293 the demand is low as more nodes are utilised to place the processing servers
5294 under the geographic constraints. Also, increasing the level of resilience to
5295 consider network protection with link and node disjoint selection has resulted
5296 in a low network energy penalty at high demands due to the activation of a
5297 large part of the network in any case due to the demands. However, the results
5298 show that the network energy penalty is reduced with the increasing number of
5299 processing servers at each candidate node. Meanwhile, the same energy for
5300 processing is consumed regardless of the increasing level of resilience as the
5301 same number of processing servers are utilised. A heuristic is developed for
5302 each resilience scenario for real-time implementation where the results show
5303 that the performance of the heuristic is approaching the results of the MILP
5304 model.
5305 </p>
5306 </description>
5307 </item>
5308 <item>
5309 <title>Governance & Autonomy: Towards a Governance-based Analysis of Autonomy in Cyber-Physical Systems-of-Systems. (arXiv:2010.15684v1 [cs.SE])</title>
5310 <link>http://fr.arxiv.org/abs/2010.15684</link>
5311 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gharib_M/0/1/0/all/0/1">Mohamad Gharib</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lollini_P/0/1/0/all/0/1">Paolo Lollini</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ceccarelli_A/0/1/0/all/0/1">Andrea Ceccarelli</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bondavalli_A/0/1/0/all/0/1">Andrea Bondavalli</a></p>
5312
5313 <p>One of the main challenges in integrating Cyber-Physical System-of-Systems
5314 (CPSoS) to function as a single unified system is the autonomy of its
5315 Cyber-Physical Systems (CPSs), which may lead to a lack of coordination among
5316 CPSs and results in various kinds of conflicts. We advocate that to efficiently
5317 integrate CPSs within the CPSoS, we may need to adjust the autonomy of some
5318 CPSs in a way that allows them to coordinate their activities to avoid any
5319 potential conflict among one another. To achieve that, we need to incorporate
5320 the notion of governance within the design of CPSoS, which defines rules that
5321 can be used for clearly specifying who and how can adjust the autonomy of a
5322 CPS. In this paper, we try to tackle this problem by proposing a new conceptual
5323 model that can be used for performing a governance-based analysis of autonomy
5324 for CPSs within CPSoS. We illustrate the utility of the model with an example
5325 from the automotive domain.
5326 </p>
5327 </description>
5328 </item>
5329 <item>
5330 <title>Deep Autofocus for Synthetic Aperture Sonar. (arXiv:2010.15687v1 [eess.IV])</title>
5331 <link>http://fr.arxiv.org/abs/2010.15687</link>
5332 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Gerg_I/0/1/0/all/0/1">Isaac Gerg</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Monga_V/0/1/0/all/0/1">Vishal Monga</a></p>
5333
5334 <p>Synthetic aperture sonar (SAS) requires precise positional and environmental
5335 information to produce well-focused output during the image reconstruction
5336 step. However, errors in these measurements are commonly present resulting in
5337 defocused imagery. To overcome these issues, an \emph{autofocus} algorithm is
5338 employed as a post-processing step after image reconstruction for the purpose
5339 of improving image quality using the image content itself. These algorithms are
5340 usually iterative and metric-based in that they seek to optimize an image
5341 sharpness metric. In this letter, we demonstrate the potential of machine
5342 learning, specifically deep learning, to address the autofocus problem. We
5343 formulate the problem as a self-supervised, phase error estimation task using a
5344 deep network we call Deep Autofocus. Our formulation has the advantages of
5345 being non-iterative (and thus fast) and not requiring ground truth
5346 focused-defocused images pairs as often required by other deblurring deep
5347 learning methods. We compare our technique against a set of common sharpness
5348 metrics optimized using gradient descent over a real-world dataset. Our results
5349 demonstrate Deep Autofocus can produce imagery that is perceptually as good as
5350 benchmark iterative techniques but at a substantially lower computational cost.
5351 We conclude that our proposed Deep Autofocus can provide a more favorable
5352 cost-quality trade-off than state-of-the-art alternatives with significant
5353 potential of future research.
5354 </p>
5355 </description>
5356 </item>
5357 <item>
5358 <title>Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration. (arXiv:2010.15689v1 [cs.CV])</title>
5359 <link>http://fr.arxiv.org/abs/2010.15689</link>
5360 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_F/0/1/0/all/0/1">Feng Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cong_R/0/1/0/all/0/1">Runmin Cong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bai_H/0/1/0/all/0/1">Huihui Bai</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_Y/0/1/0/all/0/1">Yifan He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_Y/0/1/0/all/0/1">Yao Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_C/0/1/0/all/0/1">Ce Zhu</a></p>
5361
5362 <p>Recently, convolutional neural network (CNN) has demonstrated significant
5363 success for image restoration (IR) tasks (e.g., image super-resolution, image
5364 deblurring, rain streak removal, and dehazing). However, existing CNN based
5365 models are commonly implemented as a single-path stream to enrich feature
5366 representations from low-quality (LQ) input space for final predictions, which
5367 fail to fully incorporate preceding low-level contexts into later high-level
5368 features within networks, thereby producing inferior results. In this paper, we
5369 present a deep interleaved network (DIN) that learns how information at
5370 different states should be combined for high-quality (HQ) images
5371 reconstruction. The proposed DIN follows a multi-path and multi-branch pattern
5372 allowing multiple interconnected branches to interleave and fuse at different
5373 states. In this way, the shallow information can guide deep representative
5374 features prediction to enhance the feature expression ability. Furthermore, we
5375 propose asymmetric co-attention (AsyCA) which is attached at each interleaved
5376 node to model the feature dependencies. Such AsyCA can not only adaptively
5377 emphasize the informative features from different states, but also improves the
5378 discriminative ability of networks. Our presented DIN can be trained end-to-end
5379 and applied to various IR tasks. Comprehensive evaluations on public benchmarks
5380 and real-world datasets demonstrate that the proposed DIN perform favorably
5381 against the state-of-the-art methods quantitatively and qualitatively.
5382 </p>
5383 </description>
5384 </item>
5385 <item>
5386 <title>Analyzing the tree-layer structure of Deep Forests. (arXiv:2010.15690v1 [cs.LG])</title>
5387 <link>http://fr.arxiv.org/abs/2010.15690</link>
5388 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Arnould_L/0/1/0/all/0/1">Ludovic Arnould</a> (LPSM UMR 8001), <a href="http://fr.arxiv.org/find/cs/1/au:+Boyer_C/0/1/0/all/0/1">Claire Boyer</a> (LPSM UMR 8001), <a href="http://fr.arxiv.org/find/cs/1/au:+Scornet_E/0/1/0/all/0/1">Erwan Scornet</a> (CMAP)</p>
5389
5390 <p>Random forests on the one hand, and neural networks on the other hand, have
5391 met great success in the machine learning community for their predictive
5392 performance. Combinations of both have been proposed in the literature, notably
5393 leading to the so-called deep forests (DF) [25]. In this paper, we investigate
5394 the mechanisms at work in DF and outline that DF architecture can generally be
5395 simplified into more simple and computationally efficient shallow forests
5396 networks. Despite some instability, the latter may outperform standard
5397 predictive tree-based methods. In order to precisely quantify the improvement
5398 achieved by these light network configurations over standard tree learners, we
5399 theoretically study the performance of a shallow tree network made of two
5400 layers, each one composed of a single centered tree. We provide tight
5401 theoretical lower and upper bounds on its excess risk. These theoretical
5402 results show the interest of tree-network architectures for well-structured
5403 data provided that the first layer, acting as a data encoder, is rich enough.
5404 </p>
5405 </description>
5406 </item>
5407 <item>
5408 <title>Unveiling process insights from refactoring practices. (arXiv:2010.15692v1 [cs.SE])</title>
5409 <link>http://fr.arxiv.org/abs/2010.15692</link>
5410 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Caldeira_J/0/1/0/all/0/1">Jo&#xe3;o Caldeira</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Abreu_F/0/1/0/all/0/1">Fernando Brito e Abreu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cardoso_J/0/1/0/all/0/1">Jorge Cardoso</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Reis_J/0/1/0/all/0/1">Jos&#xe9; Reis</a></p>
5411
5412 <p>Context : Software comprehension and maintenance activities, such as
5413 refactoring, are said to be negatively impacted by software complexity. The
5414 methods used to measure software product and processes complexity have been
5415 thoroughly debated in the literature. However, the discernment about the
5416 possible links between these two dimensions, particularly on the benefits of
5417 using the process perspective, has a long journey ahead. Objective: To improve
5418 the understanding of the liaison of developers' activities and software
5419 complexity within a refactoring task, namely by evaluating if process metrics
5420 gathered from the IDE, using process mining methods and tools, are suitable to
5421 accurately classify different refactoring practices and the resulting software
5422 complexity. Method: We mined source code metrics from a software product after
5423 a quality improvement task was given in parallel to (117) software developers,
5424 organized in (71) teams. Simultaneously, we collected events from their IDE
5425 work sessions (320) and used process mining to model their processes and
5426 extract the correspondent metrics. Results: Most teams using a plugin for
5427 refactoring (JDeodorant) reduced software complexity more effectively and with
5428 simpler processes than the ones that performed refactoring using only Eclipse
5429 native features. We were able to find moderate correlations (43%) between
5430 software cyclomatic complexity and process cyclomatic complexity. The best
5431 models found for the refactoring method and cyclomatic complexity level
5432 predictions, had an accuracy of 92.95% and 94.36%, respectively. Conclusions:
5433 Our approach agnostic to programming languages, geographic location, or
5434 development practices. Initial findings are encouraging, and lead us to suggest
5435 practitioners may use our method in other development tasks, such as, defect
5436 analysis and unit or integration tests.
5437 </p>
5438 </description>
5439 </item>
5440 <item>
5441 <title>Learning interaction kernels in mean-field equations of 1st-order systems of interacting particles. (arXiv:2010.15694v1 [stat.ML])</title>
5442 <link>http://fr.arxiv.org/abs/2010.15694</link>
5443 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Lang_Q/0/1/0/all/0/1">Quanjun Lang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Lu_F/0/1/0/all/0/1">Fei Lu</a></p>
5444
5445 <p>We introduce a nonparametric algorithm to learn interaction kernels of
5446 mean-field equations for 1st-order systems of interacting particles. The data
5447 consist of discrete space-time observations of the solution. By least squares
5448 with regularization, the algorithm learns the kernel on data-adaptive
5449 hypothesis spaces efficiently. A key ingredient is a probabilistic error
5450 functional derived from the likelihood of the mean-field equation's diffusion
5451 process. The estimator converges, in a reproducing kernel Hilbert space and an
5452 L2 space under an identifiability condition, at a rate optimal in the sense
5453 that it equals the numerical integrator's order. We demonstrate our algorithm
5454 on three typical examples: the opinion dynamics with a piecewise linear kernel,
5455 the granular media model with a quadratic kernel, and the aggregation-diffusion
5456 with a repulsive-attractive kernel.
5457 </p>
5458 </description>
5459 </item>
5460 <item>
5461 <title>Generalized Insider Attack Detection Implementation using NetFlow Data. (arXiv:2010.15697v1 [cs.CR])</title>
5462 <link>http://fr.arxiv.org/abs/2010.15697</link>
5463 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Samtani_Y/0/1/0/all/0/1">Yash Samtani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Elwell_J/0/1/0/all/0/1">Jesse Elwell</a></p>
5464
5465 <p>Insider Attack Detection in commercial networks is a critical problem that
5466 does not have any good solutions at this current time. The problem is
5467 challenging due to the lack of visibility into live networks and a lack of a
5468 standard feature set to distinguish between different attacks. In this paper,
5469 we study an approach centered on using network data to identify attacks. Our
5470 work builds on unsupervised machine learning techniques such as One-Class SVM
5471 and bi-clustering as weak indicators of insider network attacks. We combine
5472 these techniques to limit the number of false positives to an acceptable level
5473 required for real-world deployments by using One-Class SVM to check for
5474 anomalies detected by the proposed Bi-clustering algorithm. We present a
5475 prototype implementation in Python and associated results for two different
5476 real-world representative data sets. We show that our approach is a promising
5477 tool for insider attack detection in realistic settings.
5478 </p>
5479 </description>
5480 </item>
5481 <item>
5482 <title>Constrained Online Learning to Mitigate Distortion Effects in Pulse-Agile Cognitive Radar. (arXiv:2010.15698v1 [cs.IT])</title>
5483 <link>http://fr.arxiv.org/abs/2010.15698</link>
5484 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Thornton_C/0/1/0/all/0/1">Charles E. Thornton</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Buehrer_R/0/1/0/all/0/1">R. Michael Buehrer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Martone_A/0/1/0/all/0/1">Anthony F. Martone</a></p>
5485
5486 <p>Pulse-agile radar systems have demonstrated favorable performance in dynamic
5487 electromagnetic scenarios. However, the use of non-identical waveforms within a
5488 radar's coherent processing interval may lead to harmful distortion effects
5489 when pulse-Doppler processing is used. This paper presents an online learning
5490 framework to optimize detection performance while mitigating harmful sidelobe
5491 levels. The radar waveform selection process is formulated as a linear
5492 contextual bandit problem, within which waveform adaptations which exceed a
5493 tolerable level of expected distortion are eliminated. The constrained online
5494 learning approach is effective and computationally feasible, evidenced by
5495 simulations in a radar-communication coexistence scenario and in the presence
5496 of intentional adaptive jamming. This approach is applied to both stochastic
5497 and adversarial contextual bandit learning models and the detection performance
5498 in dynamic scenarios is evaluated.
5499 </p>
5500 </description>
5501 </item>
5502 <item>
5503 <title>Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks. (arXiv:2010.15703v1 [cs.CV])</title>
5504 <link>http://fr.arxiv.org/abs/2010.15703</link>
5505 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Martinez_J/0/1/0/all/0/1">Julieta Martinez</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shewakramani_J/0/1/0/all/0/1">Jashan Shewakramani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_T/0/1/0/all/0/1">Ting Wei Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Barsan_I/0/1/0/all/0/1">Ioan Andrei B&#xe2;rsan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zeng_W/0/1/0/all/0/1">Wenyuan Zeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Urtasun_R/0/1/0/all/0/1">Raquel Urtasun</a></p>
5506
5507 <p>Compressing large neural networks is an important step for their deployment
5508 in resource-constrained computational platforms. In this context, vector
5509 quantization is an appealing framework that expresses multiple parameters using
5510 a single code, and has recently achieved state-of-the-art network compression
5511 on a range of core vision and natural language processing tasks. Key to the
5512 success of vector quantization is deciding which parameter groups should be
5513 compressed together. Previous work has relied on heuristics that group the
5514 spatial dimension of individual convolutional filters, but a general solution
5515 remains unaddressed. This is desirable for pointwise convolutions (which
5516 dominate modern architectures), linear layers (which have no notion of spatial
5517 dimension), and convolutions (when more than one filter is compressed to the
5518 same codeword). In this paper we make the observation that the weights of two
5519 adjacent layers can be permuted while expressing the same function. We then
5520 establish a connection to rate-distortion theory and search for permutations
5521 that result in networks that are easier to compress. Finally, we rely on an
5522 annealed quantization algorithm to better compress the network and achieve
5523 higher final accuracy. We show results on image classification, object
5524 detection, and segmentation, reducing the gap with the uncompressed model by 40
5525 to 70% with respect to the current state of the art.
5526 </p>
5527 </description>
5528 </item>
5529 <item>
5530 <title>5W1H-based Expression for the Effective Sharing of Information in Digital Forensic Investigations. (arXiv:2010.15711v1 [cs.CR])</title>
5531 <link>http://fr.arxiv.org/abs/2010.15711</link>
5532 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Han_J/0/1/0/all/0/1">Jaehyeok Han</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_J/0/1/0/all/0/1">Jieon Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_S/0/1/0/all/0/1">Sangjin Lee</a></p>
5533
5534 <p>Digital forensic investigation is used in various areas related to digital
5535 devices including the cyber crime. This is an investigative process using many
5536 techniques, which have implemented as tools. The types of files covered by the
5537 digital forensic investigation are wide and varied, however, there is no way to
5538 express the results into a standardized format. The standardization are
5539 different by types of device, file system, or application. Different outputs
5540 make it time-consuming and difficult to share information and to implement
5541 integration. In addition, it could weaken cyber security. Thus, it is important
5542 to define normalization and to present data in the same format. In this paper,
5543 a 5W1H-based expression for information sharing for effective digital forensic
5544 investigation is proposed to analyze digital forensic information using six
5545 questions--what, who, where, when, why and how. Based on the 5W1H-based
5546 expression, digital information from different types of files is converted and
5547 represented in the same format of outputs. As the 5W1H is the basic writing
5548 principle, application of the 5W1H-based expression on the case studies shows
5549 that this expression enhances clarity and correctness for information sharing.
5550 Furthermore, in the case of security incidents, this expression has an
5551 advantage in being compatible with STIX.
5552 </p>
5553 </description>
5554 </item>
5555 <item>
5556 <title>Playing a Part: Speaker Verification at the Movies. (arXiv:2010.15716v1 [cs.SD])</title>
5557 <link>http://fr.arxiv.org/abs/2010.15716</link>
5558 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Brown_A/0/1/0/all/0/1">Andrew Brown</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huh_J/0/1/0/all/0/1">Jaesung Huh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nagrani_A/0/1/0/all/0/1">Arsha Nagrani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chung_J/0/1/0/all/0/1">Joon Son Chung</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zisserman_A/0/1/0/all/0/1">Andrew Zisserman</a></p>
5559
5560 <p>The goal of this work is to investigate the performance of popular speaker
5561 recognition models on speech segments from movies, where often actors
5562 intentionally disguise their voice to play a character. We make the following
5563 three contributions: (i) We collect a novel, challenging speaker recognition
5564 dataset called VoxMovies, with speech for 856 identities from almost 4000 movie
5565 clips. VoxMovies contains utterances with varying emotion, accents and
5566 background noise, and therefore comprises an entirely different domain to the
5567 interview-style, emotionally calm utterances in current speaker recognition
5568 datasets such as VoxCeleb; (ii) We provide a number of domain adaptation
5569 evaluation sets, and benchmark the performance of state-of-the-art speaker
5570 recognition models on these evaluation pairs. We demonstrate that both speaker
5571 verification and identification performance drops steeply on this new data,
5572 showing the challenge in transferring models across domains; and finally (iii)
5573 We show that simple domain adaptation paradigms improve performance, but there
5574 is still large room for improvement.
5575 </p>
5576 </description>
5577 </item>
5578 <item>
5579 <title>What can we learn from gradients?. (arXiv:2010.15718v1 [cs.CR])</title>
5580 <link>http://fr.arxiv.org/abs/2010.15718</link>
5581 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_J/0/1/0/all/0/1">Jia Qian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hansen_L/0/1/0/all/0/1">Lars Kai Hansen</a></p>
5582
5583 <p>Recent work (\cite{zhu2019deep}) has shown that it is possible to reconstruct
5584 the input (image) from the gradient of a neural network. In this paper, our aim
5585 is to better understand the limits to reconstruction and to speed up image
5586 reconstruction by imposing prior image information and improved initialization.
5587 Firstly, we show that for the \textbf{non-linear} neural network,
5588 gradient-based reconstruction approximates to solving a high-dimension
5589 \textbf{linear} equations for both fully-connected neural network and
5590 convolutional neural network. Exploring the theoretical limits of input
5591 reconstruction, we show that a fully-connected neural network with a
5592 \textbf{one} hidden node is enough to reconstruct a \textbf{single} input
5593 image, regardless of the number of nodes in the output layer. Then we
5594 generalize this result to a gradient averaged over mini-batches of size B. In
5595 this case, the full mini-batch can be reconstructed in a fully-connected
5596 network if the number of hidden units exceeds B. For a convolutional neural
5597 network, the required number of filters in the first convolutional layer again
5598 is decided by the batch size B, however, in this case, input width d and the
5599 width after filter $d^{'}$ also play the role $h=(\frac{d}{d^{'}})^2BC$, where
5600 C is channel number of input. Finally, we validate and underpin our theoretical
5601 analysis on bio-medical data (fMRI, ECG signals, and cell images) and on
5602 benchmark data (MNIST, CIFAR100, and face images).
5603 </p>
5604 </description>
5605 </item>
5606 <item>
5607 <title>Attentive Clustering Processes. (arXiv:2010.15727v1 [stat.ML])</title>
5608 <link>http://fr.arxiv.org/abs/2010.15727</link>
5609 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Pakman_A/0/1/0/all/0/1">Ari Pakman</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wang_Y/0/1/0/all/0/1">Yueqi Wang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Lee_Y/0/1/0/all/0/1">Yoonho Lee</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Basu_P/0/1/0/all/0/1">Pallab Basu</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Lee_J/0/1/0/all/0/1">Juho Lee</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Teh_Y/0/1/0/all/0/1">Yee Whye Teh</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Paninski_L/0/1/0/all/0/1">Liam Paninski</a></p>
5610
5611 <p>Amortized approaches to clustering have recently received renewed attention
5612 thanks to novel objective functions that exploit the expressiveness of deep
5613 learning models. In this work we revisit a recent proposal for fast amortized
5614 probabilistic clustering, the Clusterwise Clustering Process (CCP), which
5615 yields samples from the posterior distribution of cluster labels for sets of
5616 arbitrary size using only O(K) forward network evaluations, where K is an
5617 arbitrary number of clusters. While adequate in simple datasets, we show that
5618 the model can severely underfit complex datasets, and hypothesize that this
5619 limitation can be traced back to the implicit assumption that the probability
5620 of a point joining a cluster is equally sensitive to all the points available
5621 to join the same cluster. We propose an improved model, the Attentive
5622 Clustering Process (ACP), that selectively pays more attention to relevant
5623 points while preserving the invariance properties of the generative model. We
5624 illustrate the advantages of the new model in applications to spike-sorting in
5625 multi-electrode arrays and community discovery in networks. The latter case
5626 combines the ACP model with graph convolutional networks, and to our knowledge
5627 is the first deep learning model that handles an arbitrary number of
5628 communities.
5629 </p>
5630 </description>
5631 </item>
5632 <item>
5633 <title>Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation. (arXiv:2010.15728v1 [cs.CL])</title>
5634 <link>http://fr.arxiv.org/abs/2010.15728</link>
5635 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_H/0/1/0/all/0/1">Hang Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Suarez_Paniagua_V/0/1/0/all/0/1">V&#xed;ctor Su&#xe1;rez-Paniagua</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Whiteley_W/0/1/0/all/0/1">William Whiteley</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_H/0/1/0/all/0/1">Honghan Wu</a></p>
5636
5637 <p>Diagnostic or procedural coding of clinical notes aims to derive a coded
5638 summary of disease-related information about patients. Such coding is usually
5639 done manually in hospitals but could potentially be automated to improve the
5640 efficiency and accuracy of medical coding. Recent studies on deep learning for
5641 automated medical coding achieved promising performances. However, the
5642 explainability of these models is usually poor, preventing them to be used
5643 confidently in supporting clinical practice. Another limitation is that these
5644 models mostly assume independence among labels, ignoring the complex
5645 correlation among medical codes which can potentially be exploited to improve
5646 the performance. We propose a Hierarchical Label-wise Attention Network (HLAN),
5647 which aimed to interpret the model by quantifying importance (as attention
5648 weights) of words and sentences related to each of the labels. Secondly, we
5649 propose to enhance the major deep learning models with a label embedding (LE)
5650 initialisation approach, which learns a dense, continuous vector representation
5651 and then injects the representation into the final layers and the label-wise
5652 attention layers in the models. We evaluated the methods using three settings
5653 on the MIMIC-III discharge summaries: full codes, top-50 codes, and the UK NHS
5654 COVID-19 shielding codes. Experiments were conducted to compare HLAN and LE
5655 initialisation to the state-of-the-art neural network based methods. HLAN
5656 achieved the best Micro-level AUC and $F_1$ on the top-50 code prediction and
5657 comparable results on the NHS COVID-19 shielding code prediction to other
5658 models. By highlighting the most salient words and sentences for each label,
5659 HLAN showed more meaningful and comprehensive model interpretation compared to
5660 its downgraded baselines and the CNN-based models. LE initialisation
5661 consistently boosted most deep learning models for automated medical coding.
5662 </p>
5663 </description>
5664 </item>
5665 <item>
5666 <title>Fundamental limitations to key distillation from Gaussian states with Gaussian operations. (arXiv:2010.15729v1 [quant-ph])</title>
5667 <link>http://fr.arxiv.org/abs/2010.15729</link>
5668 <description><p>Authors: <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Lami_L/0/1/0/all/0/1">Ludovico Lami</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Mista_L/0/1/0/all/0/1">Ladislav Mi&#x161;ta, Jr.</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Adesso_G/0/1/0/all/0/1">Gerardo Adesso</a></p>
5669
5670 <p>We establish fundamental upper bounds on the amount of secret key that can be
5671 extracted from continuous variable quantum Gaussian states by using only local
5672 Gaussian operations, local classical processing, and public communication. For
5673 one-way communication, we prove that the key is bounded by the R\'enyi-$2$
5674 Gaussian entanglement of formation $E_{F,2}^{\mathrm{\scriptscriptstyle G}}$,
5675 with the inequality being saturated for pure Gaussian states. The same is true
5676 if two-way public communication is allowed but Alice and Bob employ protocols
5677 that start with destructive local Gaussian measurements. In the most general
5678 setting of two-way communication and arbitrary interactive protocols, we argue
5679 that $2 E_{F,2}^{\mathrm{\scriptscriptstyle G}}$ is still a bound on the
5680 extractable key, although we conjecture that the factor of $2$ is superfluous.
5681 Finally, for a wide class of Gaussian states that includes all two-mode states,
5682 we prove a recently proposed conjecture on the equality between
5683 $E_{F,2}^{\mathrm{\scriptscriptstyle G}}$ and the Gaussian intrinsic
5684 entanglement, thus endowing both measures with a more solid operational
5685 meaning.
5686 </p>
5687 </description>
5688 </item>
5689 <item>
5690 <title>The Agile Coach Role: Coaching for Agile Performance Impact. (arXiv:2010.15738v1 [cs.SE])</title>
5691 <link>http://fr.arxiv.org/abs/2010.15738</link>
5692 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Stray_V/0/1/0/all/0/1">Viktoria Stray</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tkalich_A/0/1/0/all/0/1">Anastasiia Tkalich</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Moe_N/0/1/0/all/0/1">Nils Brede Moe</a></p>
5693
5694 <p>It is increasingly common to introduce agile coaches to help gain speed and
5695 advantage in agile companies. Following the success of Spotify, the role of the
5696 agile coach has branched out in terms of tasks and responsibilities, but little
5697 research has been conducted to examine how this role is practiced. This paper
5698 examines the role of the agile coach through 19 semistructured interviews with
5699 agile coaches from ten different companies. We describe the role in terms of
5700 the tasks the coach has in agile projects, valuable traits, skills, tools, and
5701 the enablers of agile coaching. Our findings indicate that agile coaches
5702 perform at the team and organizational levels. They affect effort, strategies,
5703 knowledge, and skills of the agile teams. The most essential traits of an agile
5704 coach are being emphatic, people-oriented, able to listen, diplomatic, and
5705 persistent. We suggest empirically based advice for agile coaching, for example
5706 companies giving their agile coaches the authority to implement the required
5707 organizational changes within and outside the teams.
5708 </p>
5709 </description>
5710 </item>
5711 <item>
5712 <title>Recurrent Neural Networks for video object detection. (arXiv:2010.15740v1 [cs.CV])</title>
5713 <link>http://fr.arxiv.org/abs/2010.15740</link>
5714 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qasim_A/0/1/0/all/0/1">Ahmad B Qasim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pettirsch_A/0/1/0/all/0/1">Arnd Pettirsch</a></p>
5715
5716 <p>There is lots of scientific work about object detection in images. For many
5717 applications like for example autonomous driving the actual data on which
5718 classification has to be done are videos. This work compares different methods,
5719 especially those which use Recurrent Neural Networks to detect objects in
5720 videos. We differ between feature-based methods, which feed feature maps of
5721 different frames into the recurrent units, box-level methods, which feed
5722 bounding boxes with class probabilities into the recurrent units and methods
5723 which use flow networks. This study indicates common outcomes of the compared
5724 methods like the benefit of including the temporal context into object
5725 detection and states conclusions and guidelines for video object detection
5726 networks.
5727 </p>
5728 </description>
5729 </item>
5730 <item>
5731 <title>Causal variables from reinforcement learning using generalized Bellman equations. (arXiv:2010.15745v1 [cs.LG])</title>
5732 <link>http://fr.arxiv.org/abs/2010.15745</link>
5733 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Herlau_T/0/1/0/all/0/1">Tue Herlau</a></p>
5734
5735 <p>Many open problems in machine learning are intrinsically related to
5736 causality, however, the use of causal analysis in machine learning is still in
5737 its early stage. Within a general reinforcement learning setting, we consider
5738 the problem of building a general reinforcement learning agent which uses
5739 experience to construct a causal graph of the environment, and use this graph
5740 to inform its policy. Our approach has three characteristics: First, we learn a
5741 simple, coarse-grained causal graph, in which the variables reflect states at
5742 many time instances, and the interventions happen at the level of policies,
5743 rather than individual actions. Secondly, we use mediation analysis to obtain
5744 an optimization target. By minimizing this target, we define the causal
5745 variables. Thirdly, our approach relies on estimating conditional expectations
5746 rather the familiar expected return from reinforcement learning, and we
5747 therefore apply a generalization of Bellman's equations. We show the method can
5748 learn a plausible causal graph in a grid-world environment, and the agent
5749 obtains an improvement in performance when using the causally informed policy.
5750 To our knowledge, this is the first attempt to apply causal analysis in a
5751 reinforcement learning setting without strict restrictions on the number of
5752 states. We have observed that mediation analysis provides a promising avenue
5753 for transforming the problem of causal acquisition into one of cost-function
5754 minimization, but importantly one which involves estimating conditional
5755 expectations. This is a new challenge, and we think that causal reinforcement
5756 learning will involve development methods suited for online estimation of such
5757 conditional expectations. Finally, a benefit of our approach is the use of very
5758 simple causal models, which are arguably a more natural model of human causal
5759 understanding.
5760 </p>
5761 </description>
5762 </item>
5763 <item>
5764 <title>Gaussian Process Bandit Optimization of theThermodynamic Variational Objective. (arXiv:2010.15750v1 [cs.LG])</title>
5765 <link>http://fr.arxiv.org/abs/2010.15750</link>
5766 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_V/0/1/0/all/0/1">Vu Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Masrani_V/0/1/0/all/0/1">Vaden Masrani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Brekelmans_R/0/1/0/all/0/1">Rob Brekelmans</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Osborne_M/0/1/0/all/0/1">Michael A. Osborne</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wood_F/0/1/0/all/0/1">Frank Wood</a></p>
5767
5768 <p>Achieving the full promise of the Thermodynamic Variational Objective (TVO),a
5769 recently proposed variational lower bound on the log evidence involving a
5770 one-dimensional Riemann integral approximation, requires choosing a "schedule"
5771 ofsorted discretization points. This paper introduces a bespoke Gaussian
5772 processbandit optimization method for automatically choosing these points. Our
5773 approach not only automates their one-time selection, but also dynamically
5774 adaptstheir positions over the course of optimization, leading to improved
5775 model learning and inference. We provide theoretical guarantees that our bandit
5776 optimizationconverges to the regret-minimizing choice of integration points.
5777 Empirical validation of our algorithm is provided in terms of improved learning
5778 and inference inVariational Autoencoders and Sigmoid Belief Networks.
5779 </p>
5780 </description>
5781 </item>
5782 <item>
5783 <title>A more Pragmatic Implementation of the Lock-free, Ordered, Linked List. (arXiv:2010.15755v1 [cs.DS])</title>
5784 <link>http://fr.arxiv.org/abs/2010.15755</link>
5785 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Traff_J/0/1/0/all/0/1">Jesper Larsson Tr&#xe4;ff</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Poter_M/0/1/0/all/0/1">Manuel P&#xf6;ter</a></p>
5786
5787 <p>The lock-free, ordered, linked list is an important, standard example of a
5788 concurrent data structure. An obvious, practical drawback of textbook
5789 implementations is that failed compare-and-swap (CAS) operations lead to
5790 retraversal of the entire list (retries), which is particularly harmful for
5791 such a linear-time data structure. We alleviate this drawback by first
5792 observing that failed CAS operations under some conditions do not require a
5793 full retry, and second by maintaining approximate backwards pointers that are
5794 used to find a closer starting position in the list for operation retry.
5795 Experiments with both a worst-case deterministic benchmark, and a standard,
5796 randomized, mixed-operation throughput benchmark on three shared-memory systems
5797 (Intel Xeon, AMD EPYC, SPARC-T5) show practical improvements ranging from
5798 significant, to dramatic, several orders of magnitude.
5799 </p>
5800 </description>
5801 </item>
5802 <item>
5803 <title>Identifying Transition States of Chemical Kinetic Systems using Network Embedding Techniques. (arXiv:2010.15760v1 [math.NA])</title>
5804 <link>http://fr.arxiv.org/abs/2010.15760</link>
5805 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Mercurio_P/0/1/0/all/0/1">Paula Mercurio</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Liu_D/0/1/0/all/0/1">Di Liu</a></p>
5806
5807 <p>Using random walk sampling methods for feature learning on networks, we
5808 develop a method for generating low-dimensional node embeddings for directed
5809 graphs and identifying transition states of stochastic chemical reacting
5810 systems. We modified objective functions adopted in existing random walk based
5811 network embedding methods to handle directed graphs and neighbors of different
5812 degrees. Through optimization via gradient ascent, we embed the weighted graph
5813 vertices into a low-dimensional vector space Rd while preserving the
5814 neighborhood of each node. We then demonstrate the effectiveness of the method
5815 on dimension reduction through several examples regarding identification of
5816 transition states of chemical reactions, especially for entropic systems.
5817 </p>
5818 </description>
5819 </item>
5820 <item>
5821 <title>A Helmholtz equation solver using unsupervised learning: Application to transcranial ultrasound. (arXiv:2010.15761v1 [physics.comp-ph])</title>
5822 <link>http://fr.arxiv.org/abs/2010.15761</link>
5823 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Stanziola_A/0/1/0/all/0/1">Antonio Stanziola</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Arridge_S/0/1/0/all/0/1">Simon R. Arridge</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Cox_B/0/1/0/all/0/1">Ben T. Cox</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Treeby_B/0/1/0/all/0/1">Bradley E. Treeby</a></p>
5824
5825 <p>Transcranial ultrasound therapy is increasingly used for the non-invasive
5826 treatment of brain disorders. However, conventional numerical wave solvers are
5827 currently too computationally expensive to be used online during treatments to
5828 predict the acoustic field passing through the skull (e.g., to account for
5829 subject-specific dose and targeting variations). As a step towards real-time
5830 predictions, in the current work, a fast iterative solver for the heterogeneous
5831 Helmholtz equation in 2D is developed using a fully-learned optimizer. The
5832 lightweight network architecture is based on a modified UNet that includes a
5833 learned hidden state. The network is trained using a physics-based loss
5834 function and a set of idealized sound speed distributions with fully
5835 unsupervised training (no knowledge of the true solution is required). The
5836 learned optimizer shows excellent performance on the test set, and is capable
5837 of generalization well outside the training examples, including to much larger
5838 computational domains, and more complex source and sound speed distributions,
5839 for example, those derived from x-ray computed tomography images of the skull.
5840 </p>
5841 </description>
5842 </item>
5843 <item>
5844 <title>Domain adaptation under structural causal models. (arXiv:2010.15764v1 [stat.ML])</title>
5845 <link>http://fr.arxiv.org/abs/2010.15764</link>
5846 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Chen_Y/0/1/0/all/0/1">Yuansi Chen</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Buhlmann_P/0/1/0/all/0/1">Peter B&#xfc;hlmann</a></p>
5847
5848 <p>Domain adaptation (DA) arises as an important problem in statistical machine
5849 learning when the source data used to train a model is different from the
5850 target data used to test the model. Recent advances in DA have mainly been
5851 application-driven and have largely relied on the idea of a common subspace for
5852 source and target data. To understand the empirical successes and failures of
5853 DA methods, we propose a theoretical framework via structural causal models
5854 that enables analysis and comparison of the prediction performance of DA
5855 methods. This framework also allows us to itemize the assumptions needed for
5856 the DA methods to have a low target error. Additionally, with insights from our
5857 theory, we propose a new DA method called CIRM that outperforms existing DA
5858 methods when both the covariates and label distributions are perturbed in the
5859 target data. We complement the theoretical analysis with extensive simulations
5860 to show the necessity of the devised assumptions. Reproducible synthetic and
5861 real data experiments are also provided to illustrate the strengths and
5862 weaknesses of DA methods when parts of the assumptions of our theory are
5863 violated.
5864 </p>
5865 </description>
5866 </item>
5867 <item>
5868 <title>A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems. (arXiv:2010.15768v1 [math.OC])</title>
5869 <link>http://fr.arxiv.org/abs/2010.15768</link>
5870 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Zhang_J/0/1/0/all/0/1">Jiawei Zhang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Xiao_P/0/1/0/all/0/1">Peijun Xiao</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Sun_R/0/1/0/all/0/1">Ruoyu Sun</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Luo_Z/0/1/0/all/0/1">Zhi-Quan Luo</a></p>
5871
5872 <p>Nonconvex-concave min-max problem arises in many machine learning
5873 applications including minimizing a pointwise maximum of a set of nonconvex
5874 functions and robust adversarial training of neural networks. A popular
5875 approach to solve this problem is the gradient descent-ascent (GDA) algorithm
5876 which unfortunately can exhibit oscillation in case of nonconvexity. In this
5877 paper, we introduce a "smoothing" scheme which can be combined with GDA to
5878 stabilize the oscillation and ensure convergence to a stationary solution. We
5879 prove that the stabilized GDA algorithm can achieve an $O(1/\epsilon^2)$
5880 iteration complexity for minimizing the pointwise maximum of a finite
5881 collection of nonconvex functions. Moreover, the smoothed GDA algorithm
5882 achieves an $O(1/\epsilon^4)$ iteration complexity for general
5883 nonconvex-concave problems. Extensions of this stabilized GDA algorithm to
5884 multi-block cases are presented. To the best of our knowledge, this is the
5885 first algorithm to achieve $O(1/\epsilon^2)$ for a class of nonconvex-concave
5886 problem. We illustrate the practical efficiency of the stabilized GDA algorithm
5887 on robust training.
5888 </p>
5889 </description>
5890 </item>
5891 <item>
5892 <title>Recursive Random Contraction Revisited. (arXiv:2010.15770v1 [cs.DS])</title>
5893 <link>http://fr.arxiv.org/abs/2010.15770</link>
5894 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Karger_D/0/1/0/all/0/1">David R. Karger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Williamson_D/0/1/0/all/0/1">David P. Williamson</a></p>
5895
5896 <p>In this note, we revisit the recursive random contraction algorithm of Karger
5897 and Stein for finding a minimum cut in a graph. Our revisit is occasioned by a
5898 paper of Fox, Panigrahi, and Zhang which gives an extension of the Karger-Stein
5899 algorithm to minimum cuts and minimum $k$-cuts in hypergraphs. When specialized
5900 to the case of graphs, the algorithm is somewhat different than the original
5901 Karger-Stein algorithm. We show that the analysis becomes particularly clean in
5902 this case: we can prove that the probability that a fixed minimum cut in an $n$
5903 node graph is returned by the algorithm is bounded below by $1/(2H_n-2)$, where
5904 $H_n$ is the $n$th harmonic number. We also consider other similar variants of
5905 the algorithm, and show that no such algorithm can achieve an asymptotically
5906 better probability of finding a fixed minimum cut.
5907 </p>
5908 </description>
5909 </item>
5910 <item>
5911 <title>GANs & Reels: Creating Irish Music using a Generative Adversarial Network. (arXiv:2010.15772v1 [cs.SD])</title>
5912 <link>http://fr.arxiv.org/abs/2010.15772</link>
5913 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kolokolova_A/0/1/0/all/0/1">Antonina Kolokolova</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Billard_M/0/1/0/all/0/1">Mitchell Billard</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bishop_R/0/1/0/all/0/1">Robert Bishop</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Elsisy_M/0/1/0/all/0/1">Moustafa Elsisy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Northcott_Z/0/1/0/all/0/1">Zachary Northcott</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Graves_L/0/1/0/all/0/1">Laura Graves</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nagisetty_V/0/1/0/all/0/1">Vineel Nagisetty</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Patey_H/0/1/0/all/0/1">Heather Patey</a></p>
5914
5915 <p>In this paper we present a method for algorithmic melody generation using a
5916 generative adversarial network without recurrent components. Music generation
5917 has been successfully done using recurrent neural networks, where the model
5918 learns sequence information that can help create authentic sounding melodies.
5919 Here, we use DC-GAN architecture with dilated convolutions and towers to
5920 capture sequential information as spatial image information, and learn
5921 long-range dependencies in fixed-length melody forms such as Irish traditional
5922 reel.
5923 </p>
5924 </description>
5925 </item>
5926 <item>
5927 <title>WaveTransform: Crafting Adversarial Examples via Input Decomposition. (arXiv:2010.15773v1 [cs.CV])</title>
5928 <link>http://fr.arxiv.org/abs/2010.15773</link>
5929 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Anshumaan_D/0/1/0/all/0/1">Divyam Anshumaan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Agarwal_A/0/1/0/all/0/1">Akshay Agarwal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vatsa_M/0/1/0/all/0/1">Mayank Vatsa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_R/0/1/0/all/0/1">Richa Singh</a></p>
5930
5931 <p>Frequency spectrum has played a significant role in learning unique and
5932 discriminating features for object recognition. Both low and high frequency
5933 information present in images have been extracted and learnt by a host of
5934 representation learning techniques, including deep learning. Inspired by this
5935 observation, we introduce a novel class of adversarial attacks, namely
5936 `WaveTransform', that creates adversarial noise corresponding to low-frequency
5937 and high-frequency subbands, separately (or in combination). The frequency
5938 subbands are analyzed using wavelet decomposition; the subbands are corrupted
5939 and then used to construct an adversarial example. Experiments are performed
5940 using multiple databases and CNN models to establish the effectiveness of the
5941 proposed WaveTransform attack and analyze the importance of a particular
5942 frequency component. The robustness of the proposed attack is also evaluated
5943 through its transferability and resiliency against a recent adversarial defense
5944 algorithm. Experiments show that the proposed attack is effective against the
5945 defense algorithm and is also transferable across CNNs.
5946 </p>
5947 </description>
5948 </item>
5949 <item>
5950 <title>Understanding the Failure Modes of Out-of-Distribution Generalization. (arXiv:2010.15775v1 [cs.LG])</title>
5951 <link>http://fr.arxiv.org/abs/2010.15775</link>
5952 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nagarajan_V/0/1/0/all/0/1">Vaishnavh Nagarajan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Andreassen_A/0/1/0/all/0/1">Anders Andreassen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Neyshabur_B/0/1/0/all/0/1">Behnam Neyshabur</a></p>
5953
5954 <p>Empirical studies suggest that machine learning models often rely on
5955 features, such as the background, that may be spuriously correlated with the
5956 label only during training time, resulting in poor accuracy during test-time.
5957 In this work, we identify the fundamental factors that give rise to this
5958 behavior, by explaining why models fail this way {\em even} in easy-to-learn
5959 tasks where one would expect these models to succeed. In particular, through a
5960 theoretical study of gradient-descent-trained linear classifiers on some
5961 easy-to-learn tasks, we uncover two complementary failure modes. These modes
5962 arise from how spurious correlations induce two kinds of skews in the data: one
5963 geometric in nature, and another, statistical in nature. Finally, we construct
5964 natural modifications of image classification datasets to understand when these
5965 failure modes can arise in practice. We also design experiments to isolate the
5966 two failure modes when training modern neural networks on these datasets.
5967 </p>
5968 </description>
5969 </item>
5970 <item>
5971 <title>Quantum advantage for differential equation analysis. (arXiv:2010.15776v1 [quant-ph])</title>
5972 <link>http://fr.arxiv.org/abs/2010.15776</link>
5973 <description><p>Authors: <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Kiani_B/0/1/0/all/0/1">Bobak T. Kiani</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Palma_G/0/1/0/all/0/1">Giacomo De Palma</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Englund_D/0/1/0/all/0/1">Dirk Englund</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Kaminsky_W/0/1/0/all/0/1">William Kaminsky</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Marvian_M/0/1/0/all/0/1">Milad Marvian</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Lloyd_S/0/1/0/all/0/1">Seth Lloyd</a></p>
5974
5975 <p>Quantum algorithms for both differential equation solving and for machine
5976 learning potentially offer an exponential speedup over all known classical
5977 algorithms. However, there also exist obstacles to obtaining this potential
5978 speedup in useful problem instances. The essential obstacle for quantum
5979 differential equation solving is that outputting useful information may require
5980 difficult post-processing, and the essential obstacle for quantum machine
5981 learning is that inputting the training set is a difficult task just by itself.
5982 In this paper, we demonstrate, when combined, these difficulties solve one
5983 another. We show how the output of quantum differential equation solving can
5984 serve as the input for quantum machine learning, allowing dynamical analysis in
5985 terms of principal components, power spectra, and wavelet decompositions. To
5986 illustrate this, we consider continuous time Markov processes on
5987 epidemiological and social networks. These quantum algorithms provide an
5988 exponential advantage over existing classical Monte Carlo methods.
5989 </p>
5990 </description>
5991 </item>
5992 <item>
5993 <title>Contextual BERT: Conditioning the Language Model Using a Global State. (arXiv:2010.15778v1 [cs.CL])</title>
5994 <link>http://fr.arxiv.org/abs/2010.15778</link>
5995 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Denk_T/0/1/0/all/0/1">Timo I. Denk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramallo_A/0/1/0/all/0/1">Ana Peleteiro Ramallo</a></p>
5996
5997 <p>BERT is a popular language model whose main pre-training task is to fill in
5998 the blank, i.e., predicting a word that was masked out of a sentence, based on
5999 the remaining words. In some applications, however, having an additional
6000 context can help the model make the right prediction, e.g., by taking the
6001 domain or the time of writing into account. This motivates us to advance the
6002 BERT architecture by adding a global state for conditioning on a fixed-sized
6003 context. We present our two novel approaches and apply them to an industry
6004 use-case, where we complete fashion outfits with missing articles, conditioned
6005 on a specific customer. An experimental comparison to other methods from the
6006 literature shows that our methods improve personalization significantly.
6007 </p>
6008 </description>
6009 </item>
6010 <item>
6011 <title>Stable and efficient Petrov-Galerkin methods for a kinetic Fokker-Planck equation. (arXiv:2010.15784v1 [math.NA])</title>
6012 <link>http://fr.arxiv.org/abs/2010.15784</link>
6013 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Brunken_J/0/1/0/all/0/1">Julia Brunken</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Smetana_K/0/1/0/all/0/1">Kathrin Smetana</a></p>
6014
6015 <p>We propose a stable Petrov-Galerkin discretization of a kinetic Fokker-Planck
6016 equation constructed in such a way that uniform inf-sup stability can be
6017 inferred directly from the variational formulation. Inspired by well-posedness
6018 results for parabolic equations, we derive a lower bound for the dual inf-sup
6019 constant of the Fokker-Planck bilinear form by means of stable pairs of trial
6020 and test functions. The trial function of such a pair is constructed by
6021 applying the kinetic transport operator and the inverse velocity
6022 Laplace-Beltrami operator to a given test function. For the Petrov-Galerkin
6023 projection we choose an arbitrary discrete test space and then define the
6024 discrete trial space using the same application of transport and inverse
6025 Laplace-Beltrami operator. As a result, the spaces replicate the stable pairs
6026 of the continuous level and we obtain a well-posed numerical method with a
6027 discrete inf-sup constant identical to the inf-sup constant of the continuous
6028 problem independently of the mesh size. We show how the specific basis
6029 functions can be efficiently computed by low-dimensional elliptic problems, and
6030 confirm the practicability and performance of the method for a numerical
6031 example.
6032 </p>
6033 </description>
6034 </item>
6035 <item>
6036 <title>Quickest detection of false data injection attack in remote state estimation. (arXiv:2010.15785v1 [eess.SY])</title>
6037 <link>http://fr.arxiv.org/abs/2010.15785</link>
6038 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Gupta_A/0/1/0/all/0/1">Akanshu Gupta</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sikdar_A/0/1/0/all/0/1">Abhinava Sikdar</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chattopadhyay_A/0/1/0/all/0/1">Arpan Chattopadhyay</a></p>
6039
6040 <p>In this paper, quickest detection of false data injection attack on remote
6041 state estimation is considered. A set of $N$ sensors make noisy linear
6042 observations of a discrete-time linear process with Gaussian noise, and report
6043 the observations to a remote estimator. The challenge is the presence of a few
6044 potentially malicious sensors which can start strategically manipulating their
6045 observations at a random time in order to skew the estimates. The quickest
6046 attack detection problem for a known linear attack scheme is posed as a
6047 constrained Markov decision process in order to minimise the expected detection
6048 delay subject to a false alarm constraint, with the state involving the
6049 probability belief at the estimator that the system is under attack. State
6050 transition probabilities are derived in terms of system parameters, and the
6051 structure of the optimal policy is derived analytically. It turns out that the
6052 optimal policy amounts to checking whether the probability belief exceeds a
6053 threshold. Numerical results demonstrate significant performance gain under the
6054 proposed algorithm against competing algorithms.
6055 </p>
6056 </description>
6057 </item>
6058 <item>
6059 <title>Light-Weight DDoS Mitigation at Network Edge with Limited Resources. (arXiv:2010.15786v1 [cs.NI])</title>
6060 <link>http://fr.arxiv.org/abs/2010.15786</link>
6061 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yaegashi_R/0/1/0/all/0/1">Ryo Yaegashi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hisano_D/0/1/0/all/0/1">Daisuke Hisano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nakayama_Y/0/1/0/all/0/1">Yu Nakayama</a></p>
6062
6063 <p>The Internet of Things (IoT) has been growing rapidly in recent years. With
6064 the appearance of 5G, it is expected to become even more indispensable to
6065 people's lives. In accordance with the increase of Distributed
6066 Denial-of-Service (DDoS) attacks from IoT devices, DDoS defense has become a
6067 hot research topic. DDoS detection mechanisms executed on routers and SDN
6068 environments have been intensely studied. However, these methods have the
6069 disadvantage of requiring the cost and performance of the devices. In addition,
6070 there is no existing DDoS mitigation algorithm on the network edge that can be
6071 performed with the low-cost and low performance equipments. Therefore, this
6072 paper proposes a light-weight DDoS mitigation scheme at the network edge using
6073 limited resources of inexpensive devices such as home gateways. The goal of the
6074 proposed scheme is to simply detect and mitigate flooding attacks. It utilizes
6075 unused queue resources to detect malicious flows by random shuffling of queue
6076 allocation and discard the packets of the detected flows. The performance of
6077 the proposed scheme was confirmed via theoretical analysis and computer
6078 simulation. The simulation results match the theoretical results and the
6079 proposed algorithm can efficiently detect malicious flows using limited
6080 resources.
6081 </p>
6082 </description>
6083 </item>
6084 <item>
6085 <title>A Framework for Learning Predator-prey Agents from Simulation to Real World. (arXiv:2010.15792v1 [cs.RO])</title>
6086 <link>http://fr.arxiv.org/abs/2010.15792</link>
6087 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_J/0/1/0/all/0/1">Jiunhan Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gao_Z/0/1/0/all/0/1">Zhenyu Gao</a></p>
6088
6089 <p>In this paper, we propose an evolutionary predatorprey robot system which can
6090 be generally implemented from simulation to the real world. We design the
6091 closed-loop robot system with camera and infrared sensors as inputs of
6092 controller. Both the predators and prey are co-evolved by NeuroEvolution of
6093 Augmenting Topologies (NEAT) to learn the expected behaviours. We design a
6094 framework that integrate Gym of OpenAI, Robot Operating System (ROS), Gazebo.
6095 In such a framework, users only need to focus on algorithms without being
6096 worried about the detail of manipulating robots in both simulation and the real
6097 world. Combining simulations, real-world evolution, and robustness analysis, it
6098 can be applied to develop the solutions for the predator-prey tasks. For the
6099 convenience of users, the source code and videos of the simulated and real
6100 world are published on Github.
6101 </p>
6102 </description>
6103 </item>
6104 <item>
6105 <title>A computational periporomechanics model for localized failure in unsaturated porous media. (arXiv:2010.15793v1 [math.NA])</title>
6106 <link>http://fr.arxiv.org/abs/2010.15793</link>
6107 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Menon_S/0/1/0/all/0/1">Shashank Menon</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Song_X/0/1/0/all/0/1">Xiaoyu Song</a></p>
6108
6109 <p>We implement a computational periporomechanics model for simulating localized
6110 failure in unsaturated porous media. The coupled periporomechanics model is
6111 based on the peridynamic state concept and the effective force state concept.
6112 The coupled governing equations are integral-differential equations without
6113 assuming the continuity of solid displacement and fluid pressures. The fluid
6114 flow and effective force states are determined by nonlocal fluid pressure and
6115 deformation gradients through the recently formulated multiphase constitutive
6116 correspondence principle. The coupled peri-poromechanics is implemented
6117 numerically for high-performance computing by an implicit multiphase meshfree
6118 method utilizing the message passing interface. The numerical implementation is
6119 validated by simulating classical poromechanics problems and comparing the
6120 numerical results with analytical solutions and experimental data. Numerical
6121 examples are presented to demonstrate the robustness of the fully coupled
6122 peri-poromechanics in modeling localized failures in unsaturated porous media.
6123 </p>
6124 </description>
6125 </item>
6126 <item>
6127 <title>Eccentricity queries and beyond using Hub Labels. (arXiv:2010.15794v1 [cs.DS])</title>
6128 <link>http://fr.arxiv.org/abs/2010.15794</link>
6129 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ducoffe_G/0/1/0/all/0/1">Guillaume Ducoffe</a></p>
6130
6131 <p>Hub labeling schemes are popular methods for computing distances on road
6132 networks and other large complex networks, often answering to a query within a
6133 few microseconds for graphs with millions of edges. In this work, we study
6134 their algorithmic applications beyond distance queries. We focus on
6135 eccentricity queries and distance-sum queries, for several versions of these
6136 problems on directed weighted graphs, that is in part motivated by their
6137 importance in facility location problems. On the negative side, we show
6138 conditional lower bounds for these above problems on unweighted undirected
6139 sparse graphs, via standard constructions from "Fine-grained" complexity.
6140 However, things take a different turn when the hub labels have a sublogarithmic
6141 size. Indeed, given a hub labeling of maximum label size $\leq k$, after
6142 pre-processing the labels in total $2^{{O}(k)} \cdot |V|^{1+o(1)}$ time, we can
6143 compute both the eccentricity and the distance-sum of any vertex in $2^{{O}(k)}
6144 \cdot |V|^{o(1)}$ time. It can also be applied to the fast global computation
6145 of some topological indices. Finally, as a by-product of our approach, on any
6146 fixed class of unweighted graphs with bounded expansion, we can decide whether
6147 the diameter of an $n$-vertex graph in the class is at most $k$ in $f(k) \cdot
6148 n^{1+o(1)}$ time, for some "explicit" function $f$.
6149 </p>
6150 </description>
6151 </item>
6152 <item>
6153 <title>Ray-marching Thurston geometries. (arXiv:2010.15801v1 [math.GT])</title>
6154 <link>http://fr.arxiv.org/abs/2010.15801</link>
6155 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Coulon_R/0/1/0/all/0/1">R&#xe9;mi Coulon</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Matsumoto_E/0/1/0/all/0/1">Elisabetta A. Matsumoto</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Segerman_H/0/1/0/all/0/1">Henry Segerman</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Trettel_S/0/1/0/all/0/1">Steve J. Trettel</a></p>
6156
6157 <p>We describe algorithms that produce accurate real-time interactive in-space
6158 views of the eight Thurston geometries using ray-marching. We give a
6159 theoretical framework for our algorithms, independent of the geometry involved.
6160 In addition to scenes within a geometry $X$, we also consider scenes within
6161 quotient manifolds and orbifolds $X / \Gamma$. We adapt the Phong lighting
6162 model to non-euclidean geometries. The most difficult part of this is the
6163 calculation of light intensity, which relates to the area density of geodesic
6164 spheres. We also give extensive practical details for each geometry.
6165 </p>
6166 </description>
6167 </item>
6168 <item>
6169 <title>Isometric embeddings in trees and their use in the diameter problem. (arXiv:2010.15803v1 [cs.DS])</title>
6170 <link>http://fr.arxiv.org/abs/2010.15803</link>
6171 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ducoffe_G/0/1/0/all/0/1">Guillaume Ducoffe</a></p>
6172
6173 <p>We prove that given a discrete space with $n$ points which is either embedded
6174 in a system of $k$ trees, or the Cartesian product of $k$ trees, we can compute
6175 all eccentricities in ${\cal O}(2^{{\cal O}(k\log{k})}(N+n)^{1+o(1)})$ time,
6176 where $N$ is the cumulative total order over all these $k$ trees. This is near
6177 optimal under the Strong Exponential-Time Hypothesis, even in the very special
6178 case of an $n$-vertex graph embedded in a system of $\omega(\log{n})$ spanning
6179 trees. However, given such an embedding in the strong product of $k$ trees,
6180 there is a much faster ${\cal O}(N + kn)$-time algorithm for this problem. All
6181 our positive results can be turned into approximation algorithms for the graphs
6182 and finite spaces with a quasi isometric embedding in trees, if such embedding
6183 is given as input, where the approximation factor (resp., the approximation
6184 constant) depends on the distortion of the embedding (resp., of its stretch).
6185 The existence of embeddings in the Cartesian product of finitely many trees has
6186 been thoroughly investigated for cube-free median graphs. We give the
6187 first-known quasi linear-time algorithm for computing the diameter within this
6188 graph class. It does not require an embedding in a product of trees to be given
6189 as part of the input. On our way, being given an $n$-node tree $T$, we propose
6190 a data structure with ${\cal O}(n\log{n})$ pre-processing time in order to
6191 compute in ${\cal O}(k\log^2{n})$ time the eccentricity of any subset of $k$
6192 nodes. We combine the latter technical contribution, of independent interest,
6193 with a recent distance-labeling scheme that was designed for cube-free median
6194 graphs.
6195 </p>
6196 </description>
6197 </item>
6198 <item>
6199 <title>A Local Search Framework for Experimental Design. (arXiv:2010.15805v1 [cs.DS])</title>
6200 <link>http://fr.arxiv.org/abs/2010.15805</link>
6201 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lau_L/0/1/0/all/0/1">Lap Chi Lau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_H/0/1/0/all/0/1">Hong Zhou</a></p>
6202
6203 <p>We present a local search framework to design and analyze both combinatorial
6204 algorithms and rounding algorithms for experimental design problems. This
6205 framework provides a unifying approach to match and improve all known results
6206 in D/A/E-design and to obtain new results in previously unknown settings.
6207 </p>
6208 <p>For combinatorial algorithms, we provide a new analysis of the classical
6209 Fedorov's exchange method. We prove that this simple local search algorithm
6210 works well as long as there exists an almost optimal solution with good
6211 condition number. Moreover, we design a new combinatorial local search
6212 algorithm for E-design using the regret minimization framework.
6213 </p>
6214 <p>For rounding algorithms, we provide a unified randomized exchange algorithm
6215 to match and improve previous results for D/A/E-design. Furthermore, the
6216 algorithm works in the more general setting to approximately satisfy multiple
6217 knapsack constraints, which can be used for weighted experimental design and
6218 for incorporating fairness constraints into experimental design.
6219 </p>
6220 </description>
6221 </item>
6222 <item>
6223 <title>The ins and outs of speaker recognition: lessons from VoxSRC 2020. (arXiv:2010.15809v1 [cs.SD])</title>
6224 <link>http://fr.arxiv.org/abs/2010.15809</link>
6225 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kwon_Y/0/1/0/all/0/1">Yoohwan Kwon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heo_H/0/1/0/all/0/1">Hee-Soo Heo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_B/0/1/0/all/0/1">Bong-Jin Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chung_J/0/1/0/all/0/1">Joon Son Chung</a></p>
6226
6227 <p>The VoxCeleb Speaker Recognition Challenge (VoxSRC) at Interspeech 2020
6228 offers a challenging evaluation for speaker recognition systems, which includes
6229 celebrities playing different parts in movies. The goal of this work is robust
6230 speaker recognition of utterances recorded in these challenging environments.
6231 We utilise variants of the popular ResNet architecture for speaker recognition
6232 and perform extensive experiments using a range of loss functions and training
6233 parameters. To this end, we optimise an efficient training framework that
6234 allows powerful models to be trained with limited time and resources. Our
6235 trained models demonstrate improvements over most existing works with lighter
6236 models and a simple pipeline. The paper shares the lessons learned from our
6237 participation in the challenge.
6238 </p>
6239 </description>
6240 </item>
6241 <item>
6242 <title>Algorithmic pure states for the negative spherical perceptron. (arXiv:2010.15811v1 [math.PR])</title>
6243 <link>http://fr.arxiv.org/abs/2010.15811</link>
6244 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Alaoui_A/0/1/0/all/0/1">Ahmed El Alaoui</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Sellke_M/0/1/0/all/0/1">Mark Sellke</a></p>
6245
6246 <p>We consider the spherical perceptron with Gaussian disorder. This is the set
6247 $S$ of points $\sigma \in \mathbb{R}^N$ on the sphere of radius $\sqrt{N}$
6248 satisfying $\langle g_a , \sigma \rangle \ge \kappa\sqrt{N}\,$ for all $1 \le a
6249 \le M$, where $(g_a)_{a=1}^M$ are independent standard gaussian vectors and
6250 $\kappa \in \mathbb{R}$ is fixed. Various characteristics of $S$ such as its
6251 surface measure and the largest $M$ for which it is non-empty, were computed
6252 heuristically in statistical physics in the asymptotic regime $N \to \infty$,
6253 $M/N \to \alpha$. The case $\kappa&lt;0$ is of special interest as $S$ is
6254 conjectured to exhibit a hierarchical tree-like geometry known as "full
6255 replica-symmetry breaking" (FRSB) close to the satisfiability threshold
6256 $\alpha_{\text{SAT}}(\kappa)$, and whose characteristics are captured by a
6257 Parisi variational principle akin to the one appearing in the
6258 Sherrington-Kirkpatrick model. In this paper we design an efficient algorithm
6259 which, given oracle access to the solution of the Parisi variational principle,
6260 exploits this conjectured FRSB structure for $\kappa&lt;0$ and outputs a vector
6261 $\hat{\sigma}$ satisfying $\langle g_a , \hat{\sigma}\rangle \ge \kappa
6262 \sqrt{N}$ for all $1\le a \le M$ and lying on a sphere of non-trivial radius
6263 $\sqrt{\bar{q} N}$, where $\bar{q} \in (0,1)$ is the right-end of the support
6264 of the associated Parisi measure. We expect $\hat{\sigma}$ to be approximately
6265 the barycenter of a pure state of the spherical perceptron. Moreover we expect
6266 that $\bar{q} \to 1$ as $\alpha \to \alpha_{\text{SAT}}(\kappa)$, so that
6267 $\big\langle g_a,\hat{\sigma}/|\hat{\sigma}|\big\rangle \geq
6268 (\kappa-o(1))\sqrt{N}$ near criticality.
6269 </p>
6270 </description>
6271 </item>
6272 <item>
6273 <title>Around the diameter of AT-free graphs. (arXiv:2010.15814v1 [cs.DS])</title>
6274 <link>http://fr.arxiv.org/abs/2010.15814</link>
6275 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ducoffe_G/0/1/0/all/0/1">Guillaume Ducoffe</a></p>
6276
6277 <p>A graph algorithm is truly subquadratic if it runs in ${\cal O}(m^b)$ time on
6278 connected $m$-edge graphs, for some positive $b &lt; 2$. Roditty and Vassilevska
6279 Williams (STOC'13) proved that under plausible complexity assumptions, there is
6280 no truly subquadratic algorithm for computing the diameter of general graphs.
6281 In this work, we present positive and negative results on the existence of such
6282 algorithms for computing the diameter on some special graph classes.
6283 Specifically, three vertices in a graph form an asteroidal triple (AT) if
6284 between any two of them there exists a path that avoids the closed
6285 neighbourhood of the third one. We call a graph AT-free if it does not contain
6286 an AT. We first prove that for all $m$-edge AT-free graphs, one can compute all
6287 the eccentricities in truly subquadratic ${\cal O}(m^{3/2})$ time. Then, we
6288 extend our study to several subclasses of chordal graphs -- all of them
6289 generalizing interval graphs in various ways --, as an attempt to understand
6290 which of the properties of AT-free graphs, or natural generalizations of the
6291 latter, can help in the design of fast algorithms for the diameter problem on
6292 broader graph classes. For instance, for all chordal graphs with a dominating
6293 shortest path, there is a linear-time algorithm for computing a diametral pair
6294 if the diameter is at least four. However, already for split graphs with a
6295 dominating edge, under plausible complexity assumptions, there is no truly
6296 subquadratic algorithm for deciding whether the diameter is either $2$ or $3$.
6297 </p>
6298 </description>
6299 </item>
6300 <item>
6301 <title>Tensor Completion via Tensor Networks with a Tucker Wrapper. (arXiv:2010.15819v1 [stat.ML])</title>
6302 <link>http://fr.arxiv.org/abs/2010.15819</link>
6303 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Cai_Y/0/1/0/all/0/1">Yunfeng Cai</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Li_P/0/1/0/all/0/1">Ping Li</a></p>
6304
6305 <p>In recent years, low-rank tensor completion (LRTC) has received considerable
6306 attention due to its applications in image/video inpainting, hyperspectral data
6307 recovery, etc. With different notions of tensor rank (e.g., CP, Tucker, tensor
6308 train/ring, etc.), various optimization based numerical methods are proposed to
6309 LRTC. However, tensor network based methods have not been proposed yet. In this
6310 paper, we propose to solve LRTC via tensor networks with a Tucker wrapper. Here
6311 by "Tucker wrapper" we mean that the outermost factor matrices of the tensor
6312 network are all orthonormal. We formulate LRTC as a problem of solving a system
6313 of nonlinear equations, rather than a constrained optimization problem. A
6314 two-level alternative least square method is then employed to update the
6315 unknown factors. The computation of the method is dominated by tensor matrix
6316 multiplications and can be efficiently performed. Also, under proper
6317 assumptions, it is shown that with high probability, the method converges to
6318 the exact solution at a linear rate. Numerical simulations show that the
6319 proposed algorithm is comparable with state-of-the-art methods.
6320 </p>
6321 </description>
6322 </item>
6323 <item>
6324 <title>Down the bot hole: actionable insights from a 1-year analysis of bots activity on Twitter. (arXiv:2010.15820v1 [cs.SI])</title>
6325 <link>http://fr.arxiv.org/abs/2010.15820</link>
6326 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Luceri_L/0/1/0/all/0/1">Luca Luceri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cardoso_F/0/1/0/all/0/1">Felipe Cardoso</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Giordano_S/0/1/0/all/0/1">Silvia Giordano</a></p>
6327
6328 <p>Nowadays, social media represent persuasive tools that have been
6329 progressively weaponized to affect people's beliefs, spread manipulative
6330 narratives, and sow conflicts along divergent factions. Software-controlled
6331 accounts (i.e., bots) are one of the main actors associated with manipulation
6332 campaigns, especially in the political context. Uncovering the strategies
6333 behind bots' activities is of paramount importance to detect and curb such
6334 campaigns. In this paper, we present a long term (one year) analysis of bots
6335 activity on Twitter in the run-up to the 2018 U.S. Midterm Elections. We
6336 identify different classes of accounts based on their nature (bot vs. human)
6337 and engagement within the online discussion and we observe that hyperactive
6338 bots played a pivotal role in the dissemination of conspiratorial narratives,
6339 while dominating the political debate since the year before the election. Our
6340 analysis, on the horizon of the upcoming U.S. 2020 Presidential Election,
6341 reveals both alarming findings of humans' susceptibility to bots and actionable
6342 insights that can contribute to curbing coordinated campaigns.
6343 </p>
6344 </description>
6345 </item>
6346 <item>
6347 <title>Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search. (arXiv:2010.15821v1 [cs.CV])</title>
6348 <link>http://fr.arxiv.org/abs/2010.15821</link>
6349 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_H/0/1/0/all/0/1">Houwen Peng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Du_H/0/1/0/all/0/1">Hao Du</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_H/0/1/0/all/0/1">Hongyuan Yu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qi Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liao_J/0/1/0/all/0/1">Jing Liao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_J/0/1/0/all/0/1">Jianlong Fu</a></p>
6350
6351 <p>One-shot weight sharing methods have recently drawn great attention in neural
6352 architecture search due to high efficiency and competitive performance.
6353 However, weight sharing across models has an inherent deficiency, i.e.,
6354 insufficient training of subnetworks in the hypernetwork. To alleviate this
6355 problem, we present a simple yet effective architecture distillation method.
6356 The central idea is that subnetworks can learn collaboratively and teach each
6357 other throughout the training process, aiming to boost the convergence of
6358 individual models. We introduce the concept of prioritized path, which refers
6359 to the architecture candidates exhibiting superior performance during training.
6360 Distilling knowledge from the prioritized paths is able to boost the training
6361 of subnetworks. Since the prioritized paths are changed on the fly depending on
6362 their performance and complexity, the final obtained paths are the cream of the
6363 crop. We directly select the most promising one from the prioritized paths as
6364 the final architecture, without using other complex search methods, such as
6365 reinforcement learning or evolution algorithms. The experiments on ImageNet
6366 verify such path distillation method can improve the convergence ratio and
6367 performance of the hypernetwork, as well as boosting the training of
6368 subnetworks. The discovered architectures achieve superior performance compared
6369 to the recent MobileNetV3 and EfficientNet families under aligned settings.
6370 Moreover, the experiments on object detection and more challenging search space
6371 show the generality and robustness of the proposed method. Code and models are
6372 available at https://github.com/microsoft/cream.git.
6373 </p>
6374 </description>
6375 </item>
6376 <item>
6377 <title>Black-Box Optimization of Object Detector Scales. (arXiv:2010.15823v1 [cs.CV])</title>
6378 <link>http://fr.arxiv.org/abs/2010.15823</link>
6379 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Muthuraja_M/0/1/0/all/0/1">Mohandass Muthuraja</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Arriaga_O/0/1/0/all/0/1">Octavio Arriaga</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ploger_P/0/1/0/all/0/1">Paul Pl&#xf6;ger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kirchner_F/0/1/0/all/0/1">Frank Kirchner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Valdenegro_Toro_M/0/1/0/all/0/1">Matias Valdenegro-Toro</a></p>
6380
6381 <p>Object detectors have improved considerably in the last years by using
6382 advanced CNN architectures. However, many detector hyper-parameters are
6383 generally manually tuned, or they are used with values set by the detector
6384 authors. Automatic Hyper-parameter optimization has not been explored in
6385 improving CNN-based object detectors hyper-parameters. In this work, we propose
6386 the use of Black-box optimization methods to tune the prior/default box scales
6387 in Faster R-CNN and SSD, using Bayesian Optimization, SMAC, and CMA-ES. We show
6388 that by tuning the input image size and prior box anchor scale on Faster R-CNN
6389 mAP increases by 2% on PASCAL VOC 2007, and by 3% with SSD. On the COCO dataset
6390 with SSD there are mAP improvement in the medium and large objects, but mAP
6391 decreases by 1% in small objects. We also perform a regression analysis to find
6392 the significant hyper-parameters to tune.
6393 </p>
6394 </description>
6395 </item>
6396 <item>
6397 <title>Passport-aware Normalization for Deep Model Protection. (arXiv:2010.15824v1 [cs.CV])</title>
6398 <link>http://fr.arxiv.org/abs/2010.15824</link>
6399 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jie Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_D/0/1/0/all/0/1">Dongdong Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liao_J/0/1/0/all/0/1">Jing Liao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_W/0/1/0/all/0/1">Weiming Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hua_G/0/1/0/all/0/1">Gang Hua</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_N/0/1/0/all/0/1">Nenghai Yu</a></p>
6400
6401 <p>Despite tremendous success in many application scenarios, deep learning faces
6402 serious intellectual property (IP) infringement threats. Considering the cost
6403 of designing and training a good model, infringements will significantly
6404 infringe the interests of the original model owner. Recently, many impressive
6405 works have emerged for deep model IP protection. However, they either are
6406 vulnerable to ambiguity attacks, or require changes in the target network
6407 structure by replacing its original normalization layers and hence cause
6408 significant performance drops. To this end, we propose a new passport-aware
6409 normalization formulation, which is generally applicable to most existing
6410 normalization layers and only needs to add another passport-aware branch for IP
6411 protection. This new branch is jointly trained with the target model but
6412 discarded in the inference stage. Therefore it causes no structure change in
6413 the target model. Only when the model IP is suspected to be stolen by someone,
6414 the private passport-aware branch is added back for ownership verification.
6415 Through extensive experiments, we verify its effectiveness in both image and 3D
6416 point recognition models. It is demonstrated to be robust not only to common
6417 attack techniques like fine-tuning and model compression, but also to ambiguity
6418 attacks. By further combining it with trigger-set based methods, both black-box
6419 and white-box verification can be achieved for enhanced security of deep
6420 learning models deployed in real systems. Code can be found at
6421 https://github.com/ZJZAC/Passport-aware-Normalization.
6422 </p>
6423 </description>
6424 </item>
6425 <item>
6426 <title>RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder. (arXiv:2010.15831v1 [cs.CV])</title>
6427 <link>http://fr.arxiv.org/abs/2010.15831</link>
6428 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chi_C/0/1/0/all/0/1">Cheng Chi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_F/0/1/0/all/0/1">Fangyun Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_H/0/1/0/all/0/1">Han Hu</a></p>
6429
6430 <p>Existing object detection frameworks are usually built on a single format of
6431 object/part representation, i.e., anchor/proposal rectangle boxes in RetinaNet
6432 and Faster R-CNN, center points in FCOS and RepPoints, and corner points in
6433 CornerNet. While these different representations usually drive the frameworks
6434 to perform well in different aspects, e.g., better classification or finer
6435 localization, it is in general difficult to combine these representations in a
6436 single framework to make good use of each strength, due to the heterogeneous or
6437 non-grid feature extraction by different representations. This paper presents
6438 an attention-based decoder module similar as that in
6439 Transformer~\cite{vaswani2017attention} to bridge other representations into a
6440 typical object detector built on a single representation format, in an
6441 end-to-end fashion. The other representations act as a set of \emph{key}
6442 instances to strengthen the main \emph{query} representation features in the
6443 vanilla detectors. Novel techniques are proposed towards efficient computation
6444 of the decoder module, including a \emph{key sampling} approach and a
6445 \emph{shared location embedding} approach. The proposed module is named
6446 \emph{bridging visual representations} (BVR). It can perform in-place and we
6447 demonstrate its broad effectiveness in bridging other representations into
6448 prevalent object detection frameworks, including RetinaNet, Faster R-CNN, FCOS
6449 and ATSS, where about $1.5\sim3.0$ AP improvements are achieved. In particular,
6450 we improve a state-of-the-art framework with a strong backbone by about $2.0$
6451 AP, reaching $52.7$ AP on COCO test-dev. The resulting network is named
6452 RelationNet++. The code will be available at
6453 https://github.com/microsoft/RelationNet2.
6454 </p>
6455 </description>
6456 </item>
6457 <item>
6458 <title>Proceedings 9th International Workshop on Theorem Proving Components for Educational Software. (arXiv:2010.15832v1 [cs.AI])</title>
6459 <link>http://fr.arxiv.org/abs/2010.15832</link>
6460 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Quaresma_P/0/1/0/all/0/1">Pedro Quaresma</a> (University of Coimbra, Portugal), <a href="http://fr.arxiv.org/find/cs/1/au:+Neuper_W/0/1/0/all/0/1">Walther Neuper</a> (JKU Johannes Kepler University, Linz, Austria), <a href="http://fr.arxiv.org/find/cs/1/au:+Marcos_J/0/1/0/all/0/1">Jo&#xe3;o Marcos</a> (UFRN, Brazil)</p>
6461
6462 <p>The 9th International Workshop on Theorem-Proving Components for Educational
6463 Software (ThEdu'20) was scheduled to happen on June 29 as a satellite of the
6464 IJCAR-FSCD 2020 joint meeting, in Paris. The COVID-19 pandemic came by
6465 surprise, though, and the main conference was virtualised. Fearing that an
6466 online meeting would not allow our community to fully reproduce the usual
6467 face-to-face networking opportunities of the ThEdu initiative, the Steering
6468 Committee of ThEdu decided to cancel our workshop. Given that many of us had
6469 already planned and worked for that moment, we decided that ThEdu'20 could
6470 still live in the form of an EPTCS volume. The EPTCS concurred with us,
6471 recognising this very singular situation, and accepted our proposal of
6472 organising a special issue with papers submitted to ThEdu'20. An open call for
6473 papers was then issued, and attracted five submissions, all of which have been
6474 accepted by our reviewers, who produced three careful reports on each of the
6475 contributions. The resulting revised papers are collected in the present
6476 volume. We, the volume editors, hope that this collection of papers will help
6477 further promoting the development of theorem-proving-based software, and that
6478 it will collaborate to improve the mutual understanding between computer
6479 mathematicians and stakeholders in education. With some luck, we would actually
6480 expect that the very special circumstances set up by the worst sanitary crisis
6481 in a century will happen to reinforce the need for the application of certified
6482 components and of verification methods for the production of educational
6483 software that would be available even when the traditional on-site learning
6484 experiences turn out not to be recommendable.
6485 </p>
6486 </description>
6487 </item>
6488 <item>
6489 <title>Property Checking Without Invariant Generation. (arXiv:1602.05829v3 [cs.LO] UPDATED)</title>
6490 <link>http://fr.arxiv.org/abs/1602.05829</link>
6491 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goldberg_E/0/1/0/all/0/1">Eugene Goldberg</a></p>
6492
6493 <p>We introduce a procedure for proving safety properties. This procedure is
6494 based on a technique called Partial Quantifier Elimination (PQE). In contrast
6495 to complete quantifier elimination, in PQE, only a part of the formula is taken
6496 out of the scope of quantifiers. So, PQE can be dramatically more efficient
6497 than complete quantifier elimination. The appeal of our procedure is twofold.
6498 First, it can prove a property without generating an inductive invariant.
6499 Second, it employs depth-first search and so can be used to find deep bugs.
6500 </p>
6501 </description>
6502 </item>
6503 <item>
6504 <title>Minimax Rate-Optimal Estimation of Divergences between Discrete Distributions. (arXiv:1605.09124v4 [cs.IT] UPDATED)</title>
6505 <link>http://fr.arxiv.org/abs/1605.09124</link>
6506 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Han_Y/0/1/0/all/0/1">Yanjun Han</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jiao_J/0/1/0/all/0/1">Jiantao Jiao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weissman_T/0/1/0/all/0/1">Tsachy Weissman</a></p>
6507
6508 <p>We study the minimax estimation of $\alpha$-divergences between discrete
6509 distributions for integer $\alpha\ge 1$, which include the Kullback--Leibler
6510 divergence and the $\chi^2$-divergences as special examples. Dropping the usual
6511 theoretical tricks to acquire independence, we construct the first minimax
6512 rate-optimal estimator which does not require any Poissonization, sample
6513 splitting, or explicit construction of approximating polynomials. The estimator
6514 uses a hybrid approach which solves a problem-independent linear program based
6515 on moment matching in the non-smooth regime, and applies a problem-dependent
6516 bias-corrected plug-in estimator in the smooth regime, with a soft decision
6517 boundary between these regimes.
6518 </p>
6519 </description>
6520 </item>
6521 <item>
6522 <title>Sequence Graph Transform (SGT): A Feature Embedding Function for Sequence Data Mining. (arXiv:1608.03533v13 [stat.ML] UPDATED)</title>
6523 <link>http://fr.arxiv.org/abs/1608.03533</link>
6524 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Ranjan_C/0/1/0/all/0/1">Chitta Ranjan</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Ebrahimi_S/0/1/0/all/0/1">Samaneh Ebrahimi</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Paynabar_K/0/1/0/all/0/1">Kamran Paynabar</a></p>
6525
6526 <p>Sequence feature embedding is a challenging task due to un-structuredness of
6527 sequences -- arbitrary strings of arbitrary length. Existing methods are
6528 efficient in extracting short-term dependencies but typically suffer from
6529 computation issues for the long-term. Sequence Graph Transform (SGT), a feature
6530 embedding function, that can extract any amount of short- to long- term
6531 dependencies without increasing the computation -- proved theoretically -- is
6532 proposed. SGT features yield significantly superior results in sequence
6533 clustering and classification with higher accuracy and lower computation as
6534 compared to the existing methods, including the state-of-the-art
6535 sequence/string Kernels and LSTM.
6536 </p>
6537 </description>
6538 </item>
6539 <item>
6540 <title>Time-Space Trade-Offs for Computing Euclidean Minimum Spanning Trees. (arXiv:1712.06431v3 [cs.CG] UPDATED)</title>
6541 <link>http://fr.arxiv.org/abs/1712.06431</link>
6542 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Banyassady_B/0/1/0/all/0/1">Bahareh Banyassady</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Barba_L/0/1/0/all/0/1">Luis Barba</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mulzer_W/0/1/0/all/0/1">Wolfgang Mulzer</a></p>
6543
6544 <p>We present time-space trade-offs for computing the Euclidean minimum spanning
6545 tree of a set $S$ of $n$ point-sites in the plane. More precisely, we assume
6546 that $S$ resides in a random-access memory that can only be read. The edges of
6547 the Euclidean minimum spanning tree $\text{EMST}(S)$ have to be reported
6548 sequentially, and they cannot be accessed or modified afterwards. There is a
6549 parameter $s \in \{1, \dots, n\}$ so that the algorithm may use $O(s)$ cells of
6550 read-write memory (called the workspace) for its computations. Our goal is to
6551 find an algorithm that has the best possible running time for any given $s$
6552 between $1$ and $n$.
6553 </p>
6554 <p>We show how to compute $\text{EMST}(S)$ in $O\big((n^3/s^2)\log s \big)$ time
6555 with $O(s)$ cells of workspace, giving a smooth trade-off between the two best
6556 known bounds $O(n^3)$ for $s = 1$ and $O(n \log n)$ for $s = n$. For this, we
6557 run Kruskal's algorithm on the relative neighborhood graph (RNG) of $S$. It is
6558 a classic fact that the minimum spanning tree of $\text{RNG}(S)$ is exactly
6559 $\text{EMST}(S)$. To implement Kruskal's algorithm with $O(s)$ cells of
6560 workspace, we define $s$-nets, a compact representation of planar graphs. This
6561 allows us to efficiently maintain and update the components of the current
6562 minimum spanning forest as the edges are being inserted.
6563 </p>
6564 </description>
6565 </item>
6566 <item>
6567 <title>Type-two polynomial-time and restricted lookahead. (arXiv:1801.07485v2 [cs.CC] UPDATED)</title>
6568 <link>http://fr.arxiv.org/abs/1801.07485</link>
6569 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kapron_B/0/1/0/all/0/1">Bruce M. Kapron</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Steinberg_F/0/1/0/all/0/1">Florian Steinberg</a></p>
6570
6571 <p>This paper provides an alternate characterization of type-two polynomial-time
6572 computability, with the goal of making second-order complexity theory more
6573 approachable. We rely on the usual oracle machines to model programs with
6574 subroutine calls. In contrast to previous results, the use of higher-order
6575 objects as running times is avoided, either explicitly or implicitly. Instead,
6576 regular polynomials are used. This is achieved by refining the notion of
6577 oracle-polynomial-time introduced by Cook. We impose a further restriction on
6578 the oracle interactions to force feasibility. Both the restriction as well as
6579 its purpose are very simple: it is well-known that Cook's model allows
6580 polynomial depth iteration of functional inputs with no restrictions on size,
6581 and thus does not guarantee that polynomial-time computability is preserved. To
6582 mend this we restrict the number of lookahead revisions, that is the number of
6583 times a query can be asked that is bigger than any of the previous queries. We
6584 prove that this leads to a class of feasible functionals and that all feasible
6585 problems can be solved within this class if one is allowed to separate a task
6586 into efficiently solvable subtasks. Formally put: the closure of our class
6587 under lambda-abstraction and application includes all feasible operations. We
6588 also revisit the very similar class of strongly polynomial-time computable
6589 operators previously introduced by Kawamura and Steinberg. We prove it to be
6590 strictly included in our class and, somewhat surprisingly, to have the same
6591 closure property. This can be attributed to properties of the limited recursion
6592 operator: It is not strongly polynomial-time computable but decomposes into two
6593 such operations and lies in our class.
6594 </p>
6595 </description>
6596 </item>
6597 <item>
6598 <title>Comparing Type Systems for Deadlock Freedom. (arXiv:1810.00635v3 [cs.LO] UPDATED)</title>
6599 <link>http://fr.arxiv.org/abs/1810.00635</link>
6600 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dardha_O/0/1/0/all/0/1">Ornela Dardha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Perez_J/0/1/0/all/0/1">Jorge A. P&#xe9;rez</a></p>
6601
6602 <p>Message-passing software systems exhibit non-trivial forms of concurrency and
6603 distribution; they are expected to follow intended protocols among
6604 communicating services, but also to never "get stuck". This intuitive
6605 requirement has been expressed by liveness properties such as progress or
6606 (dead)lock freedom and various type systems ensure these properties for
6607 concurrent processes. Unfortunately, very little is known about the precise
6608 relationship between these type systems and the classes of typed processes they
6609 induce.
6610 </p>
6611 <p>This paper puts forward the first comparative study of different type systems
6612 for message-passing processes that guarantee deadlock freedom. We compare two
6613 classes of deadlock-free typed processes, here denoted L and K. The class L
6614 stands out for its canonicity: it results from Curry-Howard interpretations of
6615 linear logic propositions as session types. The class K, obtained by encoding
6616 session types into Kobayashi's linear types with usages, includes processes not
6617 typable in other type systems. We show that L is strictly included in K, and
6618 identify the precise conditions under which they coincide. We also provide two
6619 type-preserving translations of processes in K into processes in L.
6620 </p>
6621 </description>
6622 </item>
6623 <item>
6624 <title>AADS: Augmented Autonomous Driving Simulation using Data-driven Algorithms. (arXiv:1901.07849v3 [cs.CV] UPDATED)</title>
6625 <link>http://fr.arxiv.org/abs/1901.07849</link>
6626 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_W/0/1/0/all/0/1">Wei Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pan_C/0/1/0/all/0/1">Chengwei Pan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_R/0/1/0/all/0/1">Rong Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_J/0/1/0/all/0/1">Jiaping Ren</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_Y/0/1/0/all/0/1">Yuexin Ma</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fang_J/0/1/0/all/0/1">Jin Fang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yan_F/0/1/0/all/0/1">Feilong Yan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Geng_Q/0/1/0/all/0/1">Qichuan Geng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_X/0/1/0/all/0/1">Xinyu Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gong_H/0/1/0/all/0/1">Huajun Gong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_W/0/1/0/all/0/1">Weiwei Xu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_G/0/1/0/all/0/1">Guoping Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Manocha_D/0/1/0/all/0/1">Dinesh Manocha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_R/0/1/0/all/0/1">Ruigang Yang</a></p>
6627
6628 <p>Simulation systems have become an essential component in the development and
6629 validation of autonomous driving technologies. The prevailing state-of-the-art
6630 approach for simulation is to use game engines or high-fidelity computer
6631 graphics (CG) models to create driving scenarios. However, creating CG models
6632 and vehicle movements (e.g., the assets for simulation) remains a manual task
6633 that can be costly and time-consuming. In addition, the fidelity of CG images
6634 still lacks the richness and authenticity of real-world images and using these
6635 images for training leads to degraded performance.
6636 </p>
6637 <p>In this paper we present a novel approach to address these issues: Augmented
6638 Autonomous Driving Simulation (AADS). Our formulation augments real-world
6639 pictures with a simulated traffic flow to create photo-realistic simulation
6640 images and renderings. More specifically, we use LiDAR and cameras to scan
6641 street scenes. From the acquired trajectory data, we generate highly plausible
6642 traffic flows for cars and pedestrians and compose them into the background.
6643 The composite images can be re-synthesized with different viewpoints and sensor
6644 models. The resulting images are photo-realistic, fully annotated, and ready
6645 for end-to-end training and testing of autonomous driving systems from
6646 perception to planning. We explain our system design and validate our
6647 algorithms with a number of autonomous driving tasks from detection to
6648 segmentation and predictions.
6649 </p>
6650 <p>Compared to traditional approaches, our method offers unmatched scalability
6651 and realism. Scalability is particularly important for AD simulation and we
6652 believe the complexity and diversity of the real world cannot be realistically
6653 captured in a virtual environment. Our augmented approach combines the
6654 flexibility in a virtual environment (e.g., vehicle movements) with the
6655 richness of the real world to allow effective simulation of anywhere on earth.
6656 </p>
6657 </description>
6658 </item>
6659 <item>
6660 <title>Mockingbird: Defending Against Deep-Learning-Based Website Fingerprinting Attacks with Adversarial Traces. (arXiv:1902.06626v5 [cs.CR] UPDATED)</title>
6661 <link>http://fr.arxiv.org/abs/1902.06626</link>
6662 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rahman_M/0/1/0/all/0/1">Mohammad Saidur Rahman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Imani_M/0/1/0/all/0/1">Mohsen Imani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mathews_N/0/1/0/all/0/1">Nate Mathews</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wright_M/0/1/0/all/0/1">Matthew Wright</a></p>
6663
6664 <p>Website Fingerprinting (WF) is a type of traffic analysis attack that enables
6665 a local passive eavesdropper to infer the victim's activity, even when the
6666 traffic is protected by a VPN or an anonymity system like Tor. Leveraging a
6667 deep-learning classifier, a WF attacker can gain over 98% accuracy on Tor
6668 traffic. In this paper, we explore a novel defense, Mockingbird, based on the
6669 idea of adversarial examples that have been shown to undermine machine-learning
6670 classifiers in other domains. Since the attacker gets to design and train his
6671 attack classifier based on the defense, we first demonstrate that at a
6672 straightforward technique for generating adversarial-example based traces fails
6673 to protect against an attacker using adversarial training for robust
6674 classification. We then propose Mockingbird, a technique for generating traces
6675 that resists adversarial training by moving randomly in the space of viable
6676 traces and not following more predictable gradients. The technique drops the
6677 accuracy of the state-of-the-art attack hardened with adversarial training from
6678 98% to 42-58% while incurring only 58% bandwidth overhead. The attack accuracy
6679 is generally lower than state-of-the-art defenses, and much lower when
6680 considering Top-2 accuracy, while incurring lower bandwidth overheads.
6681 </p>
6682 </description>
6683 </item>
6684 <item>
6685 <title>Global Optimality Guarantees For Policy Gradient Methods. (arXiv:1906.01786v2 [cs.LG] UPDATED)</title>
6686 <link>http://fr.arxiv.org/abs/1906.01786</link>
6687 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bhandari_J/0/1/0/all/0/1">Jalaj Bhandari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Russo_D/0/1/0/all/0/1">Daniel Russo</a></p>
6688
6689 <p>Policy gradients methods apply to complex, poorly understood, control
6690 problems by performing stochastic gradient descent over a parameterized class
6691 of polices. Unfortunately, even for simple control problems solvable by
6692 standard dynamic programming techniques, policy gradient algorithms face
6693 non-convex optimization problems and are widely understood to converge only to
6694 a stationary point. This work identifies structural properties -- shared by
6695 several classic control problems -- that ensure the policy gradient objective
6696 function has no suboptimal stationary points despite being non-convex. When
6697 these conditions are strengthened, this objective satisfies a
6698 Polyak-lojasiewicz (gradient dominance) condition that yields convergence
6699 rates. We also provide bounds on the optimality gap of any stationary point
6700 when some of these conditions are relaxed.
6701 </p>
6702 </description>
6703 </item>
6704 <item>
6705 <title>ATRW: A Benchmark for Amur Tiger Re-identification in the Wild. (arXiv:1906.05586v4 [cs.CV] UPDATED)</title>
6706 <link>http://fr.arxiv.org/abs/1906.05586</link>
6707 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_S/0/1/0/all/0/1">Shuyuan Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_J/0/1/0/all/0/1">Jianguo Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_H/0/1/0/all/0/1">Hanlin Tang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_R/0/1/0/all/0/1">Rui Qian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_W/0/1/0/all/0/1">Weiyao Lin</a></p>
6708
6709 <p>Monitoring the population and movements of endangered species is an important
6710 task to wildlife conversation. Traditional tagging methods do not scale to
6711 large populations, while applying computer vision methods to camera sensor data
6712 requires re-identification (re-ID) algorithms to obtain accurate counts and
6713 moving trajectory of wildlife. However, existing re-ID methods are largely
6714 targeted at persons and cars, which have limited pose variations and
6715 constrained capture environments. This paper tries to fill the gap by
6716 introducing a novel large-scale dataset, the Amur Tiger Re-identification in
6717 the Wild (ATRW) dataset. ATRW contains over 8,000 video clips from 92 Amur
6718 tigers, with bounding box, pose keypoint, and tiger identity annotations. In
6719 contrast to typical re-ID datasets, the tigers are captured in a diverse set of
6720 unconstrained poses and lighting conditions. We demonstrate with a set of
6721 baseline algorithms that ATRW is a challenging dataset for re-ID. Lastly, we
6722 propose a novel method for tiger re-identification, which introduces precise
6723 pose parts modeling in deep neural networks to handle large pose variation of
6724 tigers, and reaches notable performance improvement over existing re-ID
6725 methods. The dataset is public available at https://cvwc2019.github.io/ .
6726 </p>
6727 </description>
6728 </item>
6729 <item>
6730 <title>A Simple Local Minimal Intensity Prior and An Improved Algorithm for Blind Image Deblurring. (arXiv:1906.06642v5 [eess.IV] UPDATED)</title>
6731 <link>http://fr.arxiv.org/abs/1906.06642</link>
6732 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Wen_F/0/1/0/all/0/1">Fei Wen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ying_R/0/1/0/all/0/1">Rendong Ying</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_Y/0/1/0/all/0/1">Yipeng Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_P/0/1/0/all/0/1">Peilin Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Truong_T/0/1/0/all/0/1">Trieu-Kien Truong</a></p>
6733
6734 <p>Blind image deblurring is a long standing challenging problem in image
6735 processing and low-level vision. Recently, sophisticated priors such as dark
6736 channel prior, extreme channel prior, and local maximum gradient prior, have
6737 shown promising effectiveness. However, these methods are computationally
6738 expensive. Meanwhile, since these priors involved subproblems cannot be solved
6739 explicitly, approximate solution is commonly used, which limits the best
6740 exploitation of their capability. To address these problems, this work firstly
6741 proposes a simplified sparsity prior of local minimal pixels, namely patch-wise
6742 minimal pixels (PMP). The PMP of clear images is much more sparse than that of
6743 blurred ones, and hence is very effective in discriminating between clear and
6744 blurred images. Then, a novel algorithm is designed to efficiently exploit the
6745 sparsity of PMP in deblurring. The new algorithm flexibly imposes sparsity
6746 inducing on the PMP under the MAP framework rather than directly uses the half
6747 quadratic splitting algorithm. By this, it avoids non-rigorous approximation
6748 solution in existing algorithms, while being much more computationally
6749 efficient. Extensive experiments demonstrate that the proposed algorithm can
6750 achieve better practical stability compared with state-of-the-arts. In terms of
6751 deblurring quality, robustness and computational efficiency, the new algorithm
6752 is superior to state-of-the-arts. Code for reproducing the results of the new
6753 method is available at https://github.com/FWen/deblur-pmp.git.
6754 </p>
6755 </description>
6756 </item>
6757 <item>
6758 <title>Multi-type Resource Allocation with Partial Preferences. (arXiv:1906.06836v3 [cs.AI] UPDATED)</title>
6759 <link>http://fr.arxiv.org/abs/1906.06836</link>
6760 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_H/0/1/0/all/0/1">Haibin Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sikdar_S/0/1/0/all/0/1">Sujoy Sikdar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_X/0/1/0/all/0/1">Xiaoxi Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xia_L/0/1/0/all/0/1">Lirong Xia</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_Y/0/1/0/all/0/1">Yongzhi Cao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_H/0/1/0/all/0/1">Hanpin Wang</a></p>
6761
6762 <p>We propose multi-type probabilistic serial (MPS) and multi-type random
6763 priority (MRP) as extensions of the well known PS and RP mechanisms to the
6764 multi-type resource allocation problem (MTRA) with partial preferences. In our
6765 setting, there are multiple types of divisible items, and a group of agents who
6766 have partial order preferences over bundles consisting of one item of each
6767 type. We show that for the unrestricted domain of partial order preferences, no
6768 mechanism satisfies both sd-efficiency and sd-envy-freeness. Notwithstanding
6769 this impossibility result, our main message is positive: When agents'
6770 preferences are represented by acyclic CP-nets, MPS satisfies sd-efficiency,
6771 sd-envy-freeness, ordinal fairness, and upper invariance, while MRP satisfies
6772 ex-post-efficiency, sd-strategy-proofness, and upper invariance, recovering the
6773 properties of PS and RP.
6774 </p>
6775 </description>
6776 </item>
6777 <item>
6778 <title>Dimensional Reweighting Graph Convolutional Networks. (arXiv:1907.02237v3 [cs.LG] UPDATED)</title>
6779 <link>http://fr.arxiv.org/abs/1907.02237</link>
6780 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zou_X/0/1/0/all/0/1">Xu Zou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jia_Q/0/1/0/all/0/1">Qiuye Jia</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jianwei Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_C/0/1/0/all/0/1">Chang Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_H/0/1/0/all/0/1">Hongxia Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_J/0/1/0/all/0/1">Jie Tang</a></p>
6781
6782 <p>Graph Convolution Networks (GCNs) are becoming more and more popular for
6783 learning node representations on graphs. Though there exist various
6784 developments on sampling and aggregation to accelerate the training process and
6785 improve the performances, limited works focus on dealing with the dimensional
6786 information imbalance of node representations. To bridge the gap, we propose a
6787 method named Dimensional reweighting Graph Convolution Network (DrGCN). We
6788 theoretically prove that our DrGCN can guarantee to improve the stability of
6789 GCNs via mean field theory. Our dimensional reweighting method is very flexible
6790 and can be easily combined with most sampling and aggregation techniques for
6791 GCNs. Experimental results demonstrate its superior performances on several
6792 challenging transductive and inductive node classification benchmark datasets.
6793 Our DrGCN also outperforms existing models on an industrial-sized Alibaba
6794 recommendation dataset.
6795 </p>
6796 </description>
6797 </item>
6798 <item>
6799 <title>Lexical Simplification with Pretrained Encoders. (arXiv:1907.06226v5 [cs.CL] UPDATED)</title>
6800 <link>http://fr.arxiv.org/abs/1907.06226</link>
6801 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qiang_J/0/1/0/all/0/1">Jipeng Qiang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yun Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_Y/0/1/0/all/0/1">Yi Zhu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yuan_Y/0/1/0/all/0/1">Yunhao Yuan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_X/0/1/0/all/0/1">Xindong Wu</a></p>
6802
6803 <p>Lexical simplification (LS) aims to replace complex words in a given sentence
6804 with their simpler alternatives of equivalent meaning. Recently unsupervised
6805 lexical simplification approaches only rely on the complex word itself
6806 regardless of the given sentence to generate candidate substitutions, which
6807 will inevitably produce a large number of spurious candidates. We present a
6808 simple LS approach that makes use of the Bidirectional Encoder Representations
6809 from Transformers (BERT) which can consider both the given sentence and the
6810 complex word during generating candidate substitutions for the complex word.
6811 Specifically, we mask the complex word of the original sentence for feeding
6812 into the BERT to predict the masked token. The predicted results will be used
6813 as candidate substitutions. Despite being entirely unsupervised, experimental
6814 results show that our approach obtains obvious improvement compared with these
6815 baselines leveraging linguistic databases and parallel corpus, outperforming
6816 the state-of-the-art by more than 12 Accuracy points on three well-known
6817 benchmarks.
6818 </p>
6819 </description>
6820 </item>
6821 <item>
6822 <title>Cover and variable degeneracy. (arXiv:1907.06630v3 [math.CO] UPDATED)</title>
6823 <link>http://fr.arxiv.org/abs/1907.06630</link>
6824 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Lu_F/0/1/0/all/0/1">Fangyao Lu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Wang_Q/0/1/0/all/0/1">Qianqian Wang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Wang_T/0/1/0/all/0/1">Tao Wang</a></p>
6825
6826 <p>Let $f$ be a nonnegative integer valued function on the vertex set of a
6827 graph. A graph is {\bf strictly $f$-degenerate} if each nonempty subgraph
6828 $\Gamma$ has a vertex $v$ such that $\mathrm{deg}_{\Gamma}(v) &lt; f(v)$. In this
6829 paper, we define a new concept, strictly $f$-degenerate transversal, which
6830 generalizes list coloring, signed coloring, DP-coloring, $L$-forested-coloring,
6831 and $(f_{1}, f_{2}, \dots, f_{s})$-partition. A {\bf cover} of a graph $G$ is a
6832 graph $H$ with vertex set $V(H) = \bigcup_{v \in V(G)} X_{v}$, where $X_{v} =
6833 \{(v, 1), (v, 2), \dots, (v, s)\}$; the edge set $\mathscr{M} = \bigcup_{uv \in
6834 E(G)}\mathscr{M}_{uv}$, where $\mathscr{M}_{uv}$ is a matching between $X_{u}$
6835 and $X_{v}$. A vertex set $R \subseteq V(H)$ is a {\bf transversal} of $H$ if
6836 $|R \cap X_{v}| = 1$ for each $v \in V(G)$. A transversal $R$ is a {\bf
6837 strictly $f$-degenerate transversal} if $H[R]$ is strictly $f$-degenerate. The
6838 main result of this paper is a degree type result, which generalizes Brooks'
6839 theorem, Gallai's theorem, degree-choosable result, signed degree-colorable
6840 result, and DP-degree-colorable result. Similar to Borodin, Kostochka and
6841 Toft's variable degeneracy, this degree type result is also self-strengthening.
6842 We also give some structural results on critical graphs with respect to
6843 strictly $f$-degenerate transversal. Using these results, we can uniformly
6844 prove many new and known results. In the final section, we pose some open
6845 problems.
6846 </p>
6847 </description>
6848 </item>
6849 <item>
6850 <title>An Iterative Vertex Enumeration Method for Objective Space Based Vector Optimization Algorithms. (arXiv:1907.08813v2 [math.OC] UPDATED)</title>
6851 <link>http://fr.arxiv.org/abs/1907.08813</link>
6852 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Kaya_I/0/1/0/all/0/1">Irfan Caner Kaya</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ulus_F/0/1/0/all/0/1">Firdevs Ulus</a></p>
6853
6854 <p>An application area of vertex enumeration problem (VEP) is the usage within
6855 objective space based linear/convex {vector} optimization algorithms whose aim
6856 is to generate (an approximation of) the Pareto frontier. In such algorithms,
6857 VEP, which is defined in the objective space, is solved in each iteration and
6858 it has a special structure. Namely, the recession cone of the polyhedron to be
6859 generated is the {ordering} cone. We {consider and give a detailed description
6860 of} a vertex enumeration procedure, which iterates by calling a modified
6861 `double description (DD) method' that works for such unbounded polyhedrons. We
6862 employ this procedure as a function of an existing objective space based
6863 {vector} optimization algorithm (Algorithm 1); and test the performance of it
6864 for randomly generated linear multiobjective optimization problems. We compare
6865 the efficiency of this procedure with another existing DD method as well as
6866 with the current vertex enumeration subroutine of Algorithm 1. We observe that
6867 the modified procedure excels the others especially as the dimension of the
6868 vertex enumeration problem (the number of objectives of the corresponding
6869 multiobjective problem) increases.
6870 </p>
6871 </description>
6872 </item>
6873 <item>
6874 <title>Developing an Unsupervised Real-time Anomaly Detection Scheme for Time Series with Multi-seasonality. (arXiv:1908.01146v2 [cs.LG] UPDATED)</title>
6875 <link>http://fr.arxiv.org/abs/1908.01146</link>
6876 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_W/0/1/0/all/0/1">Wentai Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_L/0/1/0/all/0/1">Ligang He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_W/0/1/0/all/0/1">Weiwei Lin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Su_Y/0/1/0/all/0/1">Yi Su</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_Y/0/1/0/all/0/1">Yuhua Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maple_C/0/1/0/all/0/1">Carsten Maple</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jarvis_S/0/1/0/all/0/1">Stephen Jarvis</a></p>
6877
6878 <p>On-line detection of anomalies in time series is a key technique used in
6879 various event-sensitive scenarios such as robotic system monitoring, smart
6880 sensor networks and data center security. However, the increasing diversity of
6881 data sources and the variety of demands make this task more challenging than
6882 ever. Firstly, the rapid increase in unlabeled data means supervised learning
6883 is becoming less suitable in many cases. Secondly, a large portion of time
6884 series data have complex seasonality features. Thirdly, on-line anomaly
6885 detection needs to be fast and reliable. In light of this, we have developed a
6886 prediction-driven, unsupervised anomaly detection scheme, which adopts a
6887 backbone model combining the decomposition and the inference of time series
6888 data. Further, we propose a novel metric, Local Trend Inconsistency (LTI), and
6889 an efficient detection algorithm that computes LTI in a real-time manner and
6890 scores each data point robustly in terms of its probability of being anomalous.
6891 We have conducted extensive experimentation to evaluate our algorithm with
6892 several datasets from both public repositories and production environments. The
6893 experimental results show that our scheme outperforms existing representative
6894 anomaly detection algorithms in terms of the commonly used metric, Area Under
6895 Curve (AUC), while achieving the desired efficiency.
6896 </p>
6897 </description>
6898 </item>
6899 <item>
6900 <title>Cluster-based Distributed Augmented Lagrangian Algorithm for a Class of Constrained Convex Optimization Problems. (arXiv:1908.06634v3 [cs.MA] UPDATED)</title>
6901 <link>http://fr.arxiv.org/abs/1908.06634</link>
6902 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Moradian_H/0/1/0/all/0/1">Hossein Moradian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kia_S/0/1/0/all/0/1">Solmaz S. Kia</a></p>
6903
6904 <p>We propose a distributed solution for a constrained convex optimization
6905 problem over a network of clustered agents each consisted of a set of
6906 subagents. The communication range of the clustered agents is such that they
6907 can form a connected undirected graph topology. The total cost in this
6908 optimization problem is the sum of the local convex costs of the subagents of
6909 each cluster. We seek a minimizer of this cost subject to a set of affine
6910 equality constraints, and a set of affine inequality constraints specifying the
6911 bounds on the decision variables if such bounds exist. We design our
6912 distributed algorithm in a cluster-based framework which results in a
6913 significant reduction in communication and computation costs. Our proposed
6914 distributed solution is a novel continuous-time algorithm that is linked to the
6915 augmented Lagrangian approach. It converges asymptotically when the local cost
6916 functions are convex and exponentially when they are strongly convex and have
6917 Lipschitz gradients. Moreover, we use an $\epsilon$-exact penalty function to
6918 address the inequality constraints and derive an explicit lower bound on the
6919 penalty function weight to guarantee convergence to $\epsilon$-neighborhood of
6920 the global minimum value of the cost. A numerical example demonstrates our
6921 results.
6922 </p>
6923 </description>
6924 </item>
6925 <item>
6926 <title>Optimal Machine Intelligence at the Edge of Chaos. (arXiv:1909.05176v2 [cs.LG] UPDATED)</title>
6927 <link>http://fr.arxiv.org/abs/1909.05176</link>
6928 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_L/0/1/0/all/0/1">Ling Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_L/0/1/0/all/0/1">Lin Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lai_C/0/1/0/all/0/1">Choy Heng Lai</a></p>
6929
6930 <p>It has long been suggested that the biological brain operates at some
6931 critical point between two different phases, possibly order and chaos. Despite
6932 many indirect empirical evidence from the brain and analytical indication on
6933 simple neural networks, the foundation of this hypothesis on generic non-linear
6934 systems remains unclear. Here we develop a general theory that reveals the
6935 exact edge of chaos is the boundary between the chaotic phase and the
6936 (pseudo)periodic phase arising from Neimark-Sacker bifurcation. This edge is
6937 analytically determined by the asymptotic Jacobian norm values of the
6938 non-linear operator and influenced by the dimensionality of the system. The
6939 optimality at the edge of chaos is associated with the highest information
6940 transfer between input and output at this point similar to that of the logistic
6941 map. As empirical validations, our experiments on the various deep learning
6942 models in computer vision demonstrate the optimality of the models near the
6943 edge of chaos, and we observe that the state-of-art training algorithms push
6944 the models towards such edge as they become more accurate. We further
6945 establishes the theoretical understanding of deep learning model generalization
6946 through asymptotic stability.
6947 </p>
6948 </description>
6949 </item>
6950 <item>
6951 <title>Inverse Kinematics for Serial Kinematic Chains via Sum of Squares Optimization. (arXiv:1909.09318v3 [cs.RO] UPDATED)</title>
6952 <link>http://fr.arxiv.org/abs/1909.09318</link>
6953 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Maric_F/0/1/0/all/0/1">Filip Maric</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Giamou_M/0/1/0/all/0/1">Matthew Giamou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Khoubyarian_S/0/1/0/all/0/1">Soroush Khoubyarian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Petrovic_I/0/1/0/all/0/1">Ivan Petrovic</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kelly_J/0/1/0/all/0/1">Jonathan Kelly</a></p>
6954
6955 <p>Inverse kinematics is a fundamental problem for articulated robots: fast and
6956 accurate algorithms are needed for translating task-related workspace
6957 constraints and goals into feasible joint configurations. In general, inverse
6958 kinematics for serial kinematic chains is a difficult nonlinear problem, for
6959 which closed form solutions cannot be easily obtained. Therefore,
6960 computationally efficient numerical methods that can be adapted to a general
6961 class of manipulators are of great importance. % to motion planning and
6962 workspace generation tasks. In this paper, we use convex optimization
6963 techniques to solve the inverse kinematics problem with joint limit constraints
6964 for highly redundant serial kinematic chains with spherical joints in two and
6965 three dimensions. This is accomplished through a novel formulation of inverse
6966 kinematics as a nearest point problem, and with a fast sum of squares solver
6967 that exploits the sparsity of kinematic constraints for serial manipulators.
6968 Our method has the advantages of post-hoc certification of global optimality
6969 and a runtime that scales polynomialy with the number of degrees of freedom.
6970 Additionally, we prove that our convex relaxation leads to a globally optimal
6971 solution when certain conditions are met, and demonstrate empirically that
6972 these conditions are common and represent many practical instances. Finally, we
6973 provide an open source implementation of our algorithm.
6974 </p>
6975 </description>
6976 </item>
6977 <item>
6978 <title>Noisy Batch Active Learning with Deterministic Annealing. (arXiv:1909.12473v2 [cs.LG] UPDATED)</title>
6979 <link>http://fr.arxiv.org/abs/1909.12473</link>
6980 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_G/0/1/0/all/0/1">Gaurav Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sahu_A/0/1/0/all/0/1">Anit Kumar Sahu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_W/0/1/0/all/0/1">Wan-Yi Lin</a></p>
6981
6982 <p>We study the problem of training machine learning models incrementally with
6983 batches of samples annotated with noisy oracles. We select each batch of
6984 samples that are important and also diverse via clustering and importance
6985 sampling. More importantly, we incorporate model uncertainty into the sampling
6986 probability to compensate for poor estimation of the importance scores when the
6987 training data is too small to build a meaningful model. Experiments on
6988 benchmark image classification datasets (MNIST, SVHN, CIFAR10, and EMNIST) show
6989 improvement over existing active learning strategies. We introduce an extra
6990 denoising layer to deep networks to make active learning robust to label noises
6991 and show significant improvements.
6992 </p>
6993 </description>
6994 </item>
6995 <item>
6996 <title>Subspace Estimation from Unbalanced and Incomplete Data Matrices: $\ell_{2,\infty}$ Statistical Guarantees. (arXiv:1910.04267v4 [math.ST] UPDATED)</title>
6997 <link>http://fr.arxiv.org/abs/1910.04267</link>
6998 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Cai_C/0/1/0/all/0/1">Changxiao Cai</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Li_G/0/1/0/all/0/1">Gen Li</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Chi_Y/0/1/0/all/0/1">Yuejie Chi</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Poor_H/0/1/0/all/0/1">H. Vincent Poor</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Chen_Y/0/1/0/all/0/1">Yuxin Chen</a></p>
6999
7000 <p>This paper is concerned with estimating the column space of an unknown
7001 low-rank matrix $\boldsymbol{A}^{\star}\in\mathbb{R}^{d_{1}\times d_{2}}$,
7002 given noisy and partial observations of its entries. There is no shortage of
7003 scenarios where the observations -- while being too noisy to support faithful
7004 recovery of the entire matrix -- still convey sufficient information to enable
7005 reliable estimation of the column space of interest. This is particularly
7006 evident and crucial for the highly unbalanced case where the column dimension
7007 $d_{2}$ far exceeds the row dimension $d_{1}$, which is the focal point of the
7008 current paper. We investigate an efficient spectral method, which operates upon
7009 the sample Gram matrix with diagonal deletion. While this algorithmic idea has
7010 been studied before, we establish new statistical guarantees for this method in
7011 terms of both $\ell_{2}$ and $\ell_{2,\infty}$ estimation accuracy, which
7012 improve upon prior results if $d_{2}$ is substantially larger than $d_{1}$. To
7013 illustrate the effectiveness of our findings, we derive matching minimax lower
7014 bounds with respect to the noise levels, and develop consequences of our
7015 general theory for three applications of practical importance: (1) tensor
7016 completion from noisy data, (2) covariance estimation / principal component
7017 analysis with missing data, and (3) community recovery in bipartite graphs. Our
7018 theory leads to improved performance guarantees for all three cases.
7019 </p>
7020 </description>
7021 </item>
7022 <item>
7023 <title>ProxIQA: A Proxy Approach to Perceptual Optimization of Learned Image Compression. (arXiv:1910.08845v2 [eess.IV] UPDATED)</title>
7024 <link>http://fr.arxiv.org/abs/1910.08845</link>
7025 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Chen_L/0/1/0/all/0/1">Li-Heng Chen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bampis_C/0/1/0/all/0/1">Christos G. Bampis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_Z/0/1/0/all/0/1">Zhi Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Norkin_A/0/1/0/all/0/1">Andrey Norkin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bovik_A/0/1/0/all/0/1">Alan C. Bovik</a></p>
7026
7027 <p>The use of $\ell_p$ $(p=1,2)$ norms has largely dominated the measurement of
7028 loss in neural networks due to their simplicity and analytical properties.
7029 However, when used to assess the loss of visual information, these simple norms
7030 are not very consistent with human perception. Here, we describe a different
7031 "proximal" approach to optimize image analysis networks against quantitative
7032 perceptual models. Specifically, we construct a proxy network, broadly termed
7033 ProxIQA, which mimics the perceptual model while serving as a loss layer of the
7034 network. We experimentally demonstrate how this optimization framework can be
7035 applied to train an end-to-end optimized image compression network. By building
7036 on top of an existing deep image compression model, we are able to demonstrate
7037 a bitrate reduction of as much as $31\%$ over MSE optimization, given a
7038 specified perceptual quality (VMAF) level.
7039 </p>
7040 </description>
7041 </item>
7042 <item>
7043 <title>Federated Learning over Wireless Networks: Convergence Analysis and Resource Allocation. (arXiv:1910.13067v4 [cs.LG] UPDATED)</title>
7044 <link>http://fr.arxiv.org/abs/1910.13067</link>
7045 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dinh_C/0/1/0/all/0/1">Canh T. Dinh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_N/0/1/0/all/0/1">Nguyen H. Tran</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_M/0/1/0/all/0/1">Minh N. H. Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hong_C/0/1/0/all/0/1">Choong Seon Hong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bao_W/0/1/0/all/0/1">Wei Bao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zomaya_A/0/1/0/all/0/1">Albert Y. Zomaya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gramoli_V/0/1/0/all/0/1">Vincent Gramoli</a></p>
7046
7047 <p>There is an increasing interest in a fast-growing machine learning technique
7048 called Federated Learning, in which the model training is distributed over
7049 mobile user equipments (UEs), exploiting UEs' local computation and training
7050 data. Despite its advantages in data privacy-preserving, Federated Learning
7051 (FL) still has challenges in heterogeneity across UEs' data and physical
7052 resources. We first propose a FL algorithm which can handle the heterogeneous
7053 UEs' data challenge without further assumptions except strongly convex and
7054 smooth loss functions. We provide the convergence rate characterizing the
7055 trade-off between local computation rounds of UE to update its local model and
7056 global communication rounds to update the FL global model. We then employ the
7057 proposed FL algorithm in wireless networks as a resource allocation
7058 optimization problem that captures the trade-off between the FL convergence
7059 wall clock time and energy consumption of UEs with heterogeneous computing and
7060 power resources. Even though the wireless resource allocation problem of FL is
7061 non-convex, we exploit this problem's structure to decompose it into three
7062 sub-problems and analyze their closed-form solutions as well as insights to
7063 problem design. Finally, we illustrate the theoretical analysis for the new
7064 algorithm with Tensorflow experiments and extensive numerical results for the
7065 wireless resource allocation sub-problems. The experiment results not only
7066 verify the theoretical convergence but also show that our proposed algorithm
7067 outperforms the vanilla FedAvg algorithm in terms of convergence rate and
7068 testing accuracy.
7069 </p>
7070 </description>
7071 </item>
7072 <item>
7073 <title>Making the Best Use of Review Summary for Sentiment Analysis. (arXiv:1911.02711v2 [cs.CL] UPDATED)</title>
7074 <link>http://fr.arxiv.org/abs/1911.02711</link>
7075 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_S/0/1/0/all/0/1">Sen Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_L/0/1/0/all/0/1">Leyang Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_J/0/1/0/all/0/1">Jun Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yue Zhang</a></p>
7076
7077 <p>Sentiment analysis provides a useful overview of customer review contents.
7078 Many review websites allow a user to enter a summary in addition to a full
7079 review. Intuitively, summary information may give additional benefit for review
7080 sentiment analysis. In this paper, we conduct a study to exploit methods for
7081 better use of summary information. We start by finding out that the sentimental
7082 signal distribution of a review and that of its corresponding summary are in
7083 fact complementary to each other. We thus explore various architectures to
7084 better guide the interactions between the two and propose a
7085 hierarchically-refined review-centric attention model. Empirical results show
7086 that our review-centric model can make better use of user-written summaries for
7087 review sentiment analysis, and is also more effective compared to existing
7088 methods when the user summary is replaced with summary generated by an
7089 automatic summarization system.
7090 </p>
7091 </description>
7092 </item>
7093 <item>
7094 <title>Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy. (arXiv:1911.03849v5 [cs.LG] UPDATED)</title>
7095 <link>http://fr.arxiv.org/abs/1911.03849</link>
7096 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qu_X/0/1/0/all/0/1">Xinghua Qu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sun_Z/0/1/0/all/0/1">Zhu Sun</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ong_Y/0/1/0/all/0/1">Yew-Soon Ong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_A/0/1/0/all/0/1">Abhishek Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_P/0/1/0/all/0/1">Pengfei Wei</a></p>
7097
7098 <p>Recent studies have revealed that neural network-based policies can be easily
7099 fooled by adversarial examples. However, while most prior works analyze the
7100 effects of perturbing every pixel of every frame assuming white-box policy
7101 access, in this paper we take a more restrictive view towards adversary
7102 generation - with the goal of unveiling the limits of a model's vulnerability.
7103 In particular, we explore minimalistic attacks by defining three key settings:
7104 (1) black-box policy access: where the attacker only has access to the input
7105 (state) and output (action probability) of an RL policy; (2) fractional-state
7106 adversary: where only several pixels are perturbed, with the extreme case being
7107 a single-pixel adversary; and (3) tactically-chanced attack: where only
7108 significant frames are tactically chosen to be attacked. We formulate the
7109 adversarial attack by accommodating the three key settings and explore their
7110 potency on six Atari games by examining four fully trained state-of-the-art
7111 policies. In Breakout, for example, we surprisingly find that: (i) all policies
7112 showcase significant performance degradation by merely modifying 0.01% of the
7113 input state, and (ii) the policy trained by DQN is totally deceived by
7114 perturbation to only 1% frames.
7115 </p>
7116 </description>
7117 </item>
7118 <item>
7119 <title>Rethinking Self-Attention: Towards Interpretability in Neural Parsing. (arXiv:1911.03875v3 [cs.CL] UPDATED)</title>
7120 <link>http://fr.arxiv.org/abs/1911.03875</link>
7121 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mrini_K/0/1/0/all/0/1">Khalil Mrini</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dernoncourt_F/0/1/0/all/0/1">Franck Dernoncourt</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_Q/0/1/0/all/0/1">Quan Tran</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bui_T/0/1/0/all/0/1">Trung Bui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chang_W/0/1/0/all/0/1">Walter Chang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nakashole_N/0/1/0/all/0/1">Ndapa Nakashole</a></p>
7122
7123 <p>Attention mechanisms have improved the performance of NLP tasks while
7124 allowing models to remain explainable. Self-attention is currently widely used,
7125 however interpretability is difficult due to the numerous attention
7126 distributions. Recent work has shown that model representations can benefit
7127 from label-specific information, while facilitating interpretation of
7128 predictions. We introduce the Label Attention Layer: a new form of
7129 self-attention where attention heads represent labels. We test our novel layer
7130 by running constituency and dependency parsing experiments and show our new
7131 model obtains new state-of-the-art results for both tasks on both the Penn
7132 Treebank (PTB) and Chinese Treebank. Additionally, our model requires fewer
7133 self-attention layers compared to existing work. Finally, we find that the
7134 Label Attention heads learn relations between syntactic categories and show
7135 pathways to analyze errors.
7136 </p>
7137 </description>
7138 </item>
7139 <item>
7140 <title>Privacy-Preserving Gradient Boosting Decision Trees. (arXiv:1911.04209v3 [cs.LG] UPDATED)</title>
7141 <link>http://fr.arxiv.org/abs/1911.04209</link>
7142 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qinbin Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Z/0/1/0/all/0/1">Zhaomin Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_Z/0/1/0/all/0/1">Zeyi Wen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_B/0/1/0/all/0/1">Bingsheng He</a></p>
7143
7144 <p>The Gradient Boosting Decision Tree (GBDT) is a popular machine learning
7145 model for various tasks in recent years. In this paper, we study how to improve
7146 model accuracy of GBDT while preserving the strong guarantee of differential
7147 privacy. Sensitivity and privacy budget are two key design aspects for the
7148 effectiveness of differential private models. Existing solutions for GBDT with
7149 differential privacy suffer from the significant accuracy loss due to too loose
7150 sensitivity bounds and ineffective privacy budget allocations (especially
7151 across different trees in the GBDT model). Loose sensitivity bounds lead to
7152 more noise to obtain a fixed privacy level. Ineffective privacy budget
7153 allocations worsen the accuracy loss especially when the number of trees is
7154 large. Therefore, we propose a new GBDT training algorithm that achieves
7155 tighter sensitivity bounds and more effective noise allocations. Specifically,
7156 by investigating the property of gradient and the contribution of each tree in
7157 GBDTs, we propose to adaptively control the gradients of training data for each
7158 iteration and leaf node clipping in order to tighten the sensitivity bounds.
7159 Furthermore, we design a novel boosting framework to allocate the privacy
7160 budget between trees so that the accuracy loss can be further reduced. Our
7161 experiments show that our approach can achieve much better model accuracy than
7162 other baselines.
7163 </p>
7164 </description>
7165 </item>
7166 <item>
7167 <title>A Continuous Teleoperation Subspace with Empirical and Algorithmic Mapping Algorithms for Non-Anthropomorphic Hands. (arXiv:1911.09565v5 [cs.RO] UPDATED)</title>
7168 <link>http://fr.arxiv.org/abs/1911.09565</link>
7169 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Meeker_C/0/1/0/all/0/1">Cassie Meeker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haas_Heger_M/0/1/0/all/0/1">Maximilian Haas-Heger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ciocarlie_M/0/1/0/all/0/1">Matei Ciocarlie</a></p>
7170
7171 <p>Teleoperation is a valuable tool for robotic manipulators in highly
7172 unstructured environments. However, finding an intuitive mapping between a
7173 human hand and a non-anthropomorphic robot hand can be difficult, due to the
7174 hands' dissimilar kinematics. In this paper, we seek to create a mapping
7175 between the human hand and a fully actuated, non-anthropomorphic robot hand
7176 that is intuitive enough to enable effective real-time teleoperation, even for
7177 novice users. To accomplish this, we propose a low-dimensional teleoperation
7178 subspace which can be used as an intermediary for mapping between hand pose
7179 spaces. We present two different methods to define the teleoperation subspace:
7180 an empirical definition, which requires a person to define hand motions in an
7181 intuitive, hand-specific way, and an algorithmic definition, which is
7182 kinematically independent, and uses objects to define the subspace. We use each
7183 of these definitions to create a teleoperation mapping for different hands. One
7184 of the main contributions of this paper is the validation of both the empirical
7185 and algorithmic mappings with teleoperation experiments controlled by ten
7186 novices and performed on two kinematically distinct hands. The experiments show
7187 that the proposed subspace is relevant to teleoperation, intuitive enough to
7188 enable control by novices, and can generalize to non-anthropomorphic hands with
7189 different kinematics.
7190 </p>
7191 </description>
7192 </item>
7193 <item>
7194 <title>QoS-Aware Joint Power Allocation and Task Offloading in a MEC/NFV-enabled C-RAN Network. (arXiv:1912.00187v2 [cs.NI] UPDATED)</title>
7195 <link>http://fr.arxiv.org/abs/1912.00187</link>
7196 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tajallifar_M/0/1/0/all/0/1">Mohsen Tajallifar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ebrahimi_S/0/1/0/all/0/1">Sina Ebrahimi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Javan_M/0/1/0/all/0/1">Mohammad Reza Javan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mokari_N/0/1/0/all/0/1">Nader Mokari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chiaraviglio_L/0/1/0/all/0/1">Luca Chiaraviglio</a></p>
7197
7198 <p>In this paper, we propose a novel resource management scheme that jointly
7199 allocates the transmission power and computational resources in a centralized
7200 radio access network architecture. The network comprises a set of computing
7201 nodes to which the requested tasks of different users are offloaded. The
7202 optimization problem takes the transmission, execution, and propagation delays
7203 of each task into account, with the aim to allocate the transmission power and
7204 computational resources such that the user's maximum tolerable latency is
7205 satisfied. Since the optimization problem is highly non-convex, we adopt the
7206 alternate search method (ASM) to divide it into smaller subproblems. A
7207 heuristic algorithm is proposed to jointly manage the allocated computational
7208 resources and placement of the tasks derived by ASM. We also propose an
7209 admission control mechanism for finding the set of tasks that can be served by
7210 the available resources. Furthermore, a disjoint method that separately
7211 allocates the transmission power and the computational resources is proposed as
7212 the baseline of comparison. The optimal solution of the optimization problem is
7213 also derived based on exhaustive search over offloading decisions and utilizing
7214 Karush-Kuhn-Tucker optimality conditions. The simulation results show that the
7215 joint method outperforms the disjoint task offloading and power allocation.
7216 Moreover, simulations show that the performance of the proposed method is
7217 almost equal to that of the optimal solution.
7218 </p>
7219 </description>
7220 </item>
7221 <item>
7222 <title>Hierarchical Indian Buffet Neural Networks for Bayesian Continual Learning. (arXiv:1912.02290v4 [stat.ML] UPDATED)</title>
7223 <link>http://fr.arxiv.org/abs/1912.02290</link>
7224 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Kessler_S/0/1/0/all/0/1">Samuel Kessler</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Nguyen_V/0/1/0/all/0/1">Vu Nguyen</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Zohren_S/0/1/0/all/0/1">Stefan Zohren</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Roberts_S/0/1/0/all/0/1">Stephen Roberts</a></p>
7225
7226 <p>We place an Indian Buffet process (IBP) prior over the structure of a
7227 Bayesian Neural Network (BNN), thus allowing the complexity of the BNN to
7228 increase and decrease automatically. We further extend this model such that the
7229 prior on the structure of each hidden layer is shared globally across all
7230 layers, using a Hierarchical-IBP (H-IBP). We apply this model to the problem of
7231 resource allocation in Continual Learning (CL) where new tasks occur and the
7232 network requires extra resources. Our model uses online variational inference
7233 with reparameterisation of the Bernoulli and Beta distributions, which
7234 constitute the IBP and H-IBP priors. As we automatically learn the number of
7235 weights in each layer of the BNN, overfitting and underfitting problems are
7236 largely overcome. We show empirically that our approach offers a competitive
7237 edge over existing methods in CL.
7238 </p>
7239 </description>
7240 </item>
7241 <item>
7242 <title>CoSimLex: A Resource for Evaluating Graded Word Similarity in Context. (arXiv:1912.05320v3 [cs.CL] UPDATED)</title>
7243 <link>http://fr.arxiv.org/abs/1912.05320</link>
7244 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Armendariz_C/0/1/0/all/0/1">Carlos Santos Armendariz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Purver_M/0/1/0/all/0/1">Matthew Purver</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ulcar_M/0/1/0/all/0/1">Matej Ul&#x10d;ar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pollak_S/0/1/0/all/0/1">Senja Pollak</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ljubesic_N/0/1/0/all/0/1">Nikola Ljube&#x161;i&#x107;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Robnik_Sikonja_M/0/1/0/all/0/1">Marko Robnik-&#x160;ikonja</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Granroth_Wilding_M/0/1/0/all/0/1">Mark Granroth-Wilding</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vaik_K/0/1/0/all/0/1">Kristiina Vaik</a></p>
7245
7246 <p>State of the art natural language processing tools are built on
7247 context-dependent word embeddings, but no direct method for evaluating these
7248 representations currently exists. Standard tasks and datasets for intrinsic
7249 evaluation of embeddings are based on judgements of similarity, but ignore
7250 context; standard tasks for word sense disambiguation take account of context
7251 but do not provide continuous measures of meaning similarity. This paper
7252 describes an effort to build a new dataset, CoSimLex, intended to fill this
7253 gap. Building on the standard pairwise similarity task of SimLex-999, it
7254 provides context-dependent similarity measures; covers not only discrete
7255 differences in word sense but more subtle, graded changes in meaning; and
7256 covers not only a well-resourced language (English) but a number of
7257 less-resourced languages. We define the task and evaluation metrics, outline
7258 the dataset collection methodology, and describe the status of the dataset so
7259 far.
7260 </p>
7261 </description>
7262 </item>
7263 <item>
7264 <title>What it Thinks is Important is Important: Robustness Transfers through Input Gradients. (arXiv:1912.05699v3 [cs.LG] UPDATED)</title>
7265 <link>http://fr.arxiv.org/abs/1912.05699</link>
7266 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chan_A/0/1/0/all/0/1">Alvin Chan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tay_Y/0/1/0/all/0/1">Yi Tay</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ong_Y/0/1/0/all/0/1">Yew-Soon Ong</a></p>
7267
7268 <p>Adversarial perturbations are imperceptible changes to input pixels that can
7269 change the prediction of deep learning models. Learned weights of models robust
7270 to such perturbations are previously found to be transferable across different
7271 tasks but this applies only if the model architecture for the source and target
7272 tasks is the same. Input gradients characterize how small changes at each input
7273 pixel affect the model output. Using only natural images, we show here that
7274 training a student model's input gradients to match those of a robust teacher
7275 model can gain robustness close to a strong baseline that is robustly trained
7276 from scratch. Through experiments in MNIST, CIFAR-10, CIFAR-100 and
7277 Tiny-ImageNet, we show that our proposed method, input gradient adversarial
7278 matching, can transfer robustness across different tasks and even across
7279 different model architectures. This demonstrates that directly targeting the
7280 semantics of input gradients is a feasible way towards adversarial robustness.
7281 </p>
7282 </description>
7283 </item>
7284 <item>
7285 <title>ORCA: a Benchmark for Data Web Crawlers. (arXiv:1912.08026v2 [cs.DB] UPDATED)</title>
7286 <link>http://fr.arxiv.org/abs/1912.08026</link>
7287 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Roder_M/0/1/0/all/0/1">Michael R&#xf6;der</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Souza_G/0/1/0/all/0/1">Geraldo de Souza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kuchelev_D/0/1/0/all/0/1">Denis Kuchelev</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Desouki_A/0/1/0/all/0/1">Abdelmoneim Amer Desouki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ngomo_A/0/1/0/all/0/1">Axel-Cyrille Ngonga Ngomo</a></p>
7288
7289 <p>The number of RDF knowledge graphs available on the Web grows constantly.
7290 Gathering these graphs at large scale for downstream applications hence
7291 requires the use of crawlers. Although Data Web crawlers exist, and general Web
7292 crawlers could be adapted to focus on the Data Web, there is currently no
7293 benchmark to fairly evaluate their performance. Our work closes this gap by
7294 presenting the Orca benchmark. Orca generates a synthetic Data Web, which is
7295 decoupled from the original Web and enables a fair and repeatable comparison of
7296 Data Web crawlers. Our evaluations show that Orca can be used to reveal the
7297 different advantages and disadvantages of existing crawlers. The benchmark is
7298 open-source and available at https://github.com/dice-group/orca.
7299 </p>
7300 </description>
7301 </item>
7302 <item>
7303 <title>Deep Automodulators. (arXiv:1912.10321v4 [cs.LG] UPDATED)</title>
7304 <link>http://fr.arxiv.org/abs/1912.10321</link>
7305 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Heljakka_A/0/1/0/all/0/1">Ari Heljakka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hou_Y/0/1/0/all/0/1">Yuxin Hou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kannala_J/0/1/0/all/0/1">Juho Kannala</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Solin_A/0/1/0/all/0/1">Arno Solin</a></p>
7306
7307 <p>We introduce a new category of generative autoencoders called automodulators.
7308 These networks can faithfully reproduce individual real-world input images like
7309 regular autoencoders, but also generate a fused sample from an arbitrary
7310 combination of several such images, allowing instantaneous 'style-mixing' and
7311 other new applications. An automodulator decouples the data flow of decoder
7312 operations from statistical properties thereof and uses the latent vector to
7313 modulate the former by the latter, with a principled approach for mutual
7314 disentanglement of decoder layers. Prior work has explored similar decoder
7315 architecture with GANs, but their focus has been on random sampling. A
7316 corresponding autoencoder could operate on real input images. For the first
7317 time, we show how to train such a general-purpose model with sharp outputs in
7318 high resolution, using novel training techniques, demonstrated on four image
7319 data sets. Besides style-mixing, we show state-of-the-art results in
7320 autoencoder comparison, and visual image quality nearly indistinguishable from
7321 state-of-the-art GANs. We expect the automodulator variants to become a useful
7322 building block for image applications and other data domains.
7323 </p>
7324 </description>
7325 </item>
7326 <item>
7327 <title>Statistical Limits of Supervised Quantum Learning. (arXiv:2001.10477v3 [quant-ph] UPDATED)</title>
7328 <link>http://fr.arxiv.org/abs/2001.10477</link>
7329 <description><p>Authors: <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Ciliberto_C/0/1/0/all/0/1">Carlo Ciliberto</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Rocchetto_A/0/1/0/all/0/1">Andrea Rocchetto</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Rudi_A/0/1/0/all/0/1">Alessandro Rudi</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Wossnig_L/0/1/0/all/0/1">Leonard Wossnig</a></p>
7330
7331 <p>Within the framework of statistical learning theory it is possible to bound
7332 the minimum number of samples required by a learner to reach a target accuracy.
7333 We show that if the bound on the accuracy is taken into account, quantum
7334 machine learning algorithms for supervised learning---for which statistical
7335 guarantees are available---cannot achieve polylogarithmic runtimes in the input
7336 dimension. We conclude that, when no further assumptions on the problem are
7337 made, quantum machine learning algorithms for supervised learning can have at
7338 most polynomial speedups over efficient classical algorithms, even in cases
7339 where quantum access to the data is naturally available.
7340 </p>
7341 </description>
7342 </item>
7343 <item>
7344 <title>Can Graph Neural Networks Count Substructures?. (arXiv:2002.04025v4 [cs.LG] UPDATED)</title>
7345 <link>http://fr.arxiv.org/abs/2002.04025</link>
7346 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Z/0/1/0/all/0/1">Zhengdao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_L/0/1/0/all/0/1">Lei Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Villar_S/0/1/0/all/0/1">Soledad Villar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bruna_J/0/1/0/all/0/1">Joan Bruna</a></p>
7347
7348 <p>The ability to detect and count certain substructures in graphs is important
7349 for solving many tasks on graph-structured data, especially in the contexts of
7350 computational chemistry and biology as well as social network analysis.
7351 Inspired by this, we propose to study the expressive power of graph neural
7352 networks (GNNs) via their ability to count attributed graph substructures,
7353 extending recent works that examine their power in graph isomorphism testing
7354 and function approximation. We distinguish between two types of substructure
7355 counting: induced-subgraph-count and subgraph-count, and establish both
7356 positive and negative answers for popular GNN architectures. Specifically, we
7357 prove that Message Passing Neural Networks (MPNNs), 2-Weisfeiler-Lehman (2-WL)
7358 and 2-Invariant Graph Networks (2-IGNs) cannot perform induced-subgraph-count
7359 of substructures consisting of 3 or more nodes, while they can perform
7360 subgraph-count of star-shaped substructures. As an intermediary step, we prove
7361 that 2-WL and 2-IGNs are equivalent in distinguishing non-isomorphic graphs,
7362 partly answering an open problem raised in Maron et al. (2019). We also prove
7363 positive results for k-WL and k-IGNs as well as negative results for k-WL with
7364 a finite number of iterations. We then conduct experiments that support the
7365 theoretical results for MPNNs and 2-IGNs. Moreover, motivated by substructure
7366 counting and inspired by Murphy et al. (2019), we propose the Local Relational
7367 Pooling model and demonstrate that it is not only effective for substructure
7368 counting but also able to achieve competitive performance on molecular
7369 prediction tasks.
7370 </p>
7371 </description>
7372 </item>
7373 <item>
7374 <title>An implicit function learning approach for parametric modal regression. (arXiv:2002.06195v2 [stat.ML] UPDATED)</title>
7375 <link>http://fr.arxiv.org/abs/2002.06195</link>
7376 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Pan_Y/0/1/0/all/0/1">Yangchen Pan</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Imani_E/0/1/0/all/0/1">Ehsan Imani</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+White_M/0/1/0/all/0/1">Martha White</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Farahmand_A/0/1/0/all/0/1">Amir-massoud Farahmand</a></p>
7377
7378 <p>For multi-valued functions---such as when the conditional distribution on
7379 targets given the inputs is multi-modal---standard regression approaches are
7380 not always desirable because they provide the conditional mean. Modal
7381 regression algorithms address this issue by instead finding the conditional
7382 mode(s). Most, however, are nonparametric approaches and so can be difficult to
7383 scale. Further, parametric approximators, like neural networks, facilitate
7384 learning complex relationships between inputs and targets. In this work, we
7385 propose a parametric modal regression algorithm. We use the implicit function
7386 theorem to develop an objective, for learning a joint function over inputs and
7387 targets. We empirically demonstrate on several synthetic problems that our
7388 method (i) can learn multi-valued functions and produce the conditional modes,
7389 (ii) scales well to high-dimensional inputs, and (iii) can even be more
7390 effective for certain uni-modal problems, particularly for high-frequency
7391 functions. We demonstrate that our method is competitive in a real-world modal
7392 regression problem and two regular regression datasets.
7393 </p>
7394 </description>
7395 </item>
7396 <item>
7397 <title>Learning Global Transparent Models Consistent with Local Contrastive Explanations. (arXiv:2002.08247v4 [cs.LG] UPDATED)</title>
7398 <link>http://fr.arxiv.org/abs/2002.08247</link>
7399 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Pedapati_T/0/1/0/all/0/1">Tejaswini Pedapati</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Balakrishnan_A/0/1/0/all/0/1">Avinash Balakrishnan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shanmugam_K/0/1/0/all/0/1">Karthikeyan Shanmugam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dhurandhar_A/0/1/0/all/0/1">Amit Dhurandhar</a></p>
7400
7401 <p>There is a rich and growing literature on producing local
7402 contrastive/counterfactual explanations for black-box models (e.g. neural
7403 networks).
7404 </p>
7405 <p>In these methods, for an input, an explanation is in the form of a contrast
7406 point differing in very few features from the original input and lying in a
7407 different class. Other works try to build globally interpretable models like
7408 decision trees and rule lists based on the data using actual labels or based on
7409 the black-box models predictions. Although these interpretable global models
7410 can be useful, they may not be consistent with local explanations from a
7411 specific black-box of choice. In this work, we explore the question: Can we
7412 produce a transparent global model that is simultaneously accurate and
7413 consistent with the local (contrastive) explanations of the black-box model? We
7414 introduce a natural local consistency metric that quantifies if the local
7415 explanations and predictions of the black-box model are also consistent with
7416 the proxy global transparent model. Based on a key insight we propose a novel
7417 method where we create custom boolean features from sparse local contrastive
7418 explanations of the black-box model and then train a globally transparent model
7419 on just these, and showcase empirically that such models have higher local
7420 consistency compared with other known strategies, while still being close in
7421 performance to models that are trained with access to the original data.
7422 </p>
7423 </description>
7424 </item>
7425 <item>
7426 <title>A two-stage data-analysis method for total-reflection high-energy positron diffraction (TRHEPD). (arXiv:2002.12165v2 [cond-mat.mtrl-sci] UPDATED)</title>
7427 <link>http://fr.arxiv.org/abs/2002.12165</link>
7428 <description><p>Authors: <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Tanaka_K/0/1/0/all/0/1">Kazuyuki Tanaka</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Mochizuki_I/0/1/0/all/0/1">Izumi Mochizuki</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Hanada_T/0/1/0/all/0/1">Takashi Hanada</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Ichimiya_A/0/1/0/all/0/1">Ayahiko Ichimiya</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Hyodo_T/0/1/0/all/0/1">Toshio Hyodo</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Hoshi_T/0/1/0/all/0/1">Takeo Hoshi</a></p>
7429
7430 <p>Total-reflection high-energy positron diffraction (TRHEPD) is a novel
7431 experimental method for the determination of surface structure, which has been
7432 extensively developed at the Slow Positron Facility, Institute of Materials
7433 Structure Science, High Energy Accelerator Research Organization (KEK). In this
7434 paper, a two-stage data-analysis method is proposed. The data analysis is based
7435 on an inverse problem in which the atomic positions of a surface structure are
7436 determined from the experimental diffraction data (rocking curves). The
7437 relevant forward problem is solved by the numerical solution of the partial
7438 differential equation for quantum scattering of the positron. In the present
7439 two-stage method, the first stage is a grid-based global search and the second
7440 stage is a local search for the unique candidate for the atomic arrangement.
7441 The numerical problem is solved on a supercomputer
7442 </p>
7443 </description>
7444 </item>
7445 <item>
7446 <title>Curriculum By Smoothing. (arXiv:2003.01367v3 [cs.LG] UPDATED)</title>
7447 <link>http://fr.arxiv.org/abs/2003.01367</link>
7448 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sinha_S/0/1/0/all/0/1">Samarth Sinha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Garg_A/0/1/0/all/0/1">Animesh Garg</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Larochelle_H/0/1/0/all/0/1">Hugo Larochelle</a></p>
7449
7450 <p>Convolutional Neural Networks (CNNs) have shown impressive performance in
7451 computer vision tasks such as image classification, detection, and
7452 segmentation. Moreover, recent work in Generative Adversarial Networks (GANs)
7453 has highlighted the importance of learning by progressively increasing the
7454 difficulty of a learning task [26]. When learning a network from scratch, the
7455 information propagated within the network during the earlier stages of training
7456 can contain distortion artifacts due to noise which can be detrimental to
7457 training. In this paper, we propose an elegant curriculum based scheme that
7458 smoothes the feature embedding of a CNN using anti-aliasing or low-pass
7459 filters. We propose to augment the train-ing of CNNs by controlling the amount
7460 of high frequency information propagated within the CNNs as training
7461 progresses, by convolving the output of a CNN feature map of each layer with a
7462 Gaussian kernel. By decreasing the variance of the Gaussian kernel, we
7463 gradually increase the amount of high-frequency information available within
7464 the network for inference. As the amount of information in the feature maps
7465 increases during training, the network is able to progressively learn better
7466 representations of the data. Our proposed augmented training scheme
7467 significantly improves the performance of CNNs on various vision tasks without
7468 either adding additional trainable parameters or an auxiliary regularization
7469 objective. The generality of our method is demonstrated through empirical
7470 performance gains in CNN architectures across four different tasks: transfer
7471 learning, cross-task transfer learning, and generative models.
7472 </p>
7473 </description>
7474 </item>
7475 <item>
7476 <title>Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations. (arXiv:2003.02960v3 [cs.LG] UPDATED)</title>
7477 <link>http://fr.arxiv.org/abs/2003.02960</link>
7478 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Golatkar_A/0/1/0/all/0/1">Aditya Golatkar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Achille_A/0/1/0/all/0/1">Alessandro Achille</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Soatto_S/0/1/0/all/0/1">Stefano Soatto</a></p>
7479
7480 <p>We describe a procedure for removing dependency on a cohort of training data
7481 from a trained deep network that improves upon and generalizes previous methods
7482 to different readout functions and can be extended to ensure forgetting in the
7483 activations of the network. We introduce a new bound on how much information
7484 can be extracted per query about the forgotten cohort from a black-box network
7485 for which only the input-output behavior is observed. The proposed forgetting
7486 procedure has a deterministic part derived from the differential equations of a
7487 linearized version of the model, and a stochastic part that ensures information
7488 destruction by adding noise tailored to the geometry of the loss landscape. We
7489 exploit the connections between the activation and weight dynamics of a DNN
7490 inspired by Neural Tangent Kernels to compute the information in the
7491 activations.
7492 </p>
7493 </description>
7494 </item>
7495 <item>
7496 <title>No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks. (arXiv:2003.03824v2 [eess.IV] UPDATED)</title>
7497 <link>http://fr.arxiv.org/abs/2003.03824</link>
7498 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_S/0/1/0/all/0/1">Siqi Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Setio_A/0/1/0/all/0/1">Arnaud Arindra Adiyoso Setio</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ghesu_F/0/1/0/all/0/1">Florin C. Ghesu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gibson_E/0/1/0/all/0/1">Eli Gibson</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Grbic_S/0/1/0/all/0/1">Sasa Grbic</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Georgescu_B/0/1/0/all/0/1">Bogdan Georgescu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Comaniciu_D/0/1/0/all/0/1">Dorin Comaniciu</a></p>
7499
7500 <p>Detecting malignant pulmonary nodules at an early stage can allow medical
7501 interventions which may increase the survival rate of lung cancer patients.
7502 Using computer vision techniques to detect nodules can improve the sensitivity
7503 and the speed of interpreting chest CT for lung cancer screening. Many studies
7504 have used CNNs to detect nodule candidates. Though such approaches have been
7505 shown to outperform the conventional image processing based methods regarding
7506 the detection accuracy, CNNs are also known to be limited to generalize on
7507 under-represented samples in the training set and prone to imperceptible noise
7508 perturbations. Such limitations can not be easily addressed by scaling up the
7509 dataset or the models. In this work, we propose to add adversarial synthetic
7510 nodules and adversarial attack samples to the training data to improve the
7511 generalization and the robustness of the lung nodule detection systems. To
7512 generate hard examples of nodules from a differentiable nodule synthesizer, we
7513 use projected gradient descent (PGD) to search the latent code within a bounded
7514 neighbourhood that would generate nodules to decrease the detector response. To
7515 make the network more robust to unanticipated noise perturbations, we use PGD
7516 to search for noise patterns that can trigger the network to give
7517 over-confident mistakes. By evaluating on two different benchmark datasets
7518 containing consensus annotations from three radiologists, we show that the
7519 proposed techniques can improve the detection performance on real CT data. To
7520 understand the limitations of both the conventional networks and the proposed
7521 augmented networks, we also perform stress-tests on the false positive
7522 reduction networks by feeding different types of artificially produced patches.
7523 We show that the augmented networks are more robust to both under-represented
7524 nodules as well as resistant to noise perturbations.
7525 </p>
7526 </description>
7527 </item>
7528 <item>
7529 <title>Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule. (arXiv:2003.03977v4 [cs.LG] UPDATED)</title>
7530 <link>http://fr.arxiv.org/abs/2003.03977</link>
7531 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Iyer_N/0/1/0/all/0/1">Nikhil Iyer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Thejas_V/0/1/0/all/0/1">V Thejas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kwatra_N/0/1/0/all/0/1">Nipun Kwatra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramjee_R/0/1/0/all/0/1">Ramachandran Ramjee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sivathanu_M/0/1/0/all/0/1">Muthian Sivathanu</a></p>
7532
7533 <p>Several papers argue that wide minima generalize better than narrow minima.
7534 In this paper, through detailed experiments that not only corroborate the
7535 generalization properties of wide minima, we also provide empirical evidence
7536 for a new hypothesis that the density of wide minima is likely lower than the
7537 density of narrow minima. Further, motivated by this hypothesis, we design a
7538 novel explore-exploit learning rate schedule. On a variety of image and natural
7539 language datasets, compared to their original hand-tuned learning rate
7540 baselines, we show that our explore-exploit schedule can result in either up to
7541 0.84% higher absolute accuracy using the original training budget or up to 57%
7542 reduced training time while achieving the original reported accuracy. For
7543 example, we achieve state-of-the-art (SOTA) accuracy for IWSLT'14 (DE-EN) and
7544 WMT'14 (DE-EN) datasets by just modifying the learning rate schedule of a high
7545 performing model.
7546 </p>
7547 </description>
7548 </item>
7549 <item>
7550 <title>Compressive Isogeometric Analysis. (arXiv:2003.06475v2 [math.NA] UPDATED)</title>
7551 <link>http://fr.arxiv.org/abs/2003.06475</link>
7552 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Brugiapaglia_S/0/1/0/all/0/1">Simone Brugiapaglia</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Tamellini_L/0/1/0/all/0/1">Lorenzo Tamellini</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Tani_M/0/1/0/all/0/1">Mattia Tani</a></p>
7553
7554 <p>This work is motivated by the difficulty in assembling the Galerkin matrix
7555 when solving Partial Differential Equations (PDEs) with Isogeometric Analysis
7556 (IGA) using B-splines of moderate-to-high polynomial degree. To mitigate this
7557 problem, we propose a novel methodology named CossIGA (COmpreSSive IsoGeometric
7558 Analysis), which combines the IGA principle with CORSING, a recently introduced
7559 sparse recovery approach for PDEs based on compressive sensing. CossIGA
7560 assembles only a small portion of a suitable IGA Petrov-Galerkin discretization
7561 and is effective whenever the PDE solution is sufficiently sparse or
7562 compressible, i.e., when most of its coefficients are zero or negligible. The
7563 sparsity of the solution is promoted by employing a multilevel dictionary of
7564 B-splines as opposed to a basis. Thanks to sparsity and the fact that only a
7565 fraction of the full discretization matrix is assembled, the proposed technique
7566 has the potential to lead to significant computational savings. We show the
7567 effectiveness of CossIGA for the solution of the 2D and 3D Poisson equation
7568 over nontrivial geometries by means of an extensive numerical investigation.
7569 </p>
7570 </description>
7571 </item>
7572 <item>
7573 <title>Thermodynamic Cost of Edge Detection in Artificial Neural Network(ANN)-Based Processors. (arXiv:2003.08196v2 [eess.IV] UPDATED)</title>
7574 <link>http://fr.arxiv.org/abs/2003.08196</link>
7575 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Barisik_S/0/1/0/all/0/1">Se&#xe7;kin Bar&#x131;&#x15f;&#x131;k</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ercan_I/0/1/0/all/0/1">&#x130;lke Ercan</a></p>
7576
7577 <p>Architecture-based heat dissipation analyses allow us to reveal fundamental
7578 sources of inefficiency in a given processor and thereby provide us with
7579 road-maps to design less dissipative computing schemes independent of
7580 technology-base used to implement them. In this work, we study
7581 architectural-level contributions to energy dissipation in an Artificial Neural
7582 Network (ANN)-based processor that is trained to perform edge-detection task.
7583 We compare the training and information processing cost of ANN to that of
7584 conventional architectures and algorithms using 64-pixel binary image. Our
7585 results reveal the inherent efficiency advantages of an ANN network trained for
7586 specific tasks over general-purpose processors based on von Neumann
7587 architecture. We also compare the proposed performance improvements to that of
7588 Cellular Array Processors (CAPs) and illustrate the reduction in dissipation
7589 for special purpose processors. Lastly, we calculate the change in dissipation
7590 as a result of input data structure and show the effect of randomness on
7591 energetic cost of information processing. The results we obtained provide a
7592 basis for comparison for task-based fundamental energy efficiency analyses for
7593 a range of processors and therefore contribute to the study of
7594 architecture-level descriptions of processors and thermodynamic cost
7595 calculations based on physics of computation.
7596 </p>
7597 </description>
7598 </item>
7599 <item>
7600 <title>On Calibration of Mixup Training for Deep Neural Networks. (arXiv:2003.09946v3 [cs.LG] UPDATED)</title>
7601 <link>http://fr.arxiv.org/abs/2003.09946</link>
7602 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Maronas_J/0/1/0/all/0/1">Juan Maro&#xf1;as</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramos_D/0/1/0/all/0/1">Daniel Ramos</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Paredes_R/0/1/0/all/0/1">Roberto Paredes</a></p>
7603
7604 <p>Deep Neural Networks (DNN) represent the state of the art in many tasks.
7605 However, due to their overparameterization, their generalization capabilities
7606 are in doubt and still a field under study. Consequently, DNN can overfit and
7607 assign overconfident predictions -- effects that have been shown to affect the
7608 calibration of the confidences assigned to unseen data. Data Augmentation (DA)
7609 strategies have been proposed to regularize these models, being Mixup one of
7610 the most popular due to its ability to improve the accuracy, the uncertainty
7611 quantification and the calibration of DNN. In this work however we argue and
7612 provide empirical evidence that, due to its fundamentals, Mixup does not
7613 necessarily improve calibration. Based on our observations we propose a new
7614 loss function that improves the calibration, and also sometimes the accuracy,
7615 of DNN trained with this DA technique. Our loss is inspired by Bayes decision
7616 theory and introduces a new training framework for designing losses for
7617 probabilistic modelling. We provide state-of-the-art accuracy with consistent
7618 improvements in calibration performance. Appendix and code are provided here:
7619 https://github.com/jmaronas/calibration_MixupDNN_ARCLoss.pytorch.git
7620 </p>
7621 </description>
7622 </item>
7623 <item>
7624 <title>Unique Chinese Linguistic Phenomena. (arXiv:2004.00499v3 [cs.CL] UPDATED)</title>
7625 <link>http://fr.arxiv.org/abs/2004.00499</link>
7626 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jia_S/0/1/0/all/0/1">Shengbin Jia</a></p>
7627
7628 <p>Linguistics holds unique characteristics of generality, stability, and
7629 nationality, which will affect the formulation of extraction strategies and
7630 should be incorporated into the relation extraction. Chinese open relation
7631 extraction is not well-established, because of the complexity of Chinese
7632 linguistics makes it harder to operate, and the methods for English are not
7633 compatible with that for Chinese. The diversities between Chinese and English
7634 linguistics are mainly reflected in morphology and syntax.
7635 </p>
7636 </description>
7637 </item>
7638 <item>
7639 <title>Is Graph Structure Necessary for Multi-hop Question Answering?. (arXiv:2004.03096v2 [cs.CL] UPDATED)</title>
7640 <link>http://fr.arxiv.org/abs/2004.03096</link>
7641 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shao_N/0/1/0/all/0/1">Nan Shao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_Y/0/1/0/all/0/1">Yiming Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_T/0/1/0/all/0/1">Ting Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shijin Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_G/0/1/0/all/0/1">Guoping Hu</a></p>
7642
7643 <p>Recently, attempting to model texts as graph structure and introducing graph
7644 neural networks to deal with it has become a trend in many NLP research areas.
7645 In this paper, we investigate whether the graph structure is necessary for
7646 multi-hop question answering. Our analysis is centered on HotpotQA. We
7647 construct a strong baseline model to establish that, with the proper use of
7648 pre-trained models, graph structure may not be necessary for multi-hop question
7649 answering. We point out that both graph structure and adjacency matrix are
7650 task-related prior knowledge, and graph-attention can be considered as a
7651 special case of self-attention. Experiments and visualized analysis demonstrate
7652 that graph-attention or the entire graph structure can be replaced by
7653 self-attention or Transformers.
7654 </p>
7655 </description>
7656 </item>
7657 <item>
7658 <title>Risk-Constrained Linear-Quadratic Regulators. (arXiv:2004.04685v2 [eess.SY] UPDATED)</title>
7659 <link>http://fr.arxiv.org/abs/2004.04685</link>
7660 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Tsiamis_A/0/1/0/all/0/1">Anastasios Tsiamis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kalogerias_D/0/1/0/all/0/1">Dionysios S. Kalogerias</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chamon_L/0/1/0/all/0/1">Luiz F. O. Chamon</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ribeiro_A/0/1/0/all/0/1">Alejandro Ribeiro</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pappas_G/0/1/0/all/0/1">George J. Pappas</a></p>
7661
7662 <p>We propose a new risk-constrained reformulation of the standard Linear
7663 Quadratic Regulator (LQR) problem. Our framework is motivated by the fact that
7664 the classical (risk-neutral) LQR controller, although optimal in expectation,
7665 might be ineffective under relatively infrequent, yet statistically significant
7666 (risky) events. To effectively trade between average and extreme event
7667 performance, we introduce a new risk constraint, which explicitly restricts the
7668 total expected predictive variance of the state penalty by a user-prescribed
7669 level. We show that, under rather minimal conditions on the process noise
7670 (i.e., finite fourth-order moments), the optimal risk-aware controller can be
7671 evaluated explicitly and in closed form. In fact, it is affine relative to the
7672 state, and is always internally stable regardless of parameter tuning. Our new
7673 risk-aware controller: i) pushes the state away from directions where the noise
7674 exhibits heavy tails, by exploiting the third-order moment (skewness) of the
7675 noise; ii) inflates the state penalty in riskier directions, where both the
7676 noise covariance and the state penalty are simultaneously large. The properties
7677 of the proposed risk-aware LQR framework are also illustrated via indicative
7678 numerical examples.
7679 </p>
7680 </description>
7681 </item>
7682 <item>
7683 <title>Supervised Contrastive Learning. (arXiv:2004.11362v2 [cs.LG] UPDATED)</title>
7684 <link>http://fr.arxiv.org/abs/2004.11362</link>
7685 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Khosla_P/0/1/0/all/0/1">Prannay Khosla</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Teterwak_P/0/1/0/all/0/1">Piotr Teterwak</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_C/0/1/0/all/0/1">Chen Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sarna_A/0/1/0/all/0/1">Aaron Sarna</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_Y/0/1/0/all/0/1">Yonglong Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Isola_P/0/1/0/all/0/1">Phillip Isola</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maschinot_A/0/1/0/all/0/1">Aaron Maschinot</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_C/0/1/0/all/0/1">Ce Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Krishnan_D/0/1/0/all/0/1">Dilip Krishnan</a></p>
7686
7687 <p>Contrastive learning applied to self-supervised representation learning has
7688 seen a resurgence in recent years, leading to state of the art performance in
7689 the unsupervised training of deep image models. Modern batch contrastive
7690 approaches subsume or significantly outperform traditional contrastive losses
7691 such as triplet, max-margin and the N-pairs loss. In this work, we extend the
7692 self-supervised batch contrastive approach to the fully-supervised setting,
7693 allowing us to effectively leverage label information. Clusters of points
7694 belonging to the same class are pulled together in embedding space, while
7695 simultaneously pushing apart clusters of samples from different classes. We
7696 analyze two possible versions of the supervised contrastive (SupCon) loss,
7697 identifying the best-performing formulation of the loss. On ResNet-200, we
7698 achieve top-1 accuracy of 81.4% on the ImageNet dataset, which is 0.8% above
7699 the best number reported for this architecture. We show consistent
7700 outperformance over cross-entropy on other datasets and two ResNet variants.
7701 The loss shows benefits for robustness to natural corruptions and is more
7702 stable to hyperparameter settings such as optimizers and data augmentations. In
7703 reduced data settings, it outperforms cross-entropy significantly. Our loss
7704 function is simple to implement, and reference TensorFlow code is released at
7705 https://t.ly/supcon.
7706 </p>
7707 </description>
7708 </item>
7709 <item>
7710 <title>An Epidemiological Modelling Approach for Covid19 via Data Assimilation. (arXiv:2004.12130v3 [stat.AP] UPDATED)</title>
7711 <link>http://fr.arxiv.org/abs/2004.12130</link>
7712 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Nadler_P/0/1/0/all/0/1">Philip Nadler</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wang_S/0/1/0/all/0/1">Shuo Wang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Arcucci_R/0/1/0/all/0/1">Rossella Arcucci</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Yang_X/0/1/0/all/0/1">Xian Yang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Guo_Y/0/1/0/all/0/1">Yike Guo</a></p>
7713
7714 <p>The global pandemic of the 2019-nCov requires the evaluation of policy
7715 interventions to mitigate future social and economic costs of quarantine
7716 measures worldwide. We propose an epidemiological model for forecasting and
7717 policy evaluation which incorporates new data in real-time through variational
7718 data assimilation. We analyze and discuss infection rates in China, the US and
7719 Italy. In particular, we develop a custom compartmental SIR model fit to
7720 variables related to the epidemic in Chinese cities, named SITR model. We
7721 compare and discuss model results which conducts updates as new observations
7722 become available. A hybrid data assimilation approach is applied to make
7723 results robust to initial conditions. We use the model to do inference on
7724 infection numbers as well as parameters such as the disease transmissibility
7725 rate or the rate of recovery. The parameterisation of the model is parsimonious
7726 and extendable, allowing for the incorporation of additional data and
7727 parameters of interest. This allows for scalability and the extension of the
7728 model to other locations or the adaption of novel data sources.
7729 </p>
7730 </description>
7731 </item>
7732 <item>
7733 <title>Holistic Privacy for Electricity, Water, and Natural Gas Metering in Next Generation Smart Homes. (arXiv:2004.13363v3 [eess.SY] UPDATED)</title>
7734 <link>http://fr.arxiv.org/abs/2004.13363</link>
7735 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Kement_C/0/1/0/all/0/1">Cihan Emre Kement</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Tavli_B/0/1/0/all/0/1">Bulent Tavli</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gultekin_H/0/1/0/all/0/1">Hakan Gultekin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Yanikomeroglu_H/0/1/0/all/0/1">Halim Yanikomeroglu</a></p>
7736
7737 <p>In smart electricity grids, high time granularity (HTG) power consumption
7738 data can be decomposed into individual appliance load signatures via
7739 Nonintrusive Appliance Load Monitoring techniques to expose appliance usage
7740 profiles. Various methods ranging from load shaping to noise addition and data
7741 aggregation have been proposed to mitigate this problem. However, with the
7742 growing scarcity of natural resources, utilities other than electricity (such
7743 as water and natural gas) have also begun to be subject to HTG metering, which
7744 creates privacy issues similar to that of electricity. Therefore, employing
7745 privacy protection countermeasures for only electricity usage is ineffective
7746 for appliances that utilize additional/other metered resources. As such,
7747 existing privacy countermeasures and metrics need to be reevaluated to address
7748 not only electricity, but also any other resource that is metered. Furthermore,
7749 a holistic privacy protection approach for all metered resources must be
7750 adopted as the information leak from any of the resources has a potential to
7751 render the privacy preserving countermeasures for all the other resources
7752 futile. This paper introduces the privacy preservation problem for multiple HTG
7753 metered resources and explores potential solutions for its mitigation.
7754 </p>
7755 </description>
7756 </item>
7757 <item>
7758 <title>Geometric group testing. (arXiv:2004.14632v3 [cs.CG] UPDATED)</title>
7759 <link>http://fr.arxiv.org/abs/2004.14632</link>
7760 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Berendsohn_B/0/1/0/all/0/1">Benjamin Aram Berendsohn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kozma_L/0/1/0/all/0/1">L&#xe1;szl&#xf3; Kozma</a></p>
7761
7762 <p>Group testing is concerned with identifying $t$ defective items in a set of
7763 $m$ items, where each test reports whether a specific subset of items contains
7764 at least one defective. In non-adaptive group testing, the subsets to be tested
7765 are fixed in advance. By testing multiple items at once, the required number of
7766 tests can be made much smaller than $m$. In fact, for $t \in \mathcal{O}(1)$,
7767 the optimal number of (non-adaptive) tests is known to be $\Theta(\log{m})$.
7768 </p>
7769 <p>In this paper, we consider the problem of non-adaptive group testing in a
7770 geometric setting, where the items are points in $d$-dimensional Euclidean
7771 space and the tests are axis-parallel boxes (hyperrectangles). We present upper
7772 and lower bounds on the required number of tests under this geometric
7773 constraint. In contrast to the general, combinatorial case, the bounds in our
7774 geometric setting are polynomial in $m$. For instance, our results imply that
7775 identifying a defective pair in a set of $m$ points in the plane always
7776 requires $\Omega(m^{3/5})$ tests, and there exist configurations of $m$ points
7777 for which $\mathcal{O}(m^{2/3})$ tests are sufficient, whereas to identify a
7778 single defective point in the plane, $\Theta(m^{1/2})$ tests are always
7779 necessary and sometimes sufficient.
7780 </p>
7781 </description>
7782 </item>
7783 <item>
7784 <title>Minimum Cuts in Geometric Intersection Graphs. (arXiv:2005.00858v2 [cs.CG] UPDATED)</title>
7785 <link>http://fr.arxiv.org/abs/2005.00858</link>
7786 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Cabello_S/0/1/0/all/0/1">Sergio Cabello</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mulzer_W/0/1/0/all/0/1">Wolfgang Mulzer</a></p>
7787
7788 <p>Let $\mathcal{D}$ be a set of $n$ disks in the plane. The disk graph
7789 $G_\mathcal{D}$ for $\mathcal{D}$ is the undirected graph with vertex set
7790 $\mathcal{D}$ in which two disks are joined by an edge if and only if they
7791 intersect. The directed transmission graph $G^{\rightarrow}_\mathcal{D}$ for
7792 $\mathcal{D}$ is the directed graph with vertex set $\mathcal{D}$ in which
7793 there is an edge from a disk $D_1 \in \mathcal{D}$ to a disk $D_2 \in
7794 \mathcal{D}$ if and only if $D_1$ contains the center of $D_2$.
7795 </p>
7796 <p>Given $\mathcal{D}$ and two non-intersecting disks $s, t \in \mathcal{D}$, we
7797 show that a minimum $s$-$t$ vertex cut in $G_\mathcal{D}$ or in
7798 $G^{\rightarrow}_\mathcal{D}$ can be found in $O(n^{3/2}\text{polylog} n)$
7799 expected time. To obtain our result, we combine an algorithm for the maximum
7800 flow problem in general graphs with dynamic geometric data structures to
7801 manipulate the disks.
7802 </p>
7803 <p>As an application, we consider the barrier resilience problem in a
7804 rectangular domain. In this problem, we have a vertical strip $S$ bounded by
7805 two vertical lines, $L_\ell$ and $L_r$, and a collection $\mathcal{D}$ of
7806 disks. Let $a$ be a point in $S$ above all disks of $\mathcal{D}$, and let $b$
7807 a point in $S$ below all disks of $\mathcal{D}$. The task is to find a curve
7808 from $a$ to $b$ that lies in $S$ and that intersects as few disks of
7809 $\mathcal{D}$ as possible. Using our improved algorithm for minimum cuts in
7810 disk graphs, we can solve the barrier resilience problem in
7811 $O(n^{3/2}\text{polylog} n)$ expected time.
7812 </p>
7813 </description>
7814 </item>
7815 <item>
7816 <title>Model Creation and Equivalence Proofs of Cellular Automata and Artificial Neural Networks. (arXiv:2005.01192v3 [cs.NE] UPDATED)</title>
7817 <link>http://fr.arxiv.org/abs/2005.01192</link>
7818 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Christen_P/0/1/0/all/0/1">Patrik Christen</a></p>
7819
7820 <p>Computational methods and mathematical models have invaded arguably every
7821 scientific discipline forming its own field of research called computational
7822 science. Mathematical models are the theoretical foundation of computational
7823 science. Since Newton's time, differential equations in mathematical models
7824 have been widely and successfully used to describe the macroscopic or global
7825 behaviour of systems. With spatially inhomogeneous, time-varying, local
7826 element-specific, and often non-linear interactions, the dynamics of complex
7827 systems is in contrast more efficiently described by local rules and thus in an
7828 algorithmic and local or microscopic manner. The theory of mathematical
7829 modelling taking into account these characteristics of complex systems has to
7830 be established still. We recently presented a so-called allagmatic method
7831 including a system metamodel to provide a framework for describing, modelling,
7832 simulating, and interpreting complex systems. Implementations of cellular
7833 automata and artificial neural networks were described and created with that
7834 method. Guidance from philosophy were helpful in these first studies focusing
7835 on programming and feasibility. A rigorous mathematical formalism, however, is
7836 still missing. This would not only more precisely describe and define the
7837 system metamodel, it would also further generalise it and with that extend its
7838 reach to formal treatment in applied mathematics and theoretical aspects of
7839 computational science as well as extend its applicability to other mathematical
7840 and computational models such as agent-based models. Here, a mathematical
7841 definition of the system metamodel is provided. Based on the presented
7842 formalism, model creation and equivalence of cellular automata and artificial
7843 neural networks are proved. It thus provides a formal approach for studying the
7844 creation of mathematical models as well as their structural and operational
7845 comparison.
7846 </p>
7847 </description>
7848 </item>
7849 <item>
7850 <title>Analysis of the Symmetric Join the Shortest Orbit Queue. (arXiv:2005.02683v2 [math.PR] UPDATED)</title>
7851 <link>http://fr.arxiv.org/abs/2005.02683</link>
7852 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Dimitriou_I/0/1/0/all/0/1">Ioannis Dimitriou</a></p>
7853
7854 <p>This work introduces the join the shortest queue policy in the retrial
7855 setting. We consider a Markovian single server retrial system with two infinite
7856 capacity orbits. An arriving job finding the server busy, it is forwarded to
7857 the least loaded orbit. Otherwise, it is forwarded to an orbit randomly.
7858 Orbiting jobs of either type retry to access the server independently. We
7859 investigate the stability condition, the stationary tail decay rate, and obtain
7860 the equilibrium distribution by using the compensation method.
7861 </p>
7862 </description>
7863 </item>
7864 <item>
7865 <title>Anonymized GCN: A Novel Robust Graph Embedding Method via Hiding Node Position in Noise. (arXiv:2005.03482v2 [cs.LG] UPDATED)</title>
7866 <link>http://fr.arxiv.org/abs/2005.03482</link>
7867 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_A/0/1/0/all/0/1">Ao Liu</a></p>
7868
7869 <p>Graph convolution network (GCN) have achieved state-of-the-art performance in
7870 the task of node prediction in the graph structure. However, with the gradual
7871 various of graph attack methods, there are lack of research on the robustness
7872 of GCN. In this paper, we prove the reason why GCN is vulnerable to attack:
7873 only training another GCN model can find the vulnerability of the target GCN
7874 model. To solve that, we propose a GCN model which is robust to attacks. By
7875 hiding the node's position in the Gaussian noise, the attacker will not be able
7876 to modify the connection information of the graph node, thus immune to the
7877 attack. Considering attackers usually modify the connection to interfere the
7878 prediction results of the target node, so, by hiding the connection of the
7879 graph in the noise through adversarial training, accurate node prediction can
7880 be completed only by the node number rather than its specific position in the
7881 graph, thus let the nodes in the graph are no longer related to the graph
7882 itself, that is to say, make the node anonymous. Specifically, we first
7883 demonstrated the key to determine the embedding of a specific node: the row
7884 corresponding to the node of the eigenmatrix of the Laplace matrix, by target
7885 it as the output of the generator, we take the corresponding noise as input.
7886 The generator will try to find the correct position of the node in the graph.
7887 Then the encoder and decoder are spliced both in discriminator, so that after
7888 adversarial training, the generator and discriminator can cooperate to complete
7889 the node prediction. Finally, All node positions can generated by noise at the
7890 same time, that is to say, the generator will hides all the connection
7891 information of the graph structure. The evaluation shows that we only need to
7892 obtain the initial features and node numbers of the nodes to complete the node
7893 prediction, and the accuracy did not decrease, but increased by 0.0293.
7894 </p>
7895 </description>
7896 </item>
7897 <item>
7898 <title>InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs. (arXiv:2005.09635v2 [cs.CV] UPDATED)</title>
7899 <link>http://fr.arxiv.org/abs/2005.09635</link>
7900 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_Y/0/1/0/all/0/1">Yujun Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_C/0/1/0/all/0/1">Ceyuan Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_X/0/1/0/all/0/1">Xiaoou Tang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_B/0/1/0/all/0/1">Bolei Zhou</a></p>
7901
7902 <p>Although Generative Adversarial Networks (GANs) have made significant
7903 progress in face synthesis, there lacks enough understanding of what GANs have
7904 learned in the latent representation to map a random code to a photo-realistic
7905 image. In this work, we propose a framework called InterFaceGAN to interpret
7906 the disentangled face representation learned by the state-of-the-art GAN models
7907 and study the properties of the facial semantics encoded in the latent space.
7908 We first find that GANs learn various semantics in some linear subspaces of the
7909 latent space. After identifying these subspaces, we can realistically
7910 manipulate the corresponding facial attributes without retraining the model. We
7911 then conduct a detailed study on the correlation between different semantics
7912 and manage to better disentangle them via subspace projection, resulting in
7913 more precise control of the attribute manipulation. Besides manipulating the
7914 gender, age, expression, and presence of eyeglasses, we can even alter the face
7915 pose and fix the artifacts accidentally made by GANs. Furthermore, we perform
7916 an in-depth face identity analysis and a layer-wise analysis to evaluate the
7917 editing results quantitatively. Finally, we apply our approach to real face
7918 editing by employing GAN inversion approaches and explicitly training
7919 feed-forward models based on the synthetic data established by InterFaceGAN.
7920 Extensive experimental results suggest that learning to synthesize faces
7921 spontaneously brings a disentangled and controllable face representation.
7922 </p>
7923 </description>
7924 </item>
7925 <item>
7926 <title>Stochastic control liasons: Richard Sinkhorn meets Gaspard Monge on a Schroedinger bridge. (arXiv:2005.10963v2 [math.OC] UPDATED)</title>
7927 <link>http://fr.arxiv.org/abs/2005.10963</link>
7928 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Chen_Y/0/1/0/all/0/1">Yongxin Chen</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Georgiou_T/0/1/0/all/0/1">Tryphon T. Georgiou</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Pavon_M/0/1/0/all/0/1">Michele Pavon</a></p>
7929
7930 <p>In 1931/32, Schroedinger studied a hot gas Gedankenexperiment, an instance of
7931 large deviations of the empirical distribution and an early example of the
7932 so-called maximum entropy inference method. This so-called Schroedinger bridge
7933 problem (SBP) was recently recognized as a regularization of the
7934 Monge-Kantorovich Optimal Mass Transport (OMT), leading to effective
7935 computation of the latter. Specifically, OMT with quadratic cost may be viewed
7936 as a zero-temperature limit of SBP, which amounts to minimization of the
7937 Helmholtz's free energy over probability distributions constrained to possess
7938 given marginals. The problem features a delicate compromise, mediated by a
7939 temperature parameter, between minimizing the internal energy and maximizing
7940 the entropy. These concepts are central to a rapidly expanding area of modern
7941 science dealing with the so-called {\em Sinkhorn algorithm} which appears as a
7942 special case of an algorithm first studied by the French analyst Robert Fortet
7943 in 1938/40 specifically for Schroedinger bridges. Due to the constraint on
7944 end-point distributions, dynamic programming is not a suitable tool to attack
7945 these problems. Instead, Fortet's iterative algorithm and its discrete
7946 counterpart, the Sinkhorn iteration, permit computation by iteratively solving
7947 the so-called {\em Schroedinger system}. In both the continuous as well as the
7948 discrete-time and space settings, {\em stochastic control} provides a
7949 reformulation and dynamic versions of these problems. The formalism behind
7950 these control problems have attracted attention as they lead to a variety of
7951 new applications in spacecraft guidance, control of robot or biological swarms,
7952 sensing, active cooling, network routing as well as in computer and data
7953 science. This multifacet and versatile framework, intertwining SBP and OMT,
7954 provides the substrate for a historical and technical overview of the field
7955 taken up in this paper.
7956 </p>
7957 </description>
7958 </item>
7959 <item>
7960 <title>Multivariate Quasi-tight Framelets with High Balancing Orders Derived from Any Compactly Supported Refinable Vector Functions. (arXiv:2005.12451v2 [math.FA] UPDATED)</title>
7961 <link>http://fr.arxiv.org/abs/2005.12451</link>
7962 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Han_B/0/1/0/all/0/1">Bin Han</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lu_R/0/1/0/all/0/1">Ran Lu</a></p>
7963
7964 <p>Generalizing wavelets by adding desired redundancy and flexibility,framelets
7965 are of interest and importance in many applications such as image processing
7966 and numerical algorithms. Several key properties of framelets are high
7967 vanishing moments for sparse multiscale representation, fast framelet
7968 transforms for numerical efficiency, and redundancy for robustness. However, it
7969 is a challenging problem to study and construct multivariate nonseparable
7970 framelets, mainly due to their intrinsic connections to factorization and
7971 syzygy modules of multivariate polynomial matrices. In this paper, we
7972 circumvent the above difficulties through the approach of quasi-tight
7973 framelets, which behave almost identically to tight framelets. Employing the
7974 popular oblique extension principle (OEP), from an arbitrary compactly
7975 supported $\dm$-refinable vector function $\phi$ with multiplicity greater than
7976 one, we prove that we can always derive from $\phi$ a compactly supported
7977 multivariate quasi-tight framelet such that (i) all the framelet generators
7978 have the highest possible order of vanishing moments;(ii) its associated fast
7979 framelet transform is compact with the highest balancing order.For a refinable
7980 scalar function $\phi$, the above item (ii) often cannot be achieved
7981 intrinsically but we show that we can always construct a compactly supported
7982 OEP-based multivariate quasi-tight framelet derived from $\phi$ satisfying item
7983 (i).This paper provides a comprehensive investigation on OEP-based multivariate
7984 quasi-tight multiframelets and their associated framelet transforms with high
7985 balancing orders. This deepens our theoretical understanding of multivariate
7986 quasi-tight multiframelets and their associated fast multiframelet transforms.
7987 </p>
7988 </description>
7989 </item>
7990 <item>
7991 <title>Refining Implicit Argument Annotation for UCCA. (arXiv:2005.12889v2 [cs.CL] UPDATED)</title>
7992 <link>http://fr.arxiv.org/abs/2005.12889</link>
7993 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_R/0/1/0/all/0/1">Ruixiang Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hershcovich_D/0/1/0/all/0/1">Daniel Hershcovich</a></p>
7994
7995 <p>Predicate-argument structure analysis is a central component in meaning
7996 representations of text. The fact that some arguments are not explicitly
7997 mentioned in a sentence gives rise to ambiguity in language understanding, and
7998 renders it difficult for machines to interpret text correctly. However, only
7999 few resources represent implicit roles for NLU, and existing studies in NLP
8000 only make coarse distinctions between categories of arguments omitted from
8001 linguistic form. This paper proposes a typology for fine-grained implicit
8002 argument annotation on top of Universal Conceptual Cognitive Annotation's
8003 foundational layer. The proposed implicit argument categorisation is driven by
8004 theories of implicit role interpretation and consists of six types: Deictic,
8005 Generic, Genre-based, Type-identifiable, Non-specific, and Iterated-set. We
8006 exemplify our design by revisiting part of the UCCA EWT corpus, providing a new
8007 dataset annotated with the refinement layer, and making a comparative analysis
8008 with other schemes.
8009 </p>
8010 </description>
8011 </item>
8012 <item>
8013 <title>An Empirical Study of Bots in Software Development -- Characteristics and Challenges from a Practitioner's Perspective. (arXiv:2005.13969v2 [cs.SE] UPDATED)</title>
8014 <link>http://fr.arxiv.org/abs/2005.13969</link>
8015 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Erlenhov_L/0/1/0/all/0/1">Linda Erlenhov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Neto_F/0/1/0/all/0/1">Francisco Gomes de Oliveira Neto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leitner_P/0/1/0/all/0/1">Philipp Leitner</a></p>
8016
8017 <p>Software engineering bots - automated tools that handle tedious tasks - are
8018 increasingly used by industrial and open source projects to improve developer
8019 productivity. Current research in this area is held back by a lack of consensus
8020 of what software engineering bots (DevBots) actually are, what characteristics
8021 distinguish them from other tools, and what benefits and challenges are
8022 associated with DevBot usage. In this paper we report on a mixed-method
8023 empirical study of DevBot usage in industrial practice. We report on findings
8024 from interviewing 21 and surveying a total of 111 developers. We identify three
8025 different personas among DevBot users (focusing on autonomy, chat interfaces,
8026 and "smartness"), each with different definitions of what a DevBot is, why
8027 developers use them, and what they struggle with. We conclude that future
8028 DevBot research should situate their work within our framework, to clearly
8029 identify what type of bot the work targets, and what advantages practitioners
8030 can expect. Further, we find that there currently is a lack of general purpose
8031 "smart" bots that go beyond simple automation tools or chat interfaces. This is
8032 problematic, as we have seen that such bots, if available, can have a
8033 transformative effect on the projects that use them.
8034 </p>
8035 </description>
8036 </item>
8037 <item>
8038 <title>Sub-Band Knowledge Distillation Framework for Speech Enhancement. (arXiv:2005.14435v2 [eess.AS] UPDATED)</title>
8039 <link>http://fr.arxiv.org/abs/2005.14435</link>
8040 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wen_S/0/1/0/all/0/1">Shixue Wen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_Y/0/1/0/all/0/1">Yun Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gao_G/0/1/0/all/0/1">Guanglai Gao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_X/0/1/0/all/0/1">Xiaofei Li</a></p>
8041
8042 <p>In single-channel speech enhancement, methods based on full-band spectral
8043 features have been widely studied. However, only a few methods pay attention to
8044 non-full-band spectral features. In this paper, we explore a knowledge
8045 distillation framework based on sub-band spectral mapping for single-channel
8046 speech enhancement. Specifically, we divide the full frequency band into
8047 multiple sub-bands and pre-train an elite-level sub-band enhancement model
8048 (teacher model) for each sub-band. These teacher models are dedicated to
8049 processing their own sub-bands. Next, under the teacher models' guidance, we
8050 train a general sub-band enhancement model (student model) that works for all
8051 sub-bands. Without increasing the number of model parameters and computational
8052 complexity, the student model's performance is further improved. To evaluate
8053 our proposed method, we conducted a large number of experiments on an
8054 open-source data set. The final experimental results show that the guidance
8055 from the elite-level teacher models dramatically improves the student model's
8056 performance, which exceeds the full-band model by employing fewer parameters.
8057 </p>
8058 </description>
8059 </item>
8060 <item>
8061 <title>SNR-Based Teachers-Student Technique for Speech Enhancement. (arXiv:2005.14441v2 [eess.AS] UPDATED)</title>
8062 <link>http://fr.arxiv.org/abs/2005.14441</link>
8063 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_Z/0/1/0/all/0/1">Zhiyu Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_Q/0/1/0/all/0/1">Qiang Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_H/0/1/0/all/0/1">Huali Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gao_G/0/1/0/all/0/1">Guanglai Gao</a></p>
8064
8065 <p>It is very challenging for speech enhancement methods to achieves robust
8066 performance under both high signal-to-noise ratio (SNR) and low SNR
8067 simultaneously. In this paper, we propose a method that integrates an SNR-based
8068 teachers-student technique and time-domain U-Net to deal with this problem.
8069 Specifically, this method consists of multiple teacher models and a student
8070 model. We first train the teacher models under multiple small-range SNRs that
8071 do not coincide with each other so that they can perform speech enhancement
8072 well within the specific SNR range. Then, we choose different teacher models to
8073 supervise the training of the student model according to the SNR of the
8074 training data. Eventually, the student model can perform speech enhancement
8075 under both high SNR and low SNR. To evaluate the proposed method, we
8076 constructed a dataset with an SNR ranging from -20dB to 20dB based on the
8077 public dataset. We experimentally analyzed the effectiveness of the SNR-based
8078 teachers-student technique and compared the proposed method with several
8079 state-of-the-art methods.
8080 </p>
8081 </description>
8082 </item>
8083 <item>
8084 <title>A mathematical model for automatic differentiation in machine learning. (arXiv:2006.02080v2 [cs.LG] UPDATED)</title>
8085 <link>http://fr.arxiv.org/abs/2006.02080</link>
8086 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bolte_J/0/1/0/all/0/1">Jerome Bolte</a> (TSE), <a href="http://fr.arxiv.org/find/cs/1/au:+Pauwels_E/0/1/0/all/0/1">Edouard Pauwels</a> (IRIT-ADRIA)</p>
8087
8088 <p>Automatic differentiation, as implemented today, does not have a simple
8089 mathematical model adapted to the needs of modern machine learning. In this
8090 work we articulate the relationships between differentiation of programs as
8091 implemented in practice and differentiation of nonsmooth functions. To this end
8092 we provide a simple class of functions, a nonsmooth calculus, and show how they
8093 apply to stochastic approximation methods. We also evidence the issue of
8094 artificial critical points created by algorithmic differentiation and show how
8095 usual methods avoid these points with probability one.
8096 </p>
8097 </description>
8098 </item>
8099 <item>
8100 <title>Convolutional Neural Networks for Global Human Settlements Mapping from Sentinel-2 Satellite Imagery. (arXiv:2006.03267v2 [eess.IV] UPDATED)</title>
8101 <link>http://fr.arxiv.org/abs/2006.03267</link>
8102 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Corbane_C/0/1/0/all/0/1">Christina Corbane</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Syrris_V/0/1/0/all/0/1">Vasileios Syrris</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sabo_F/0/1/0/all/0/1">Filip Sabo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Politis_P/0/1/0/all/0/1">Panagiotis Politis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Melchiorri_M/0/1/0/all/0/1">Michele Melchiorri</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pesaresi_M/0/1/0/all/0/1">Martino Pesaresi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Soille_P/0/1/0/all/0/1">Pierre Soille</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kemper_T/0/1/0/all/0/1">Thomas Kemper</a></p>
8103
8104 <p>Spatially consistent and up-to-date maps of human settlements are crucial for
8105 addressing policies related to urbanization and sustainability, especially in
8106 the era of an increasingly urbanized world.The availability of open and free
8107 Sentinel-2 data of the Copernicus Earth Observation program offers a new
8108 opportunity for wall-to-wall mapping of human settlements at a global
8109 scale.This paper presents a deep-learning-based framework for a fully automated
8110 extraction of built-up areas at a spatial resolution of 10 m from a global
8111 composite of Sentinel-2 imagery.A multi-neuro modeling methodology building on
8112 a simple Convolution Neural Networks architecture for pixel-wise image
8113 classification of built-up areas is developed.The core features of the proposed
8114 model are the image patch of size 5 x 5 pixels adequate for describing built-up
8115 areas from Sentinel-2 imagery and the lightweight topology with a total number
8116 of 1,448,578 trainable parameters and 4 2D convolutional layers and 2 flattened
8117 layers.The deployment of the model on the global Sentinel-2 image composite
8118 provides the most detailed and complete map reporting about built-up areas for
8119 reference year 2018. The validation of the results with an independent
8120 reference data-set of building footprints covering 277 sites across the world
8121 establishes the reliability of the built-up layer produced by the proposed
8122 framework and the model robustness.
8123 </p>
8124 </description>
8125 </item>
8126 <item>
8127 <title>3D Self-Supervised Methods for Medical Imaging. (arXiv:2006.03829v2 [cs.CV] UPDATED)</title>
8128 <link>http://fr.arxiv.org/abs/2006.03829</link>
8129 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Taleb_A/0/1/0/all/0/1">Aiham Taleb</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Loetzsch_W/0/1/0/all/0/1">Winfried Loetzsch</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Danz_N/0/1/0/all/0/1">Noel Danz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Severin_J/0/1/0/all/0/1">Julius Severin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gaertner_T/0/1/0/all/0/1">Thomas Gaertner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bergner_B/0/1/0/all/0/1">Benjamin Bergner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lippert_C/0/1/0/all/0/1">Christoph Lippert</a></p>
8130
8131 <p>Self-supervised learning methods have witnessed a recent surge of interest
8132 after proving successful in multiple application fields. In this work, we
8133 leverage these techniques, and we propose 3D versions for five different
8134 self-supervised methods, in the form of proxy tasks. Our methods facilitate
8135 neural network feature learning from unlabeled 3D images, aiming to reduce the
8136 required cost for expert annotation. The developed algorithms are 3D
8137 Contrastive Predictive Coding, 3D Rotation prediction, 3D Jigsaw puzzles,
8138 Relative 3D patch location, and 3D Exemplar networks. Our experiments show that
8139 pretraining models with our 3D tasks yields more powerful semantic
8140 representations, and enables solving downstream tasks more accurately and
8141 efficiently, compared to training the models from scratch and to pretraining
8142 them on 2D slices. We demonstrate the effectiveness of our methods on three
8143 downstream tasks from the medical imaging domain: i) Brain Tumor Segmentation
8144 from 3D MRI, ii) Pancreas Tumor Segmentation from 3D CT, and iii) Diabetic
8145 Retinopathy Detection from 2D Fundus images. In each task, we assess the gains
8146 in data-efficiency, performance, and speed of convergence. Interestingly, we
8147 also find gains when transferring the learned representations, by our methods,
8148 from a large unlabeled 3D corpus to a small downstream-specific dataset. We
8149 achieve results competitive to state-of-the-art solutions at a fraction of the
8150 computational expense. We publish our implementations for the developed
8151 algorithms (both 3D and 2D versions) as an open-source library, in an effort to
8152 allow other researchers to apply and extend our methods on their datasets.
8153 </p>
8154 </description>
8155 </item>
8156 <item>
8157 <title>Truthful Data Acquisition via Peer Prediction. (arXiv:2006.03992v2 [cs.GT] UPDATED)</title>
8158 <link>http://fr.arxiv.org/abs/2006.03992</link>
8159 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yiling Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_Y/0/1/0/all/0/1">Yiheng Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zheng_S/0/1/0/all/0/1">Shuran Zheng</a></p>
8160
8161 <p>We consider the problem of purchasing data for machine learning or
8162 statistical estimation. The data analyst has a budget to purchase datasets from
8163 multiple data providers. She does not have any test data that can be used to
8164 evaluate the collected data and can assign payments to data providers solely
8165 based on the collected datasets. We consider the problem in the standard
8166 Bayesian paradigm and in two settings: (1) data are only collected once; (2)
8167 data are collected repeatedly and each day's data are drawn independently from
8168 the same distribution. For both settings, our mechanisms guarantee that
8169 truthfully reporting one's dataset is always an equilibrium by adopting
8170 techniques from peer prediction: pay each provider the mutual information
8171 between his reported data and other providers' reported data. Depending on the
8172 data distribution, the mechanisms can also discourage misreports that would
8173 lead to inaccurate predictions. Our mechanisms also guarantee individual
8174 rationality and budget feasibility for certain underlying distributions in the
8175 first setting and for all distributions in the second setting.
8176 </p>
8177 </description>
8178 </item>
8179 <item>
8180 <title>Self-consumption for energy communities in Spain: a regional analysis under the new legal framework. (arXiv:2006.06459v3 [eess.SY] UPDATED)</title>
8181 <link>http://fr.arxiv.org/abs/2006.06459</link>
8182 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Gallego_Castillo_C/0/1/0/all/0/1">Cristobal Gallego-Castillo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Heleno_M/0/1/0/all/0/1">Miguel Heleno</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Victoria_M/0/1/0/all/0/1">Marta Victoria</a></p>
8183
8184 <p>European climate polices acknowledge the role that energy communities can
8185 play in the energy transition. Self-consumption installations shared among
8186 those living in the same building are a good example of such energy
8187 communities. In this work, we perform a regional analysis of optimal
8188 self-consumption installations under the new legal framework recently passed in
8189 Spain. Results show that the optimal sizing of the installation leads to
8190 economic savings for self-consumers in all the territory, for both options with
8191 and without remuneration for energy surplus. A sensitivity analysis on
8192 technology costs revealed that batteries still require noticeably cost
8193 reductions to be cost-effective in a behind the meter self-consumption
8194 environment. In addition, solar compensation mechanisms make batteries less
8195 attractive in a scenario of low PV costs, since feeding PV surplus into the
8196 grid, yet less efficient, becomes more cost-effective. An improvement for the
8197 current energy surplus remuneration policy was proposed and analysed. It
8198 consists in the inclusion of the economic value of the avoided power losses in
8199 the remuneration.
8200 </p>
8201 </description>
8202 </item>
8203 <item>
8204 <title>Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Prediction. (arXiv:2006.06648v3 [cs.LG] UPDATED)</title>
8205 <link>http://fr.arxiv.org/abs/2006.06648</link>
8206 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Baek_J/0/1/0/all/0/1">Jinheon Baek</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_D/0/1/0/all/0/1">Dong Bok Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hwang_S/0/1/0/all/0/1">Sung Ju Hwang</a></p>
8207
8208 <p>Many practical graph problems, such as knowledge graph construction and
8209 drug-drug interaction prediction, require to handle multi-relational graphs.
8210 However, handling real-world multi-relational graphs with Graph Neural Networks
8211 (GNNs) is often challenging due to their evolving nature, as new entities
8212 (nodes) can emerge over time. Moreover, newly emerged entities often have few
8213 links, which makes the learning even more difficult. Motivated by this
8214 challenge, we introduce a realistic problem of few-shot out-of-graph link
8215 prediction, where we not only predict the links between the seen and unseen
8216 nodes as in a conventional out-of-knowledge link prediction task but also
8217 between the unseen nodes, with only few edges per node. We tackle this problem
8218 with a novel transductive meta-learning framework which we refer to as Graph
8219 Extrapolation Networks (GEN). GEN meta-learns both the node embedding network
8220 for inductive inference (seen-to-unseen) and the link prediction network for
8221 transductive inference (unseen-to-unseen). For transductive link prediction, we
8222 further propose a stochastic embedding layer to model uncertainty in the link
8223 prediction between unseen entities. We validate our model on multiple benchmark
8224 datasets for knowledge graph completion and drug-drug interaction prediction.
8225 The results show that our model significantly outperforms relevant baselines
8226 for out-of-graph link prediction tasks.
8227 </p>
8228 </description>
8229 </item>
8230 <item>
8231 <title>Frontiers in Mortar Methods for Isogeometric Analysis. (arXiv:2006.06677v3 [cs.CE] UPDATED)</title>
8232 <link>http://fr.arxiv.org/abs/2006.06677</link>
8233 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hesch_C/0/1/0/all/0/1">Christian Hesch</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Khristenko_U/0/1/0/all/0/1">Ustim Khristenko</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Krause_R/0/1/0/all/0/1">Rolf Krause</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Popp_A/0/1/0/all/0/1">Alexander Popp</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Seitz_A/0/1/0/all/0/1">Alexander Seitz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wall_W/0/1/0/all/0/1">Wolfgang Wall</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wohlmuth_B/0/1/0/all/0/1">Barbara Wohlmuth</a></p>
8234
8235 <p>Complex geometries as common in industrial applications consist of multiple
8236 patches, if spline based parametrizations are used. The requirements for the
8237 generation of analysis-suitable models are increasing dramatically since
8238 isogeometric analysis is directly based on the spline parametrization and
8239 nowadays used for the calculation of higher-order partial differential
8240 equations. The computational, or more general, the engineering analysis
8241 necessitates suitable coupling techniques between the different patches. Mortar
8242 methods have been successfully applied for coupling of patches and for contact
8243 mechanics in recent years to resolve the arising issues within the interface.
8244 We present here current achievements in the design of mortar technologies in
8245 isogeometric analysis within the Priority Program SPP 1748, Reliable Simulation
8246 Techniques in Solid Mechanics. Development of Non-standard Discretisation
8247 Methods, Mechanical and Mathematical Analysis.
8248 </p>
8249 </description>
8250 </item>
8251 <item>
8252 <title>Sparse and Continuous Attention Mechanisms. (arXiv:2006.07214v3 [cs.LG] UPDATED)</title>
8253 <link>http://fr.arxiv.org/abs/2006.07214</link>
8254 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Martins_A/0/1/0/all/0/1">Andr&#xe9; F. T. Martins</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farinhas_A/0/1/0/all/0/1">Ant&#xf3;nio Farinhas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Treviso_M/0/1/0/all/0/1">Marcos Treviso</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Niculae_V/0/1/0/all/0/1">Vlad Niculae</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Aguiar_P/0/1/0/all/0/1">Pedro M. Q. Aguiar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Figueiredo_M/0/1/0/all/0/1">M&#xe1;rio A. T. Figueiredo</a></p>
8255
8256 <p>Exponential families are widely used in machine learning; they include many
8257 distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet,
8258 Poisson, and categorical distributions via the softmax transformation).
8259 Distributions in each of these families have fixed support. In contrast, for
8260 finite domains, there has been recent work on sparse alternatives to softmax
8261 (e.g. sparsemax and alpha-entmax), which have varying support, being able to
8262 assign zero probability to irrelevant categories. This paper expands that work
8263 in two directions: first, we extend alpha-entmax to continuous domains,
8264 revealing a link with Tsallis statistics and deformed exponential families.
8265 Second, we introduce continuous-domain attention mechanisms, deriving efficient
8266 gradient backpropagation algorithms for alpha in {1,2}. Experiments on
8267 attention-based text classification, machine translation, and visual question
8268 answering illustrate the use of continuous attention in 1D and 2D, showing that
8269 it allows attending to time intervals and compact regions.
8270 </p>
8271 </description>
8272 </item>
8273 <item>
8274 <title>Neural Estimators for Conditional Mutual Information Using Nearest Neighbors Sampling. (arXiv:2006.07225v2 [cs.IT] UPDATED)</title>
8275 <link>http://fr.arxiv.org/abs/2006.07225</link>
8276 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Molavipour_S/0/1/0/all/0/1">Sina Molavipour</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bassi_G/0/1/0/all/0/1">Germ&#xe1;n Bassi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Skoglund_M/0/1/0/all/0/1">Mikael Skoglund</a></p>
8277
8278 <p>The estimation of mutual information (MI) or conditional mutual information
8279 (CMI) from a set of samples is a long-standing problem. A recent line of work
8280 in this area has leveraged the approximation power of artificial neural
8281 networks and has shown improvements over conventional methods. One important
8282 challenge in this new approach is the need to obtain, given the original
8283 dataset, a different set where the samples are distributed according to a
8284 specific product density function. This is particularly challenging when
8285 estimating CMI.
8286 </p>
8287 <p>In this paper, we introduce a new technique, based on k nearest neighbors
8288 (k-NN), to perform the resampling and derive high-confidence concentration
8289 bounds for the sample average. Then the technique is employed to train a neural
8290 network classifier and the CMI is estimated accordingly. We propose three
8291 estimators using this technique and prove their consistency, make a comparison
8292 between them and similar approaches in the literature, and experimentally show
8293 improvements in estimating the CMI in terms of accuracy and variance of the
8294 estimators.
8295 </p>
8296 </description>
8297 </item>
8298 <item>
8299 <title>Learning Latent Space Energy-Based Prior Model. (arXiv:2006.08205v2 [stat.ML] UPDATED)</title>
8300 <link>http://fr.arxiv.org/abs/2006.08205</link>
8301 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Pang_B/0/1/0/all/0/1">Bo Pang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Han_T/0/1/0/all/0/1">Tian Han</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Nijkamp_E/0/1/0/all/0/1">Erik Nijkamp</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Zhu_S/0/1/0/all/0/1">Song-Chun Zhu</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wu_Y/0/1/0/all/0/1">Ying Nian Wu</a></p>
8302
8303 <p>We propose to learn energy-based model (EBM) in the latent space of a
8304 generator model, so that the EBM serves as a prior model that stands on the
8305 top-down network of the generator model. Both the latent space EBM and the
8306 top-down network can be learned jointly by maximum likelihood, which involves
8307 short-run MCMC sampling from both the prior and posterior distributions of the
8308 latent vector. Due to the low dimensionality of the latent space and the
8309 expressiveness of the top-down network, a simple EBM in latent space can
8310 capture regularities in the data effectively, and MCMC sampling in latent space
8311 is efficient and mixes well. We show that the learned model exhibits strong
8312 performances in terms of image and text generation and anomaly detection. The
8313 one-page code can be found in supplementary materials.
8314 </p>
8315 </description>
8316 </item>
8317 <item>
8318 <title>Iterative regularization for convex regularizers. (arXiv:2006.09859v2 [stat.ML] UPDATED)</title>
8319 <link>http://fr.arxiv.org/abs/2006.09859</link>
8320 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Molinari_C/0/1/0/all/0/1">Cesare Molinari</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Massias_M/0/1/0/all/0/1">Mathurin Massias</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Rosasco_L/0/1/0/all/0/1">Lorenzo Rosasco</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Villa_S/0/1/0/all/0/1">Silvia Villa</a></p>
8321
8322 <p>We study iterative regularization for linear models, when the bias is convex
8323 but not necessarily strongly convex. We characterize the stability properties
8324 of a primal-dual gradient based approach, analyzing its convergence in the
8325 presence of worst case deterministic noise. As a main example, we specialize
8326 and illustrate the results for the problem of robust sparse recovery. Key to
8327 our analysis is a combination of ideas from regularization theory and
8328 optimization in the presence of errors. Theoretical results are complemented by
8329 experiments showing that state-of-the-art performances can be achieved with
8330 considerable computational speed-ups.
8331 </p>
8332 </description>
8333 </item>
8334 <item>
8335 <title>Socially Fair k-Means Clustering. (arXiv:2006.10085v2 [cs.LG] UPDATED)</title>
8336 <link>http://fr.arxiv.org/abs/2006.10085</link>
8337 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ghadiri_M/0/1/0/all/0/1">Mehrdad Ghadiri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Samadi_S/0/1/0/all/0/1">Samira Samadi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vempala_S/0/1/0/all/0/1">Santosh Vempala</a></p>
8338
8339 <p>We show that the popular k-means clustering algorithm (Lloyd's heuristic),
8340 used for a variety of scientific data, can result in outcomes that are
8341 unfavorable to subgroups of data (e.g., demographic groups). Such biased
8342 clusterings can have deleterious implications for human-centric applications
8343 such as resource allocation. We present a fair k-means objective and algorithm
8344 to choose cluster centers that provide equitable costs for different groups.
8345 The algorithm, Fair-Lloyd, is a modification of Lloyd's heuristic for k-means,
8346 inheriting its simplicity, efficiency, and stability. In comparison with
8347 standard Lloyd's, we find that on benchmark datasets, Fair-Lloyd exhibits
8348 unbiased performance by ensuring that all groups have equal costs in the output
8349 k-clustering, while incurring a negligible increase in running time, thus
8350 making it a viable fair option wherever k-means is currently used.
8351 </p>
8352 </description>
8353 </item>
8354 <item>
8355 <title>Neutralizing Self-Selection Bias in Sampling for Sortition. (arXiv:2006.10498v2 [cs.GT] UPDATED)</title>
8356 <link>http://fr.arxiv.org/abs/2006.10498</link>
8357 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Flanigan_B/0/1/0/all/0/1">Bailey Flanigan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Golz_P/0/1/0/all/0/1">Paul G&#xf6;lz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_A/0/1/0/all/0/1">Anupam Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Procaccia_A/0/1/0/all/0/1">Ariel Procaccia</a></p>
8358
8359 <p>Sortition is a political system in which decisions are made by panels of
8360 randomly selected citizens. The process for selecting a sortition panel is
8361 traditionally thought of as uniform sampling without replacement, which has
8362 strong fairness properties. In practice, however, sampling without replacement
8363 is not possible since only a fraction of agents is willing to participate in a
8364 panel when invited, and different demographic groups participate at different
8365 rates. In order to still produce panels whose composition resembles that of the
8366 population, we develop a sampling algorithm that restores close-to-equal
8367 representation probabilities for all agents while satisfying meaningful
8368 demographic quotas. As part of its input, our algorithm requires probabilities
8369 indicating how likely each volunteer in the pool was to participate. Since
8370 these participation probabilities are not directly observable, we show how to
8371 learn them, and demonstrate our approach using data on a real sortition panel
8372 combined with information on the general population in the form of publicly
8373 available survey data.
8374 </p>
8375 </description>
8376 </item>
8377 <item>
8378 <title>ContraGAN: Contrastive Learning for Conditional Image Generation. (arXiv:2006.12681v2 [cs.CV] UPDATED)</title>
8379 <link>http://fr.arxiv.org/abs/2006.12681</link>
8380 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kang_M/0/1/0/all/0/1">Minguk Kang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Park_J/0/1/0/all/0/1">Jaesik Park</a></p>
8381
8382 <p>Conditional image generation is the task of generating diverse images using
8383 class label information. Although many conditional Generative Adversarial
8384 Networks (GAN) have shown realistic results, such methods consider pairwise
8385 relations between the embedding of an image and the embedding of the
8386 corresponding label (data-to-class relations) as the conditioning losses. In
8387 this paper, we propose ContraGAN that considers relations between multiple
8388 image embeddings in the same batch (data-to-data relations) as well as the
8389 data-to-class relations by using a conditional contrastive loss. The
8390 discriminator of ContraGAN discriminates the authenticity of given samples and
8391 minimizes a contrastive objective to learn the relations between training
8392 images. Simultaneously, the generator tries to generate realistic images that
8393 deceive the authenticity and have a low contrastive loss. The experimental
8394 results show that ContraGAN outperforms state-of-the-art-models by 7.3% and
8395 7.7% on Tiny ImageNet and ImageNet datasets, respectively. Besides, we
8396 experimentally demonstrate that ContraGAN helps to relieve the overfitting of
8397 the discriminator. For a fair comparison, we re-implement twelve
8398 state-of-the-art GANs using the PyTorch library. The software package is
8399 available at https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.
8400 </p>
8401 </description>
8402 </item>
8403 <item>
8404 <title>Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization. (arXiv:2006.13258v2 [cs.LG] UPDATED)</title>
8405 <link>http://fr.arxiv.org/abs/2006.13258</link>
8406 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Barde_P/0/1/0/all/0/1">Paul Barde</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Roy_J/0/1/0/all/0/1">Julien Roy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jeon_W/0/1/0/all/0/1">Wonseok Jeon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pineau_J/0/1/0/all/0/1">Joelle Pineau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pal_C/0/1/0/all/0/1">Christopher Pal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nowrouzezahrai_D/0/1/0/all/0/1">Derek Nowrouzezahrai</a></p>
8407
8408 <p>Adversarial Imitation Learning alternates between learning a discriminator --
8409 which tells apart expert's demonstrations from generated ones -- and a
8410 generator's policy to produce trajectories that can fool this discriminator.
8411 This alternated optimization is known to be delicate in practice since it
8412 compounds unstable adversarial training with brittle and sample-inefficient
8413 reinforcement learning. We propose to remove the burden of the policy
8414 optimization steps by leveraging a novel discriminator formulation.
8415 Specifically, our discriminator is explicitly conditioned on two policies: the
8416 one from the previous generator's iteration and a learnable policy. When
8417 optimized, this discriminator directly learns the optimal generator's policy.
8418 Consequently, our discriminator's update solves the generator's optimization
8419 problem for free: learning a policy that imitates the expert does not require
8420 an additional optimization loop. This formulation effectively cuts by half the
8421 implementation and computational burden of Adversarial Imitation Learning
8422 algorithms by removing the Reinforcement Learning phase altogether. We show on
8423 a variety of tasks that our simpler approach is competitive to prevalent
8424 Imitation Learning methods.
8425 </p>
8426 </description>
8427 </item>
8428 <item>
8429 <title>Relative Deviation Margin Bounds. (arXiv:2006.14950v2 [cs.LG] UPDATED)</title>
8430 <link>http://fr.arxiv.org/abs/2006.14950</link>
8431 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Cortes_C/0/1/0/all/0/1">Corinna Cortes</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohri_M/0/1/0/all/0/1">Mehryar Mohri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Suresh_A/0/1/0/all/0/1">Ananda Theertha Suresh</a></p>
8432
8433 <p>We present a series of new and more favorable margin-based learning
8434 guarantees that depend on the empirical margin loss of a predictor. We give two
8435 types of learning bounds, both distribution-dependent and valid for general
8436 families, in terms of the Rademacher complexity or the empirical $\ell_\infty$
8437 covering number of the hypothesis set used. Furthermore, using our relative
8438 deviation margin bounds, we derive distribution-dependent generalization bounds
8439 for unbounded loss functions under the assumption of a finite moment. We also
8440 briefly highlight several applications of these bounds and discuss their
8441 connection with existing results.
8442 </p>
8443 </description>
8444 </item>
8445 <item>
8446 <title>Weighted hypersoft configuration model. (arXiv:2007.00124v2 [physics.soc-ph] UPDATED)</title>
8447 <link>http://fr.arxiv.org/abs/2007.00124</link>
8448 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Voitalov_I/0/1/0/all/0/1">Ivan Voitalov</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Hoorn_P/0/1/0/all/0/1">Pim van der Hoorn</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Kitsak_M/0/1/0/all/0/1">Maksim Kitsak</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Papadopoulos_F/0/1/0/all/0/1">Fragkiskos Papadopoulos</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Krioukov_D/0/1/0/all/0/1">Dmitri Krioukov</a></p>
8449
8450 <p>Maximum entropy null models of networks come in different flavors that depend
8451 on the type of constraints under which entropy is maximized. If the constraints
8452 are on degree sequences or distributions, we are dealing with configuration
8453 models. If the degree sequence is constrained exactly, the corresponding
8454 microcanonical ensemble of random graphs with a given degree sequence is the
8455 configuration model per se. If the degree sequence is constrained only on
8456 average, the corresponding grand-canonical ensemble of random graphs with a
8457 given expected degree sequence is the soft configuration model. If the degree
8458 sequence is not fixed at all but randomly drawn from a fixed distribution, the
8459 corresponding hypercanonical ensemble of random graphs with a given degree
8460 distribution is the hypersoft configuration model, a more adequate description
8461 of dynamic real-world networks in which degree sequences are never fixed but
8462 degree distributions often stay stable. Here, we introduce the hypersoft
8463 configuration model of weighted networks. The main contribution is a particular
8464 version of the model with power-law degree and strength distributions, and
8465 superlinear scaling of strengths with degrees, mimicking the properties of some
8466 real-world networks. As a byproduct, we generalize the notions of sparse
8467 graphons and their entropy to weighted networks.
8468 </p>
8469 </description>
8470 </item>
8471 <item>
8472 <title>Robustness against Relational Adversary. (arXiv:2007.00772v2 [cs.LG] UPDATED)</title>
8473 <link>http://fr.arxiv.org/abs/2007.00772</link>
8474 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yizhen Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Meng_X/0/1/0/all/0/1">Xiaozhu Meng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_K/0/1/0/all/0/1">Ke Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Christodorescu_M/0/1/0/all/0/1">Mihai Christodorescu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jha_S/0/1/0/all/0/1">Somesh Jha</a></p>
8475
8476 <p>Test-time adversarial attacks have posed serious challenges to the robustness
8477 of machine-learning models, and in many settings the adversarial perturbation
8478 need not be bounded by small $\ell_p$-norms. Motivated by the
8479 semantics-preserving attacks in vision and security domain, we investigate
8480 $\textit{relational adversaries}$, a broad class of attackers who create
8481 adversarial examples that are in a reflexive-transitive closure of a logical
8482 relation. We analyze the conditions for robustness and propose
8483 $\textit{normalize-and-predict}$ -- a learning framework with provable
8484 robustness guarantee. We compare our approach with adversarial training and
8485 derive an unified framework that provides benefits of both approaches. Guided
8486 by our theoretical findings, we apply our framework to image classification and
8487 malware detection. Results of both tasks show that attacks using relational
8488 adversaries frequently fool existing models, but our unified framework can
8489 significantly enhance their robustness.
8490 </p>
8491 </description>
8492 </item>
8493 <item>
8494 <title>Information Theoretic Lower Bounds for Feed-Forward Fully-Connected Deep Networks. (arXiv:2007.00796v2 [stat.ML] UPDATED)</title>
8495 <link>http://fr.arxiv.org/abs/2007.00796</link>
8496 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Yang_X/0/1/0/all/0/1">Xiaochen Yang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Honorio_J/0/1/0/all/0/1">Jean Honorio</a></p>
8497
8498 <p>In this paper, we study the sample complexity lower bounds for the exact
8499 recovery of parameters and for a positive excess risk of a feed-forward,
8500 fully-connected neural network for binary classification, using
8501 information-theoretic tools. We prove these lower bounds by the existence of a
8502 generative network characterized by a backwards data generating process, where
8503 the input is generated based on the binary output, and the network is
8504 parametrized by weight parameters for the hidden layers. The sample complexity
8505 lower bound for the exact recovery of parameters is $\Omega(d r \log(r) + p )$
8506 and for a positive excess risk is $\Omega(r \log(r) + p )$, where $p$ is the
8507 dimension of the input, $r$ reflects the rank of the weight matrices and $d$ is
8508 the number of hidden layers. To the best of our knowledge, our results are the
8509 first information theoretic lower bounds.
8510 </p>
8511 </description>
8512 </item>
8513 <item>
8514 <title>Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning. (arXiv:2007.01293v2 [cs.LG] UPDATED)</title>
8515 <link>http://fr.arxiv.org/abs/2007.01293</link>
8516 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_Z/0/1/0/all/0/1">Zhongzheng Ren</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yeh_R/0/1/0/all/0/1">Raymond A. Yeh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schwing_A/0/1/0/all/0/1">Alexander G. Schwing</a></p>
8517
8518 <p>Existing semi-supervised learning (SSL) algorithms use a single weight to
8519 balance the loss of labeled and unlabeled examples, i.e., all unlabeled
8520 examples are equally weighted. But not all unlabeled data are equal. In this
8521 paper we study how to use a different weight for every unlabeled example.
8522 Manual tuning of all those weights -- as done in prior work -- is no longer
8523 possible. Instead, we adjust those weights via an algorithm based on the
8524 influence function, a measure of a model's dependency on one training example.
8525 To make the approach efficient, we propose a fast and effective approximation
8526 of the influence function. We demonstrate that this technique outperforms
8527 state-of-the-art methods on semi-supervised image and language classification
8528 tasks.
8529 </p>
8530 </description>
8531 </item>
8532 <item>
8533 <title>A Framework for Modelling, Verification and Transformation of Concurrent Imperative Programs. (arXiv:2007.02261v2 [cs.LO] UPDATED)</title>
8534 <link>http://fr.arxiv.org/abs/2007.02261</link>
8535 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bortin_M/0/1/0/all/0/1">Maksym Bortin</a></p>
8536
8537 <p>The paper gives a comprehensive presentation of a framework, embedded into
8538 the simply typed higher-order logic, and aimed at providing a sound assistance
8539 in formal reasoning about models of imperative programs with interleaved
8540 computations. As a case study, a model of the Peterson's mutual exclusion
8541 algorithm will be scrutinised in the course of the paper illustrating
8542 applicability of the framework.
8543 </p>
8544 </description>
8545 </item>
8546 <item>
8547 <title>Self-Supervised Graph Transformer on Large-Scale Molecular Data. (arXiv:2007.02835v2 [q-bio.BM] UPDATED)</title>
8548 <link>http://fr.arxiv.org/abs/2007.02835</link>
8549 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-bio/1/au:+Rong_Y/0/1/0/all/0/1">Yu Rong</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Bian_Y/0/1/0/all/0/1">Yatao Bian</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Xu_T/0/1/0/all/0/1">Tingyang Xu</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Xie_W/0/1/0/all/0/1">Weiyang Xie</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Wei_Y/0/1/0/all/0/1">Ying Wei</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Huang_W/0/1/0/all/0/1">Wenbing Huang</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Huang_J/0/1/0/all/0/1">Junzhou Huang</a></p>
8550
8551 <p>How to obtain informative representations of molecules is a crucial
8552 prerequisite in AI-driven drug design and discovery. Recent researches abstract
8553 molecules as graphs and employ Graph Neural Networks (GNNs) for molecular
8554 representation learning. Nevertheless, two issues impede the usage of GNNs in
8555 real scenarios: (1) insufficient labeled molecules for supervised training; (2)
8556 poor generalization capability to new-synthesized molecules. To address them
8557 both, we propose a novel framework, GROVER, which stands for Graph
8558 Representation frOm self-superVised mEssage passing tRansformer. With carefully
8559 designed self-supervised tasks in node-, edge- and graph-level, GROVER can
8560 learn rich structural and semantic information of molecules from enormous
8561 unlabelled molecular data. Rather, to encode such complex information, GROVER
8562 integrates Message Passing Networks into the Transformer-style architecture to
8563 deliver a class of more expressive encoders of molecules. The flexibility of
8564 GROVER allows it to be trained efficiently on large-scale molecular dataset
8565 without requiring any supervision, thus being immunized to the two issues
8566 mentioned above. We pre-train GROVER with 100 million parameters on 10 million
8567 unlabelled molecules -- the biggest GNN and the largest training dataset in
8568 molecular representation learning. We then leverage the pre-trained GROVER for
8569 molecular property prediction followed by task-specific fine-tuning, where we
8570 observe a huge improvement (more than 6% on average) from current
8571 state-of-the-art methods on 11 challenging benchmarks. The insights we gained
8572 are that well-designed self-supervision losses and largely-expressive
8573 pre-trained models enjoy the significant potential on performance boosting.
8574 </p>
8575 </description>
8576 </item>
8577 <item>
8578 <title>BoxE: A Box Embedding Model for Knowledge Base Completion. (arXiv:2007.06267v2 [cs.AI] UPDATED)</title>
8579 <link>http://fr.arxiv.org/abs/2007.06267</link>
8580 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Abboud_R/0/1/0/all/0/1">Ralph Abboud</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ceylan_I/0/1/0/all/0/1">&#x130;smail &#x130;lkan Ceylan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lukasiewicz_T/0/1/0/all/0/1">Thomas Lukasiewicz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Salvatori_T/0/1/0/all/0/1">Tommaso Salvatori</a></p>
8581
8582 <p>Knowledge base completion (KBC) aims to automatically infer missing facts by
8583 exploiting information already present in a knowledge base (KB). A promising
8584 approach for KBC is to embed knowledge into latent spaces and make predictions
8585 from learned embeddings. However, existing embedding models are subject to at
8586 least one of the following limitations: (1) theoretical inexpressivity, (2)
8587 lack of support for prominent inference patterns (e.g., hierarchies), (3) lack
8588 of support for KBC over higher-arity relations, and (4) lack of support for
8589 incorporating logical rules. Here, we propose a spatio-translational embedding
8590 model, called BoxE, that simultaneously addresses all these limitations. BoxE
8591 embeds entities as points, and relations as a set of hyper-rectangles (or
8592 boxes), which spatially characterize basic logical properties. This seemingly
8593 simple abstraction yields a fully expressive model offering a natural encoding
8594 for many desired logical properties. BoxE can both capture and inject rules
8595 from rich classes of rule languages, going well beyond individual inference
8596 patterns. By design, BoxE naturally applies to higher-arity KBs. We conduct a
8597 detailed experimental analysis, and show that BoxE achieves state-of-the-art
8598 performance, both on benchmark knowledge graphs and on more general KBs, and we
8599 empirically show the power of integrating logical rules.
8600 </p>
8601 </description>
8602 </item>
8603 <item>
8604 <title>RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning. (arXiv:2007.06271v2 [cs.CV] UPDATED)</title>
8605 <link>http://fr.arxiv.org/abs/2007.06271</link>
8606 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chiaro_R/0/1/0/all/0/1">Riccardo Del Chiaro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Twardowski_B/0/1/0/all/0/1">Bart&#x142;omiej Twardowski</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bagdanov_A/0/1/0/all/0/1">Andrew D. Bagdanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weijer_J/0/1/0/all/0/1">Joost van de Weijer</a></p>
8607
8608 <p>Research on continual learning has led to a variety of approaches to
8609 mitigating catastrophic forgetting in feed-forward classification networks.
8610 Until now surprisingly little attention has been focused on continual learning
8611 of recurrent models applied to problems like image captioning. In this paper we
8612 take a systematic look at continual learning of LSTM-based models for image
8613 captioning. We propose an attention-based approach that explicitly accommodates
8614 the transient nature of vocabularies in continual image captioning tasks --
8615 i.e. that task vocabularies are not disjoint. We call our method Recurrent
8616 Attention to Transient Tasks (RATT), and also show how to adapt continual
8617 learning approaches based on weight egularization and knowledge distillation to
8618 recurrent continual learning problems. We apply our approaches to incremental
8619 image captioning problem on two new continual learning benchmarks we define
8620 using the MS-COCO and Flickr30 datasets. Our results demonstrate that RATT is
8621 able to sequentially learn five captioning tasks while incurring no forgetting
8622 of previously learned ones.
8623 </p>
8624 </description>
8625 </item>
8626 <item>
8627 <title>Graph Neural Networks for Scalable Radio Resource Management: Architecture Design and Theoretical Analysis. (arXiv:2007.07632v2 [cs.IT] UPDATED)</title>
8628 <link>http://fr.arxiv.org/abs/2007.07632</link>
8629 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_Y/0/1/0/all/0/1">Yifei Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shi_Y/0/1/0/all/0/1">Yuanming Shi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jun Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Letaief_K/0/1/0/all/0/1">Khaled B. Letaief</a></p>
8630
8631 <p>Deep learning has recently emerged as a disruptive technology to solve
8632 challenging radio resource management problems in wireless networks. However,
8633 the neural network architectures adopted by existing works suffer from poor
8634 scalability, generalization, and lack of interpretability. A long-standing
8635 approach to improve scalability and generalization is to incorporate the
8636 structures of the target task into the neural network architecture. In this
8637 paper, we propose to apply graph neural networks (GNNs) to solve large-scale
8638 radio resource management problems, supported by effective neural network
8639 architecture design and theoretical analysis. Specifically, we first
8640 demonstrate that radio resource management problems can be formulated as graph
8641 optimization problems that enjoy a universal permutation equivariance property.
8642 We then identify a class of neural networks, named \emph{message passing graph
8643 neural networks} (MPGNNs). It is demonstrated that they not only satisfy the
8644 permutation equivariance property, but also can generalize to large-scale
8645 problems while enjoying a high computational efficiency. For interpretablity
8646 and theoretical guarantees, we prove the equivalence between MPGNNs and a class
8647 of distributed optimization algorithms, which is then used to analyze the
8648 performance and generalization of MPGNN-based methods. Extensive simulations,
8649 with power control and beamforming as two examples, will demonstrate that the
8650 proposed method, trained in an unsupervised manner with unlabeled samples,
8651 matches or even outperforms classic optimization-based algorithms without
8652 domain-specific knowledge. Remarkably, the proposed method is highly scalable
8653 and can solve the beamforming problem in an interference channel with $1000$
8654 transceiver pairs within $6$ milliseconds on a single GPU.
8655 </p>
8656 </description>
8657 </item>
8658 <item>
8659 <title>Temporal Pointwise Convolutional Networks for Length of Stay Prediction in the Intensive Care Unit. (arXiv:2007.09483v2 [cs.LG] UPDATED)</title>
8660 <link>http://fr.arxiv.org/abs/2007.09483</link>
8661 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rocheteau_E/0/1/0/all/0/1">Emma Rocheteau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lio_P/0/1/0/all/0/1">Pietro Li&#xf2;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hyland_S/0/1/0/all/0/1">Stephanie Hyland</a></p>
8662
8663 <p>The pressure of ever-increasing patient demand and budget restrictions make
8664 hospital bed management a daily challenge for clinical staff. Most critical is
8665 the efficient allocation of resource-heavy Intensive Care Unit (ICU) beds to
8666 the patients who need life support. Central to solving this problem is knowing
8667 for how long the current set of ICU patients are likely to stay in the unit. In
8668 this work, we propose a new deep learning model based on the combination of
8669 temporal convolution and pointwise (1x1) convolution, to solve the length of
8670 stay prediction task on the eICU critical care dataset. The model - which we
8671 refer to as Temporal Pointwise Convolution (TPC) - is specifically designed to
8672 mitigate for common challenges with Electronic Health Records, such as
8673 skewness, irregular sampling and missing data. In doing so, we have achieved
8674 significant performance benefits of 18-51% (metric dependent) over the commonly
8675 used Long-Short Term Memory (LSTM) network, and the multi-head self-attention
8676 network known as the Transformer.
8677 </p>
8678 </description>
8679 </item>
8680 <item>
8681 <title>CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks. (arXiv:2007.10497v3 [cs.HC] UPDATED)</title>
8682 <link>http://fr.arxiv.org/abs/2007.10497</link>
8683 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hassantabar_S/0/1/0/all/0/1">Shayan Hassantabar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stefano_N/0/1/0/all/0/1">Novati Stefano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ghanakota_V/0/1/0/all/0/1">Vishweshwar Ghanakota</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ferrari_A/0/1/0/all/0/1">Alessandra Ferrari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nicola_G/0/1/0/all/0/1">Gregory N. Nicola</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bruno_R/0/1/0/all/0/1">Raffaele Bruno</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marino_I/0/1/0/all/0/1">Ignazio R. Marino</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hamidouche_K/0/1/0/all/0/1">Kenza Hamidouche</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jha_N/0/1/0/all/0/1">Niraj K. Jha</a></p>
8684
8685 <p>The novel coronavirus (SARS-CoV-2) has led to a pandemic. The current testing
8686 regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2
8687 has been unable to keep up with testing demands, and also suffers from a
8688 relatively low positive detection rate in the early stages of the resultant
8689 COVID-19 disease. Hence, there is a need for an alternative approach for
8690 repeated large-scale testing of SARS-CoV-2/COVID-19. We propose a framework
8691 called CovidDeep that combines efficient DNNs with commercially available WMSs
8692 for pervasive testing of the virus. We collected data from 87 individuals,
8693 spanning three cohorts including healthy, asymptomatic, and symptomatic
8694 patients. We trained DNNs on various subsets of the features automatically
8695 extracted from six WMS and questionnaire categories to perform ablation studies
8696 to determine which subsets are most efficacious in terms of test accuracy for a
8697 three-way classification. The highest test accuracy obtained was 98.1%. We also
8698 augmented the real training dataset with a synthetic training dataset drawn
8699 from the same probability distribution to impose a prior on DNN weights and
8700 leveraged a grow-and-prune synthesis paradigm to learn both DNN architecture
8701 and weights. This boosted the accuracy of the various DNNs further and
8702 simultaneously reduced their size and floating-point operations.
8703 </p>
8704 </description>
8705 </item>
8706 <item>
8707 <title>The Complete Lasso Tradeoff Diagram. (arXiv:2007.11078v4 [math.ST] UPDATED)</title>
8708 <link>http://fr.arxiv.org/abs/2007.11078</link>
8709 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Wang_H/0/1/0/all/0/1">Hua Wang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Yang_Y/0/1/0/all/0/1">Yachong Yang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Bu_Z/0/1/0/all/0/1">Zhiqi Bu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Su_W/0/1/0/all/0/1">Weijie J. Su</a></p>
8710
8711 <p>A fundamental problem in the high-dimensional regression is to understand the
8712 tradeoff between type I and type II errors or, equivalently, false discovery
8713 rate (FDR) and power in variable selection. To address this important problem,
8714 we offer the first complete tradeoff diagram that distinguishes all pairs of
8715 FDR and power that can be asymptotically realized by the Lasso with some choice
8716 of its penalty parameter from the remaining pairs, in a regime of linear
8717 sparsity under random designs. The tradeoff between the FDR and power
8718 characterized by our diagram holds no matter how strong the signals are. In
8719 particular, our results improve on the earlier Lasso tradeoff diagram of
8720 <a href="/abs/1511.01957">arXiv:1511.01957</a> by recognizing two simple but fundamental constraints on the
8721 pairs of FDR and power. The improvement is more substantial when the regression
8722 problem is above the Donoho--Tanner phase transition. Finally, we present
8723 extensive simulation studies to confirm the sharpness of the complete Lasso
8724 tradeoff diagram.
8725 </p>
8726 </description>
8727 </item>
8728 <item>
8729 <title>Sifting Convolution on the Sphere. (arXiv:2007.12153v2 [cs.IT] UPDATED)</title>
8730 <link>http://fr.arxiv.org/abs/2007.12153</link>
8731 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Roddy_P/0/1/0/all/0/1">Patrick J. Roddy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+McEwen_J/0/1/0/all/0/1">Jason D. McEwen</a></p>
8732
8733 <p>A novel spherical convolution is defined through the sifting property of the
8734 Dirac delta on the sphere. The so-called sifting convolution is defined by the
8735 inner product of one function with a translated version of another, but with
8736 the adoption of an alternative translation operator on the sphere. This
8737 translation operator follows by analogy with the Euclidean translation when
8738 viewed in harmonic space. The sifting convolution satisfies a variety of
8739 desirable properties that are lacking in alternate definitions, namely: it
8740 supports directional kernels; it has an output which remains on the sphere; and
8741 is efficient to compute. An illustration of the sifting convolution on a
8742 topographic map of the Earth demonstrates that it supports directional kernels
8743 to perform anisotropic filtering, while its output remains on the sphere.
8744 </p>
8745 </description>
8746 </item>
8747 <item>
8748 <title>Revisiting Locality in Binary-Integer Representations. (arXiv:2007.12159v2 [cs.NE] UPDATED)</title>
8749 <link>http://fr.arxiv.org/abs/2007.12159</link>
8750 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shastri_H/0/1/0/all/0/1">Hrishee Shastri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frachtenberg_E/0/1/0/all/0/1">Eitan Frachtenberg</a></p>
8751
8752 <p>Mutation and recombination operators play a key role in determining the speed
8753 and quality of Genetic and Evolutionary Algorithms (GEAs). Prior work has
8754 analyzed the effects of these operators on genotypic variation, often using
8755 locality metrics that measure the sensitivity and stability of
8756 genotype-phenotype representations to these operators.
8757 </p>
8758 <p>In this paper, we focus on an important subset of representations, namely
8759 nonredundant bitstring-to-integer representations, and analyze them through the
8760 lens of Rothlauf's widely used locality metrics. We first define locality
8761 metrics equivalent to Rothlauf's that are tailored to our domain: the
8762 \textit{point locality} for single-bit mutation and \textit{general locality}
8763 for recombination. With these definitions, we derive tight bounds and a closed
8764 form expected value for point locality. For general locality we show that it is
8765 asymptotically equivalent across all representations and operators. We also
8766 recreate three established GEA experiments to understand the predictive power
8767 of point locality on GEA performance, focusing on two popular and often
8768 juxtaposed representations: standard binary and binary reflected Gray.
8769 </p>
8770 <p>We show that standard binary has provably no worse locality than any Gray
8771 encoding, including binary reflected Gray. We discuss this result in the
8772 context of previous studies that found binary reflected Gray to outperform
8773 standard binary, and we argue that locality cannot be the explanation for
8774 strong performance. Finally, we provide empirical evidence that weak point
8775 locality representations can be beneficial to performance in the exploration
8776 phase of the GEA, while strong point locality representations are more
8777 beneficial in the exploitation phase.
8778 </p>
8779 </description>
8780 </item>
8781 <item>
8782 <title>YOLOpeds: Efficient Real-Time Single-Shot Pedestrian Detection for Smart Camera Applications. (arXiv:2007.13404v2 [cs.CV] UPDATED)</title>
8783 <link>http://fr.arxiv.org/abs/2007.13404</link>
8784 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kyrkou_C/0/1/0/all/0/1">Christos Kyrkou</a></p>
8785
8786 <p>Deep Learning-based object detectors can enhance the capabilities of smart
8787 camera systems in a wide spectrum of machine vision applications including
8788 video surveillance, autonomous driving, robots and drones, smart factory, and
8789 health monitoring. Pedestrian detection plays a key role in all these
8790 applications and deep learning can be used to construct accurate
8791 state-of-the-art detectors. However, such complex paradigms do not scale easily
8792 and are not traditionally implemented in resource-constrained smart cameras for
8793 on-device processing which offers significant advantages in situations when
8794 real-time monitoring and robustness are vital. Efficient neural networks can
8795 not only enable mobile applications and on-device experiences but can also be a
8796 key enabler of privacy and security allowing a user to gain the benefits of
8797 neural networks without needing to send their data to the server to be
8798 evaluated. This work addresses the challenge of achieving a good trade-off
8799 between accuracy and speed for efficient deployment of deep-learning-based
8800 pedestrian detection in smart camera applications. A computationally efficient
8801 architecture is introduced based on separable convolutions and proposes
8802 integrating dense connections across layers and multi-scale feature fusion to
8803 improve representational capacity while decreasing the number of parameters and
8804 operations. In particular, the contributions of this work are the following: 1)
8805 An efficient backbone combining multi-scale feature operations, 2) a more
8806 elaborate loss function for improved localization, 3) an anchor-less approach
8807 for detection, The proposed approach called YOLOpeds is evaluated using the
8808 PETS2009 surveillance dataset on 320x320 images. Overall, YOLOpeds provides
8809 real-time sustained operation of over 30 frames per second with detection rates
8810 in the range of 86% outperforming existing deep learning models.
8811 </p>
8812 </description>
8813 </item>
8814 <item>
8815 <title>Regularization by Denoising via Fixed-Point Projection (RED-PRO). (arXiv:2008.00226v2 [eess.IV] UPDATED)</title>
8816 <link>http://fr.arxiv.org/abs/2008.00226</link>
8817 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Cohen_R/0/1/0/all/0/1">Regev Cohen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Elad_M/0/1/0/all/0/1">Michael Elad</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Milanfar_P/0/1/0/all/0/1">Peyman Milanfar</a></p>
8818
8819 <p>Inverse problems in image processing are typically cast as optimization
8820 tasks, consisting of data-fidelity and stabilizing regularization terms. A
8821 recent regularization strategy of great interest utilizes the power of
8822 denoising engines. Two such methods are the Plug-and-Play Prior (PnP) and
8823 Regularization by Denoising (RED). While both have shown state-of-the-art
8824 results in various recovery tasks, their theoretical justification is
8825 incomplete. In this paper, we aim to bridge between RED and PnP, enriching the
8826 understanding of both frameworks. Towards that end, we reformulate RED as a
8827 convex optimization problem utilizing a projection (RED-PRO) onto the
8828 fixed-point set of demicontractive denoisers. We offer a simple iterative
8829 solution to this problem, by which we show that PnP proximal gradient method is
8830 a special case of RED-PRO, while providing guarantees for the convergence of
8831 both frameworks to globally optimal solutions. In addition, we present
8832 relaxations of RED-PRO that allow for handling denoisers with limited
8833 fixed-point sets. Finally, we demonstrate RED-PRO for the tasks of image
8834 deblurring and super-resolution, showing improved results with respect to the
8835 original RED framework.
8836 </p>
8837 </description>
8838 </item>
8839 <item>
8840 <title>A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices. (arXiv:2008.02464v2 [stat.ML] UPDATED)</title>
8841 <link>http://fr.arxiv.org/abs/2008.02464</link>
8842 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Qiu_J/0/1/0/all/0/1">Jiezhong Qiu</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wang_C/0/1/0/all/0/1">Chi Wang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Liao_B/0/1/0/all/0/1">Ben Liao</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Peng_R/0/1/0/all/0/1">Richard Peng</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Tang_J/0/1/0/all/0/1">Jie Tang</a></p>
8843
8844 <p>We prove a Chernoff-type bound for sums of matrix-valued random variables
8845 sampled via a regular (aperiodic and irreducible) finite Markov chain.
8846 Specially, consider a random walk on a regular Markov chain and a Hermitian
8847 matrix-valued function on its state space. Our result gives exponentially
8848 decreasing bounds on the tail distributions of the extreme eigenvalues of the
8849 sample mean matrix. Our proof is based on the matrix expander (regular
8850 undirected graph) Chernoff bound [Garg et al. STOC '18] and scalar
8851 Chernoff-Hoeffding bounds for Markov chains [Chung et al. STACS '12].
8852 </p>
8853 <p>Our matrix Chernoff bound for Markov chains can be applied to analyze the
8854 behavior of co-occurrence statistics for sequential data, which have been
8855 common and important data signals in machine learning. We show that given a
8856 regular Markov chain with $n$ states and mixing time $\tau$, we need a
8857 trajectory of length $O(\tau (\log{(n)}+\log{(\tau)})/\epsilon^2)$ to achieve
8858 an estimator of the co-occurrence matrix with error bound $\epsilon$. We
8859 conduct several experiments and the experimental results are consistent with
8860 the exponentially fast convergence rate from theoretical analysis. Our result
8861 gives the first bound on the convergence rate of the co-occurrence matrix and
8862 the first sample complexity analysis in graph representation learning.
8863 </p>
8864 </description>
8865 </item>
8866 <item>
8867 <title>Integration of the 3D Environment for UAV Onboard Visual Object Tracking. (arXiv:2008.02834v3 [cs.CV] UPDATED)</title>
8868 <link>http://fr.arxiv.org/abs/2008.02834</link>
8869 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Vujasinovic_S/0/1/0/all/0/1">St&#xe9;phane Vujasinovi&#x107;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Becker_S/0/1/0/all/0/1">Stefan Becker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Breuer_T/0/1/0/all/0/1">Timo Breuer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bullinger_S/0/1/0/all/0/1">Sebastian Bullinger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Scherer_Negenborn_N/0/1/0/all/0/1">Norbert Scherer-Negenborn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Arens_M/0/1/0/all/0/1">Michael Arens</a></p>
8870
8871 <p>Single visual object tracking from an unmanned aerial vehicle (UAV) poses
8872 fundamental challenges such as object occlusion, small-scale objects,
8873 background clutter, and abrupt camera motion. To tackle these difficulties, we
8874 propose to integrate the 3D structure of the observed scene into a
8875 detection-by-tracking algorithm. We introduce a pipeline that combines a
8876 model-free visual object tracker, a sparse 3D reconstruction, and a state
8877 estimator. The 3D reconstruction of the scene is computed with an image-based
8878 Structure-from-Motion (SfM) component that enables us to leverage a state
8879 estimator in the corresponding 3D scene during tracking. By representing the
8880 position of the target in 3D space rather than in image space, we stabilize the
8881 tracking during ego-motion and improve the handling of occlusions, background
8882 clutter, and small-scale objects. We evaluated our approach on prototypical
8883 image sequences, captured from a UAV with low-altitude oblique views. For this
8884 purpose, we adapted an existing dataset for visual object tracking and
8885 reconstructed the observed scene in 3D. The experimental results demonstrate
8886 that the proposed approach outperforms methods using plain visual cues as well
8887 as approaches leveraging image-space-based state estimations. We believe that
8888 our approach can be beneficial for traffic monitoring, video surveillance, and
8889 navigation.
8890 </p>
8891 </description>
8892 </item>
8893 <item>
8894 <title>Lifted Multiplicity Codes. (arXiv:2008.04717v2 [cs.IT] UPDATED)</title>
8895 <link>http://fr.arxiv.org/abs/2008.04717</link>
8896 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Holzbaur_L/0/1/0/all/0/1">Lukas Holzbaur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Polyanskaya_R/0/1/0/all/0/1">Rina Polyanskaya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Polyanskii_N/0/1/0/all/0/1">Nikita Polyanskii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vorobyev_I/0/1/0/all/0/1">Ilya Vorobyev</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yaakobi_E/0/1/0/all/0/1">Eitan Yaakobi</a></p>
8897
8898 <p>Lifted Reed-Solomon codes and multiplicity codes are two classes of
8899 evaluation codes that allow for the design of high-rate codes that can recover
8900 every codeword or information symbol from many disjoint sets. Recently, the
8901 underlying approaches have been combined to construct lifted bi-variate
8902 multiplicity codes, that can further improve on the rate. We continue the study
8903 of these codes by providing lower bounds on the rate and distance for lifted
8904 multiplicity codes obtained from polynomials in an arbitrary number of
8905 variables. Specifically, we investigate a subcode of a lifted multiplicity code
8906 formed by the linear span of $m$-variate monomials whose restriction to an
8907 arbitrary line in $\mathbb{F}_q^m$ is equivalent to a low-degree uni-variate
8908 polynomial. We find the tight asymptotic behavior of the fraction of such
8909 monomials when the number of variables $m$ is fixed and the alphabet size
8910 $q=2^\ell$ is large. For some parameter regimes, lifted multiplicity codes are
8911 then shown to have a better trade-off between redundancy and the number of
8912 disjoint recovering sets for every codeword or information symbol than
8913 previously known constructions. Additionally, we present a local
8914 self-correction algorithm for lifted multiplicity codes.
8915 </p>
8916 </description>
8917 </item>
8918 <item>
8919 <title>A Composable Specification Language for Reinforcement Learning Tasks. (arXiv:2008.09293v2 [cs.LG] UPDATED)</title>
8920 <link>http://fr.arxiv.org/abs/2008.09293</link>
8921 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jothimurugan_K/0/1/0/all/0/1">Kishor Jothimurugan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Alur_R/0/1/0/all/0/1">Rajeev Alur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bastani_O/0/1/0/all/0/1">Osbert Bastani</a></p>
8922
8923 <p>Reinforcement learning is a promising approach for learning control policies
8924 for robot tasks. However, specifying complex tasks (e.g., with multiple
8925 objectives and safety constraints) can be challenging, since the user must
8926 design a reward function that encodes the entire task. Furthermore, the user
8927 often needs to manually shape the reward to ensure convergence of the learning
8928 algorithm. We propose a language for specifying complex control tasks, along
8929 with an algorithm that compiles specifications in our language into a reward
8930 function and automatically performs reward shaping. We implement our approach
8931 in a tool called SPECTRL, and show that it outperforms several state-of-the-art
8932 baselines.
8933 </p>
8934 </description>
8935 </item>
8936 <item>
8937 <title>Gravilon: Applications of a New Gradient Descent Method to Machine Learning. (arXiv:2008.11370v2 [cs.LG] UPDATED)</title>
8938 <link>http://fr.arxiv.org/abs/2008.11370</link>
8939 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kelterborn_C/0/1/0/all/0/1">Chad Kelterborn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mazur_M/0/1/0/all/0/1">Marcin Mazur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Petrenko_B/0/1/0/all/0/1">Bogdan V. Petrenko</a></p>
8940
8941 <p>Gradient descent algorithms have been used in countless applications since
8942 the inception of Newton's method. The explosion in the number of applications
8943 of neural networks has re-energized efforts in recent years to improve the
8944 standard gradient descent method in both efficiency and accuracy. These methods
8945 modify the effect of the gradient in updating the values of the parameters.
8946 These modifications often incorporate hyperparameters: additional variables
8947 whose values must be specified at the outset of the program. We provide, below,
8948 a novel gradient descent algorithm, called Gravilon, that uses the geometry of
8949 the hypersurface to modify the length of the step in the direction of the
8950 gradient. Using neural networks, we provide promising experimental results
8951 comparing the accuracy and efficiency of the Gravilon method against commonly
8952 used gradient descent algorithms on MNIST digit classification.
8953 </p>
8954 </description>
8955 </item>
8956 <item>
8957 <title>On the model-based stochastic value gradient for continuous reinforcement learning. (arXiv:2008.12775v2 [cs.LG] UPDATED)</title>
8958 <link>http://fr.arxiv.org/abs/2008.12775</link>
8959 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Amos_B/0/1/0/all/0/1">Brandon Amos</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stanton_S/0/1/0/all/0/1">Samuel Stanton</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yarats_D/0/1/0/all/0/1">Denis Yarats</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wilson_A/0/1/0/all/0/1">Andrew Gordon Wilson</a></p>
8960
8961 <p>Model-based reinforcement learning approaches add explicit domain knowledge
8962 to agents in hopes of improving the sample-efficiency in comparison to
8963 model-free agents. However, in practice model-based methods are unable to
8964 achieve the same asymptotic performance on challenging continuous control tasks
8965 due to the complexity of learning and controlling an explicit world model. In
8966 this paper we investigate the stochastic value gradient (SVG), which is a
8967 well-known family of methods for controlling continuous systems which includes
8968 model-based approaches that distill a model-based value expansion into a
8969 model-free policy. We consider a variant of the model-based SVG that scales to
8970 larger systems and uses 1) an entropy regularization to help with exploration,
8971 2) a learned deterministic world model to improve the short-horizon value
8972 estimate, and 3) a learned model-free value estimate after the model's rollout.
8973 This SVG variation captures the model-free soft actor-critic method as an
8974 instance when the model rollout horizon is zero, and otherwise uses
8975 short-horizon model rollouts to improve the value estimate for the policy
8976 update. We surpass the asymptotic performance of other model-based methods on
8977 the proprioceptive MuJoCo locomotion tasks from the OpenAI gym, including a
8978 humanoid. We notably achieve these results with a simple deterministic world
8979 model without requiring an ensemble.
8980 </p>
8981 </description>
8982 </item>
8983 <item>
8984 <title>Introduction to logistic regression. (arXiv:2008.13567v2 [stat.ME] UPDATED)</title>
8985 <link>http://fr.arxiv.org/abs/2008.13567</link>
8986 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Chung_M/0/1/0/all/0/1">Moo K. Chung</a></p>
8987
8988 <p>For random field theory based multiple comparison corrections In brain
8989 imaging, it is often necessary to compute the distribution of the supremum of a
8990 random field. Unfortunately, computing the distribution of the supremum of the
8991 random field is not easy and requires satisfying many distributional
8992 assumptions that may not be true in real data. Thus, there is a need to come up
8993 with a different framework that does not use the traditional statistical
8994 hypothesis testing paradigm that requires to compute p-values. With this as a
8995 motivation, we can use a different approach called the logistic regression that
8996 does not require computing the p-value and still be able to localize the
8997 regions of brain network differences. Unlike other discriminant and
8998 classification techniques that tried to classify preselected feature vectors,
8999 the method here does not require any preselected feature vectors and performs
9000 the classification at each edge level.
9001 </p>
9002 </description>
9003 </item>
9004 <item>
9005 <title>Individuation and Adaptation in Complex Systems. (arXiv:2009.00110v2 [cs.NE] UPDATED)</title>
9006 <link>http://fr.arxiv.org/abs/2009.00110</link>
9007 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fabbro_O/0/1/0/all/0/1">Olivier Del Fabbro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Christen_P/0/1/0/all/0/1">Patrik Christen</a></p>
9008
9009 <p>Complex systems have certain characteristics such as network structures of a
9010 large number of individual elements, adaptation, and emergence. While these
9011 characteristics have been studied and described, it is often not so clear where
9012 they exactly come from. There is a focus on concrete system states rather than
9013 the emergence of the computer models themselves used to study these systems. To
9014 better understand typical characteristics of complex systems and their
9015 emergence, we recently presented a system metamodel based on which computer
9016 models can be created from abstract building blocks. In this study we extend
9017 our system metamodel with the concept of adaption in order to integrate
9018 adaptive computation in our so-called allagmatic method - a framework
9019 consisting of the system metamodel but also a way to study the creation of the
9020 computer model itself. Running experiments with cellular automata and
9021 artificial neural networks, we find that the system metamodel integrates
9022 adaptation with an additional operation called adaptation function that
9023 operates on the update function, which encodes the system's dynamics. It allows
9024 the creation of adaptive computations by providing an abstract template for
9025 adaptation and guidance for implementation. Further, the object-oriented and
9026 template meta-programming leads to a creation of computer models comparable to
9027 the individuation of observed systems. It therefore allows to study not only
9028 the behaviour of a running model but also its creation. The development of the
9029 system metamodel was first inspired by concepts of the philosophy of
9030 individuation of Gilbert Simondon. The theoretical background for the concept
9031 of adaptation is taken from the philosophy of organism of Alfred North
9032 Whitehead. In general, through the possibility to follow individuation, the
9033 allagmatic method allows to better understand the emergence of typical
9034 characteristics of complex systems.
9035 </p>
9036 </description>
9037 </item>
9038 <item>
9039 <title>Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning. (arXiv:2009.00142v4 [cs.LG] UPDATED)</title>
9040 <link>http://fr.arxiv.org/abs/2009.00142</link>
9041 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_P/0/1/0/all/0/1">Pan Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yanbang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_H/0/1/0/all/0/1">Hongwei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leskovec_J/0/1/0/all/0/1">Jure Leskovec</a></p>
9042
9043 <p>Learning representations of sets of nodes in a graph is crucial for
9044 applications ranging from node-role discovery to link prediction and molecule
9045 classification. Graph Neural Networks (GNNs) have achieved great success in
9046 graph representation learning. However, expressive power of GNNs is limited by
9047 the 1-Weisfeiler-Lehman (WL) test and thus GNNs generate identical
9048 representations for graph substructures that may in fact be very different.
9049 More powerful GNNs, proposed recently by mimicking higher-order-WL tests, only
9050 focus on representing entire graphs and they are computationally inefficient as
9051 they cannot utilize sparsity of the underlying graph. Here we propose and
9052 mathematically analyze a general class of structure-related features, termed
9053 Distance Encoding (DE). DE assists GNNs in representing any set of nodes, while
9054 providing strictly more expressive power than the 1-WL test. DE captures the
9055 distance between the node set whose representation is to be learned and each
9056 node in the graph. To capture the distance DE can apply various graph-distance
9057 measures such as shortest path distance or generalized PageRank scores. We
9058 propose two ways for GNNs to use DEs (1) as extra node features, and (2) as
9059 controllers of message aggregation in GNNs. Both approaches can utilize the
9060 sparse structure of the underlying graph, which leads to computational
9061 efficiency and scalability. We also prove that DE can distinguish node sets
9062 embedded in almost all regular graphs where traditional GNNs always fail. We
9063 evaluate DE on three tasks over six real networks: structural role prediction,
9064 link prediction, and triangle prediction. Results show that our models
9065 outperform GNNs without DE by up-to 15\% in accuracy and AUROC. Furthermore,
9066 our models also significantly outperform other state-of-the-art methods
9067 especially designed for the above tasks.
9068 </p>
9069 </description>
9070 </item>
9071 <item>
9072 <title>Accelerated reactive transport simulations in heterogeneous porous media using Reaktoro and Firedrake. (arXiv:2009.01194v2 [cs.CE] UPDATED)</title>
9073 <link>http://fr.arxiv.org/abs/2009.01194</link>
9074 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kyas_S/0/1/0/all/0/1">Svetlana Kyas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Volpatto_D/0/1/0/all/0/1">Diego Volpatto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Saar_M/0/1/0/all/0/1">Martin O. Saar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leal_A/0/1/0/all/0/1">Allan M. M. Leal</a></p>
9075
9076 <p>This work investigates the performance of the on-demand machine learning
9077 (ODML) algorithm introduced in Leal et al. (2020) when applied to different
9078 reactive transport problems in heterogeneous porous media. ODML was devised to
9079 accelerate the computationally expensive geochemical reaction calculations in
9080 reactive transport simulations. We demonstrate that the ODML algorithm speeds
9081 up these calculations by one to three orders of magnitude. Such acceleration,
9082 in turn, significantly accelerates the entire reactive transport simulation.
9083 The numerical experiments are performed by implementing the coupling of two
9084 open-source software packages: Reaktoro (Leal, 2015) and Firedrake (Rathgeber
9085 et al., 2016).
9086 </p>
9087 </description>
9088 </item>
9089 <item>
9090 <title>Analysis of Uplink IRS-Assisted NOMA under Nakagami-m Fading via Moments Matching. (arXiv:2009.03133v2 [cs.IT] UPDATED)</title>
9091 <link>http://fr.arxiv.org/abs/2009.03133</link>
9092 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tahir_B/0/1/0/all/0/1">Bashar Tahir</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schwarz_S/0/1/0/all/0/1">Stefan Schwarz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rupp_M/0/1/0/all/0/1">Markus Rupp</a></p>
9093
9094 <p>This letter investigates the uplink outage performance of intelligent
9095 reflecting surface (IRS)-assisted non-orthogonal multiple access (NOMA). We
9096 consider the general case where all users have both direct and reflection
9097 links, and all links undergo Nakagami-m fading. We approximate the received
9098 powers of the NOMA users as Gamma random variables via moments matching. This
9099 allows for tractable expressions of the outage under interference cancellation
9100 (IC), while being flexible in modeling various propagation environments. Our
9101 analysis shows that under certain conditions, the presence of an IRS might
9102 degrade the performance of users that have dominant line-of-sight (LOS) to the
9103 base station (BS), while users dominated by non-line-of-sight (NLOS) will
9104 always benefit from it.
9105 </p>
9106 </description>
9107 </item>
9108 <item>
9109 <title>Physically Embedded Planning Problems: New Challenges for Reinforcement Learning. (arXiv:2009.05524v2 [cs.AI] UPDATED)</title>
9110 <link>http://fr.arxiv.org/abs/2009.05524</link>
9111 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mirza_M/0/1/0/all/0/1">Mehdi Mirza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jaegle_A/0/1/0/all/0/1">Andrew Jaegle</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hunt_J/0/1/0/all/0/1">Jonathan J. Hunt</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guez_A/0/1/0/all/0/1">Arthur Guez</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tunyasuvunakool_S/0/1/0/all/0/1">Saran Tunyasuvunakool</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Muldal_A/0/1/0/all/0/1">Alistair Muldal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weber_T/0/1/0/all/0/1">Th&#xe9;ophane Weber</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Karkus_P/0/1/0/all/0/1">Peter Karkus</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Racaniere_S/0/1/0/all/0/1">S&#xe9;bastien Racani&#xe8;re</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Buesing_L/0/1/0/all/0/1">Lars Buesing</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lillicrap_T/0/1/0/all/0/1">Timothy Lillicrap</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heess_N/0/1/0/all/0/1">Nicolas Heess</a></p>
9112
9113 <p>Recent work in deep reinforcement learning (RL) has produced algorithms
9114 capable of mastering challenging games such as Go, chess, or shogi. In these
9115 works the RL agent directly observes the natural state of the game and controls
9116 that state directly with its actions. However, when humans play such games,
9117 they do not just reason about the moves but also interact with their physical
9118 environment. They understand the state of the game by looking at the physical
9119 board in front of them and modify it by manipulating pieces using touch and
9120 fine-grained motor control. Mastering complicated physical systems with
9121 abstract goals is a central challenge for artificial intelligence, but it
9122 remains out of reach for existing RL algorithms. To encourage progress towards
9123 this goal we introduce a set of physically embedded planning problems and make
9124 them publicly available. We embed challenging symbolic tasks (Sokoban,
9125 tic-tac-toe, and Go) in a physics engine to produce a set of tasks that require
9126 perception, reasoning, and motor control over long time horizons. Although
9127 existing RL algorithms can tackle the symbolic versions of these tasks, we find
9128 that they struggle to master even the simplest of their physically embedded
9129 counterparts. As a first step towards characterizing the space of solution to
9130 these tasks, we introduce a strong baseline that uses a pre-trained expert game
9131 player to provide hints in the abstract space to an RL agent's policy while
9132 training it on the full sensorimotor control task. The resulting agent solves
9133 many of the tasks, underlining the need for methods that bridge the gap between
9134 abstract planning and embodied control. See illustrating video at
9135 https://youtu.be/RwHiHlym_1k.
9136 </p>
9137 </description>
9138 </item>
9139 <item>
9140 <title>Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses. (arXiv:2009.07165v3 [cs.LG] UPDATED)</title>
9141 <link>http://fr.arxiv.org/abs/2009.07165</link>
9142 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rawal_K/0/1/0/all/0/1">Kaivalya Rawal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lakkaraju_H/0/1/0/all/0/1">Himabindu Lakkaraju</a></p>
9143
9144 <p>As predictive models are increasingly being deployed in high-stakes
9145 decision-making, there has been a lot of interest in developing algorithms
9146 which can provide recourses to affected individuals. While developing such
9147 tools is important, it is even more critical to analyse and interpret a
9148 predictive model, and vet it thoroughly to ensure that the recourses it offers
9149 are meaningful and non-discriminatory before it is deployed in the real world.
9150 To this end, we propose a novel model agnostic framework called Actionable
9151 Recourse Summaries (AReS) to construct global counterfactual explanations which
9152 provide an interpretable and accurate summary of recourses for the entire
9153 population. We formulate a novel objective which simultaneously optimizes for
9154 correctness of the recourses and interpretability of the explanations, while
9155 minimizing overall recourse costs across the entire population. More
9156 specifically, our objective enables us to learn, with optimality guarantees on
9157 recourse correctness, a small number of compact rule sets each of which capture
9158 recourses for well defined subpopulations within the data. We also demonstrate
9159 theoretically that several of the prior approaches proposed to generate
9160 recourses for individuals are special cases of our framework. Experimental
9161 evaluation with real world datasets and user studies demonstrate that our
9162 framework can provide decision makers with a comprehensive overview of
9163 recourses corresponding to any black box model, and consequently help detect
9164 undesirable model biases and discrimination.
9165 </p>
9166 </description>
9167 </item>
9168 <item>
9169 <title>CorDEL: A Contrastive Deep Learning Approach for Entity Linkage. (arXiv:2009.07203v2 [cs.DB] UPDATED)</title>
9170 <link>http://fr.arxiv.org/abs/2009.07203</link>
9171 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zhengyang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sisman_B/0/1/0/all/0/1">Bunyamin Sisman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_H/0/1/0/all/0/1">Hao Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_X/0/1/0/all/0/1">Xin Luna Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ji_S/0/1/0/all/0/1">Shuiwang Ji</a></p>
9172
9173 <p>Entity linkage (EL) is a critical problem in data cleaning and integration.
9174 In the past several decades, EL has typically been done by rule-based systems
9175 or traditional machine learning models with hand-curated features, both of
9176 which heavily depend on manual human inputs. With the ever-increasing growth of
9177 new data, deep learning (DL) based approaches have been proposed to alleviate
9178 the high cost of EL associated with the traditional models. Existing
9179 exploration of DL models for EL strictly follows the well-known twin-network
9180 architecture. However, we argue that the twin-network architecture is
9181 sub-optimal to EL, leading to inherent drawbacks of existing models. In order
9182 to address the drawbacks, we propose a novel and generic contrastive DL
9183 framework for EL. The proposed framework is able to capture both syntactic and
9184 semantic matching signals and pays attention to subtle but critical
9185 differences. Based on the framework, we develop a contrastive DL approach for
9186 EL, called CorDEL, with three powerful variants. We evaluate CorDEL with
9187 extensive experiments conducted on both public benchmark datasets and a
9188 real-world dataset. CorDEL outperforms previous state-of-the-art models by 5.2%
9189 on public benchmark datasets. Moreover, CorDEL yields a 2.4% improvement over
9190 the current best DL model on the real-world dataset, while reducing the number
9191 of training parameters by 97.6%.
9192 </p>
9193 </description>
9194 </item>
9195 <item>
9196 <title>Autoregressive Knowledge Distillation through Imitation Learning. (arXiv:2009.07253v2 [cs.CL] UPDATED)</title>
9197 <link>http://fr.arxiv.org/abs/2009.07253</link>
9198 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_A/0/1/0/all/0/1">Alexander Lin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wohlwend_J/0/1/0/all/0/1">Jeremy Wohlwend</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_H/0/1/0/all/0/1">Howard Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lei_T/0/1/0/all/0/1">Tao Lei</a></p>
9199
9200 <p>The performance of autoregressive models on natural language generation tasks
9201 has dramatically improved due to the adoption of deep, self-attentive
9202 architectures. However, these gains have come at the cost of hindering
9203 inference speed, making state-of-the-art models cumbersome to deploy in
9204 real-world, time-sensitive settings. We develop a compression technique for
9205 autoregressive models that is driven by an imitation learning perspective on
9206 knowledge distillation. The algorithm is designed to address the exposure bias
9207 problem. On prototypical language generation tasks such as translation and
9208 summarization, our method consistently outperforms other distillation
9209 algorithms, such as sequence-level knowledge distillation. Student models
9210 trained with our method attain 1.4 to 4.8 BLEU/ROUGE points higher than those
9211 trained from scratch, while increasing inference speed by up to 14 times in
9212 comparison to the teacher model.
9213 </p>
9214 </description>
9215 </item>
9216 <item>
9217 <title>Video based real-time positional tracker. (arXiv:2009.08276v3 [cs.CV] UPDATED)</title>
9218 <link>http://fr.arxiv.org/abs/2009.08276</link>
9219 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Albarracin_D/0/1/0/all/0/1">David Albarrac&#xed;n</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hormigo_J/0/1/0/all/0/1">Jes&#xfa;s Hormigo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fernandez_J/0/1/0/all/0/1">Jos&#xe9; David Fern&#xe1;ndez</a></p>
9220
9221 <p>We propose a system that uses video as the input to track the position of
9222 objects relative to their surrounding environment in real-time. The neural
9223 network employed is trained on a 100% synthetic dataset coming from our own
9224 automated generator. The positional tracker relies on a range of 1 to n video
9225 cameras placed around an arena of choice.
9226 </p>
9227 <p>The system returns the positions of the tracked objects relative to the
9228 broader world by understanding the overlapping matrices formed by the cameras
9229 and therefore these can be extrapolated into real world coordinates.
9230 </p>
9231 <p>In most cases, we achieve a higher update rate and positioning precision than
9232 any of the existing GPS-based systems, in particular for indoor objects or
9233 those occluded from clear sky.
9234 </p>
9235 </description>
9236 </item>
9237 <item>
9238 <title>An Embedded Index Code Construction Using Sub-packetization. (arXiv:2009.11329v2 [cs.IT] UPDATED)</title>
9239 <link>http://fr.arxiv.org/abs/2009.11329</link>
9240 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sasi_S/0/1/0/all/0/1">Shanuja Sasi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Aggarwal_V/0/1/0/all/0/1">Vaneet Aggarwal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rajan_B/0/1/0/all/0/1">B. Sundar Rajan</a></p>
9241
9242 <p>A variant of the index coding problem (ICP), the embedded index coding
9243 problem (EICP) was introduced in [A. Porter and M. Wootters, "Embedded index
9244 coding," ITW, Sweden, 2019] which was motivated by its application in
9245 distributed computing where every user can act as sender for other users and an
9246 algorithm for code construction was reported. The constructions depends on the
9247 computation of minrank of a matrix, which is computationally intensive. In [A.
9248 Mahesh, N. Sageer Karat and B. S. Rajan, "Min-rank of Embedded Index Coding
9249 Problems," ISIT, 2020], for EICP, a notion of side-information matrix was
9250 introduced and it was proved that the length of an optimal scalar linear index
9251 code is equal to the min-rank of the side-information matrix. The authors have
9252 provided an explicit code construction for a class of EICP -
9253 \textit{Consecutive and Symmetric Embedded Index Coding Problem (CS-EICP)}. We
9254 introduce the idea of sub-packetization of the messages in index coding
9255 problems to provide a novel code construction for CS-EICP in contrast to the
9256 scalar linear solutions provided in the prior works. For CS-EICP, the
9257 normalized rate, which is defined as the number of bits transmitted by all the
9258 users together normalized by the total number of bits of all the messages, for
9259 our construction is lesser than the normalized rate achieved by Mahesh et
9260 al.,for scalar linear codes.
9261 </p>
9262 </description>
9263 </item>
9264 <item>
9265 <title>Multi-scale Deep Neural Network (MscaleDNN) Methods for Oscillatory Stokes Flows in Complex Domains. (arXiv:2009.12729v2 [math.NA] UPDATED)</title>
9266 <link>http://fr.arxiv.org/abs/2009.12729</link>
9267 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Wang_B/0/1/0/all/0/1">Bo Wang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Zhang_W/0/1/0/all/0/1">Wenzhong Zhang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Cai_W/0/1/0/all/0/1">Wei Cai</a></p>
9268
9269 <p>In this paper, we study a multi-scale deep neural network (MscaleDNN) as a
9270 meshless numerical method for computing oscillatory Stokes flows in complex
9271 domains. The MscaleDNN employs a multi-scale structure in the design of its DNN
9272 using radial scalings to convert the approximation of high frequency components
9273 of the highly oscillatory Stokes solution to one of lower frequencies. The
9274 MscaleDNN solution to the Stokes problem is obtained by minimizing a loss
9275 function in terms of L2 normof the residual of the Stokes equation. Three forms
9276 of loss functions are investigated based on vorticity-velocity-pressure,
9277 velocity-stress-pressure, and velocity-gradient of velocity-pressure
9278 formulations of the Stokes equation. We first conduct a systematic study of the
9279 MscaleDNN methods with various loss functions on the Kovasznay flow in
9280 comparison with normal fully connected DNNs. Then, Stokes flows with highly
9281 oscillatory solutions in a 2-D domain with six randomly placed holes are
9282 simulated by the MscaleDNN. The results show that MscaleDNN has faster
9283 convergence and consistent error decays in the simulation of Kovasznay flow for
9284 all four tested loss functions. More importantly, the MscaleDNN is capable of
9285 learning highly oscillatory solutions when the normal DNNs fail to converge.
9286 </p>
9287 </description>
9288 </item>
9289 <item>
9290 <title>Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization. (arXiv:2009.12829v3 [cs.CV] UPDATED)</title>
9291 <link>http://fr.arxiv.org/abs/2009.12829</link>
9292 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_H/0/1/0/all/0/1">Haoliang Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">YuFei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wan_R/0/1/0/all/0/1">Renjie Wan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shiqi Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_T/0/1/0/all/0/1">Tie-Qiang Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kot_A/0/1/0/all/0/1">Alex C. Kot</a></p>
9293
9294 <p>Recently, we have witnessed great progress in the field of medical imaging
9295 classification by adopting deep neural networks. However, the recent advanced
9296 models still require accessing sufficiently large and representative datasets
9297 for training, which is often unfeasible in clinically realistic environments.
9298 When trained on limited datasets, the deep neural network is lack of
9299 generalization capability, as the trained deep neural network on data within a
9300 certain distribution (e.g. the data captured by a certain device vendor or
9301 patient population) may not be able to generalize to the data with another
9302 distribution.
9303 </p>
9304 <p>In this paper, we introduce a simple but effective approach to improve the
9305 generalization capability of deep neural networks in the field of medical
9306 imaging classification. Motivated by the observation that the domain
9307 variability of the medical images is to some extent compact, we propose to
9308 learn a representative feature space through variational encoding with a novel
9309 linear-dependency regularization term to capture the shareable information
9310 among medical data collected from different domains. As a result, the trained
9311 neural network is expected to equip with better generalization capability to
9312 the "unseen" medical data. Experimental results on two challenging medical
9313 imaging classification tasks indicate that our method can achieve better
9314 cross-domain generalization capability compared with state-of-the-art
9315 baselines.
9316 </p>
9317 </description>
9318 </item>
9319 <item>
9320 <title>Dual Attention Model for Citation Recommendation. (arXiv:2010.00182v4 [cs.IR] UPDATED)</title>
9321 <link>http://fr.arxiv.org/abs/2010.00182</link>
9322 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yang Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_Q/0/1/0/all/0/1">Qiang Ma</a></p>
9323
9324 <p>Based on an exponentially increasing number of academic articles, discovering
9325 and citing comprehensive and appropriate resources has become a non-trivial
9326 task. Conventional citation recommender methods suffer from severe information
9327 loss. For example, they do not consider the section of the paper that the user
9328 is writing and for which they need to find a citation, the relatedness between
9329 the words in the local context (the text span that describes a citation), or
9330 the importance on each word from the local context. These shortcomings make
9331 such methods insufficient for recommending adequate citations to academic
9332 manuscripts. In this study, we propose a novel embedding-based neural network
9333 called "dual attention model for citation recommendation (DACR)" to recommend
9334 citations during manuscript preparation. Our method adapts embedding of three
9335 dimensions of semantic information: words in the local context, structural
9336 contexts, and the section on which a user is working. A neural network is
9337 designed to maximize the similarity between the embedding of the three input
9338 (local context words, section and structural contexts) and the target citation
9339 appearing in the context. The core of the neural network is composed of
9340 self-attention and additive attention, where the former aims to capture the
9341 relatedness between the contextual words and structural context, and the latter
9342 aims to learn the importance of them. The experiments on real-world datasets
9343 demonstrate the effectiveness of the proposed approach.
9344 </p>
9345 </description>
9346 </item>
9347 <item>
9348 <title>Pretrained Language Model Embryology: The Birth of ALBERT. (arXiv:2010.02480v2 [cs.CL] UPDATED)</title>
9349 <link>http://fr.arxiv.org/abs/2010.02480</link>
9350 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chiang_C/0/1/0/all/0/1">Cheng-Han Chiang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_S/0/1/0/all/0/1">Sung-Feng Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_H/0/1/0/all/0/1">Hung-yi Lee</a></p>
9351
9352 <p>While behaviors of pretrained language models (LMs) have been thoroughly
9353 examined, what happened during pretraining is rarely studied. We thus
9354 investigate the developmental process from a set of randomly initialized
9355 parameters to a totipotent language model, which we refer to as the embryology
9356 of a pretrained language model. Our results show that ALBERT learns to
9357 reconstruct and predict tokens of different parts of speech (POS) in different
9358 learning speeds during pretraining. We also find that linguistic knowledge and
9359 world knowledge do not generally improve as pretraining proceeds, nor do
9360 downstream tasks' performance. These findings suggest that knowledge of a
9361 pretrained model varies during pretraining, and having more pretrain steps does
9362 not necessarily provide a model with more comprehensive knowledge. We will
9363 provide source codes and pretrained models to reproduce our results at
9364 https://github.com/d223302/albert-embryology.
9365 </p>
9366 </description>
9367 </item>
9368 <item>
9369 <title>Investigating African-American Vernacular English in Transformer-Based Text Generation. (arXiv:2010.02510v2 [cs.CL] UPDATED)</title>
9370 <link>http://fr.arxiv.org/abs/2010.02510</link>
9371 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Groenwold_S/0/1/0/all/0/1">Sophie Groenwold</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ou_L/0/1/0/all/0/1">Lily Ou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Parekh_A/0/1/0/all/0/1">Aesha Parekh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Honnavalli_S/0/1/0/all/0/1">Samhita Honnavalli</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Levy_S/0/1/0/all/0/1">Sharon Levy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mirza_D/0/1/0/all/0/1">Diba Mirza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_W/0/1/0/all/0/1">William Yang Wang</a></p>
9372
9373 <p>The growth of social media has encouraged the written use of African American
9374 Vernacular English (AAVE), which has traditionally been used only in oral
9375 contexts. However, NLP models have historically been developed using dominant
9376 English varieties, such as Standard American English (SAE), due to text corpora
9377 availability. We investigate the performance of GPT-2 on AAVE text by creating
9378 a dataset of intent-equivalent parallel AAVE/SAE tweet pairs, thereby isolating
9379 syntactic structure and AAVE- or SAE-specific language for each pair. We
9380 evaluate each sample and its GPT-2 generated text with pretrained sentiment
9381 classifiers and find that while AAVE text results in more classifications of
9382 negative sentiment than SAE, the use of GPT-2 generally increases occurrences
9383 of positive sentiment for both. Additionally, we conduct human evaluation of
9384 AAVE and SAE text generated with GPT-2 to compare contextual rigor and overall
9385 quality.
9386 </p>
9387 </description>
9388 </item>
9389 <item>
9390 <title>Improved Analysis of Clipping Algorithms for Non-convex Optimization. (arXiv:2010.02519v2 [cs.LG] UPDATED)</title>
9391 <link>http://fr.arxiv.org/abs/2010.02519</link>
9392 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_B/0/1/0/all/0/1">Bohang Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jin_J/0/1/0/all/0/1">Jikai Jin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fang_C/0/1/0/all/0/1">Cong Fang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_L/0/1/0/all/0/1">Liwei Wang</a></p>
9393
9394 <p>Gradient clipping is commonly used in training deep neural networks partly
9395 due to its practicability in relieving the exploding gradient problem.
9396 Recently, \citet{zhang2019gradient} show that clipped (stochastic) Gradient
9397 Descent (GD) converges faster than vanilla GD/SGD via introducing a new
9398 assumption called $(L_0, L_1)$-smoothness, which characterizes the violent
9399 fluctuation of gradients typically encountered in deep neural networks.
9400 However, their iteration complexities on the problem-dependent parameters are
9401 rather pessimistic, and theoretical justification of clipping combined with
9402 other crucial techniques, e.g. momentum acceleration, are still lacking. In
9403 this paper, we bridge the gap by presenting a general framework to study the
9404 clipping algorithms, which also takes momentum methods into consideration. We
9405 provide convergence analysis of the framework in both deterministic and
9406 stochastic setting, and demonstrate the tightness of our results by comparing
9407 them with existing lower bounds. Our results imply that the efficiency of
9408 clipping methods will not degenerate even in highly non-smooth regions of the
9409 landscape. Experiments confirm the superiority of clipping-based methods in
9410 deep learning tasks.
9411 </p>
9412 </description>
9413 </item>
9414 <item>
9415 <title>Improving Local Identifiability in Probabilistic Box Embeddings. (arXiv:2010.04831v2 [cs.LG] UPDATED)</title>
9416 <link>http://fr.arxiv.org/abs/2010.04831</link>
9417 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dasgupta_S/0/1/0/all/0/1">Shib Sankar Dasgupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Boratko_M/0/1/0/all/0/1">Michael Boratko</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_D/0/1/0/all/0/1">Dongxu Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vilnis_L/0/1/0/all/0/1">Luke Vilnis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_X/0/1/0/all/0/1">Xiang Lorraine Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+McCallum_A/0/1/0/all/0/1">Andrew McCallum</a></p>
9418
9419 <p>Geometric embeddings have recently received attention for their natural
9420 ability to represent transitive asymmetric relations via containment. Box
9421 embeddings, where objects are represented by n-dimensional hyperrectangles, are
9422 a particularly promising example of such an embedding as they are closed under
9423 intersection and their volume can be calculated easily, allowing them to
9424 naturally represent calibrated probability distributions. The benefits of
9425 geometric embeddings also introduce a problem of local identifiability,
9426 however, where whole neighborhoods of parameters result in equivalent loss
9427 which impedes learning. Prior work addressed some of these issues by using an
9428 approximation to Gaussian convolution over the box parameters, however, this
9429 intersection operation also increases the sparsity of the gradient. In this
9430 work, we model the box parameters with min and max Gumbel distributions, which
9431 were chosen such that space is still closed under the operation of the
9432 intersection. The calculation of the expected intersection volume involves all
9433 parameters, and we demonstrate experimentally that this drastically improves
9434 the ability of such models to learn.
9435 </p>
9436 </description>
9437 </item>
9438 <item>
9439 <title>Neural-Symbolic Reasoning on Knowledge Graphs. (arXiv:2010.05446v3 [cs.AI] UPDATED)</title>
9440 <link>http://fr.arxiv.org/abs/2010.05446</link>
9441 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jing Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1">Bo Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_L/0/1/0/all/0/1">Lingxi Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ke_X/0/1/0/all/0/1">Xirui Ke</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_H/0/1/0/all/0/1">Haipeng Ding</a></p>
9442
9443 <p>Knowledge graph reasoning is the fundamental component to support machine
9444 learning applications such as information extraction, information retrieval and
9445 recommendation. Since knowledge graph can be viewed as the discrete symbolic
9446 representations of knowledge, reasoning on knowledge graphs can naturally
9447 leverage the symbolic techniques. However, symbolic reasoning is intolerant of
9448 the ambiguous and noisy data. On the contrary, the recent advances of deep
9449 learning promote neural reasoning on knowledge graphs, which is robust to the
9450 ambiguous and noisy data, but lacks interpretability compared to symbolic
9451 reasoning. Considering the advantages and disadvantages of both methodologies,
9452 recent efforts have been made on combining the two reasoning methods. In this
9453 survey, we take a thorough look at the development of the symbolic reasoning,
9454 neural reasoning and the neural-symbolic reasoning on knowledge graphs. We
9455 survey two specific reasoning tasks, knowledge graph completion and question
9456 answering on knowledge graphs, and explain them in a unified reasoning
9457 framework. We also briefly discuss the future directions for knowledge graph
9458 reasoning.
9459 </p>
9460 </description>
9461 </item>
9462 <item>
9463 <title>On lattice point counting in $\Delta$-modular polyhedra. (arXiv:2010.05768v2 [cs.CC] UPDATED)</title>
9464 <link>http://fr.arxiv.org/abs/2010.05768</link>
9465 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gribanov_D/0/1/0/all/0/1">D.V. Gribanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zolotykh_N/0/1/0/all/0/1">N.Yu. Zolotykh</a></p>
9466
9467 <p>Let a polyhedron $P$ be defined by one of the following ways:
9468 </p>
9469 <p>(i) $P = \{x \in R^n \colon A x \leq b\}$, where $A \in Z^{(n+k) \times n}$,
9470 $b \in Z^{(n+k)}$ and $rank\, A = n$;
9471 </p>
9472 <p>(ii) $P = \{x \in R_+^n \colon A x = b\}$, where $A \in Z^{k \times n}$, $b
9473 \in Z^{k}$ and $rank\, A = k$.
9474 </p>
9475 <p>And let all rank order minors of $A$ be bounded by $\Delta$ in absolute
9476 values. We show that the short rational generating function for the power
9477 series $$ \sum\limits_{m \in P \cap Z^n} x^m $$ can be computed with the
9478 arithmetic complexity $ O\left(T_{SNF}(d) \cdot d^{k} \cdot d^{\log_2
9479 \Delta}\right), $ where $k$ and $\Delta$ are fixed, $d = \dim P$, and
9480 $T_{SNF}(m)$ is the complexity to compute the Smith Normal Form for $m \times
9481 m$ integer matrix. In particular, $d = n$ for the case (i) and $d = n-k$ for
9482 the case (ii).
9483 </p>
9484 <p>The simplest examples of polyhedra that meet conditions (i) or (ii) are the
9485 simplicies, the subset sum polytope and the knapsack or multidimensional
9486 knapsack polytopes.
9487 </p>
9488 <p>We apply these results to parametric polytopes, and show that the step
9489 polynomial representation of the function $c_P(y) = |P_{y} \cap Z^n|$, where
9490 $P_{y}$ is parametric polytope, can be computed by a polynomial time even in
9491 varying dimension if $P_{y}$ has a close structure to the cases (i) or (ii). As
9492 another consequence, we show that the coefficients $e_i(P,m)$ of the Ehrhart
9493 quasi-polynomial $$ \left| mP \cap Z^n\right| = \sum\limits_{j = 0}^n
9494 e_i(P,m)m^j $$ can be computed by a polynomial time algorithm for fixed $k$ and
9495 $\Delta$.
9496 </p>
9497 </description>
9498 </item>
9499 <item>
9500 <title>CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations. (arXiv:2010.06351v3 [cs.CL] UPDATED)</title>
9501 <link>http://fr.arxiv.org/abs/2010.06351</link>
9502 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Luo_F/0/1/0/all/0/1">Fuli Luo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_P/0/1/0/all/0/1">Pengcheng Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_S/0/1/0/all/0/1">Shicheng Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_X/0/1/0/all/0/1">Xuancheng Ren</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sun_X/0/1/0/all/0/1">Xu Sun</a></p>
9503
9504 <p>Pre-trained self-supervised models such as BERT have achieved striking
9505 success in learning sequence representations, especially for natural language
9506 processing. These models typically corrupt the given sequences with certain
9507 types of noise, such as masking, shuffling, or substitution, and then try to
9508 recover the original input. However, such pre-training approaches are prone to
9509 learning representations that are covariant with the noise, leading to the
9510 discrepancy between the pre-training and fine-tuning stage. To remedy this, we
9511 present ContrAstive Pre-Training (CAPT) to learn noise invariant sequence
9512 representations. The proposed CAPT encourages the consistency between
9513 representations of the original sequence and its corrupted version via
9514 unsupervised instance-wise training signals. In this way, it not only
9515 alleviates the pretrain-finetune discrepancy induced by the noise of
9516 pre-training, but also aids the pre-trained model in better capturing global
9517 semantics of the input via more effective sentence-level supervision. Different
9518 from most prior work that focuses on a particular modality, comprehensive
9519 empirical evidence on 11 natural language understanding and cross-modal tasks
9520 illustrates that CAPT is applicable for both language and vision-language
9521 tasks, and obtains surprisingly consistent improvement, including 0.6% absolute
9522 gain on GLUE benchmarks and 0.8% absolute increment on NLVR.
9523 </p>
9524 </description>
9525 </item>
9526 <item>
9527 <title>Spherical Knowledge Distillation. (arXiv:2010.07485v2 [cs.LG] UPDATED)</title>
9528 <link>http://fr.arxiv.org/abs/2010.07485</link>
9529 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_J/0/1/0/all/0/1">Jia Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_M/0/1/0/all/0/1">Minghao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_Y/0/1/0/all/0/1">Yao Hu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_C/0/1/0/all/0/1">Chen Zhu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_X/0/1/0/all/0/1">Xiaofei He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cai_D/0/1/0/all/0/1">Deng Cai</a></p>
9530
9531 <p>Knowledge distillation aims at obtaining a small but effective deep model by
9532 transferring knowledge from a much larger one. The previous approaches try to
9533 reach this goal by simply "logit-supervised" information transferring between
9534 the teacher and student, which somehow can be subsequently decomposed as the
9535 transfer of normalized logits and $l^2$ norm. We argue that the norm of logits
9536 is actually interference, which damages the efficiency in the transfer process.
9537 To address this problem, we propose Spherical Knowledge Distillation (SKD).
9538 Specifically, we project the teacher and the student's logits into a unit
9539 sphere, and then we can efficiently perform knowledge distillation on the
9540 sphere. We verify our argument via theoretical analysis and ablation study.
9541 Extensive experiments have demonstrated the superiority and scalability of our
9542 method over the SOTAs.
9543 </p>
9544 </description>
9545 </item>
9546 <item>
9547 <title>Measuring the Dynamic Impact of High-Speed Railways on Urban Interactions in China. (arXiv:2010.08182v3 [cs.SI] UPDATED)</title>
9548 <link>http://fr.arxiv.org/abs/2010.08182</link>
9549 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gong_J/0/1/0/all/0/1">Junfang Gong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_S/0/1/0/all/0/1">Shengwen Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ye_X/0/1/0/all/0/1">Xinyue Ye</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_Q/0/1/0/all/0/1">Qiong Peng</a></p>
9550
9551 <p>High-speed rail (HSR) has become an important mode of inter-city
9552 transportation between large cities. Inter-city interaction facilitated by HSR
9553 tends to play a more prominent role in promoting urban and regional economic
9554 integration and development. Quantifying the impact of HSR's interaction on
9555 cities and people is therefore crucial for long-term urban and regional
9556 development planning and policy making. We develop an evaluation framework
9557 using toponym information from social media as a proxy to estimate the dynamics
9558 of such interactions. This paper adopts two types of spatial information:
9559 toponyms from social media posts, and the geographical location information
9560 embedded in social media posts. The framework highlights the asymmetric nature
9561 of social interaction among cities, and proposes a series of metrics to
9562 quantify such impact from multiple perspectives, including interaction
9563 strength, spatial decay, and channel effect. The results show that HSRs not
9564 only greatly expand the uneven distribution of inter-city connections, but also
9565 significantly reshape the interactions that occur along HSR routes through the
9566 channel effect.
9567 </p>
9568 </description>
9569 </item>
9570 <item>
9571 <title>Learning Accurate Entropy Model with Global Reference for Image Compression. (arXiv:2010.08321v2 [eess.IV] UPDATED)</title>
9572 <link>http://fr.arxiv.org/abs/2010.08321</link>
9573 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Qian_Y/0/1/0/all/0/1">Yichen Qian</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Tan_Z/0/1/0/all/0/1">Zhiyu Tan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sun_X/0/1/0/all/0/1">Xiuyu Sun</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lin_M/0/1/0/all/0/1">Ming Lin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_D/0/1/0/all/0/1">Dongyang Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sun_Z/0/1/0/all/0/1">Zhenhong Sun</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_H/0/1/0/all/0/1">Hao Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jin_R/0/1/0/all/0/1">Rong Jin</a></p>
9574
9575 <p>In recent deep image compression neural networks, the entropy model plays a
9576 critical role in estimating the prior distribution of deep image encodings.
9577 Existing methods combine hyperprior with local context in the entropy
9578 estimation function. This greatly limits their performance due to the absence
9579 of a global vision. In this work, we propose a novel Global Reference Model for
9580 image compression to effectively leverage both the local and the global context
9581 information, leading to an enhanced compression rate. The proposed method scans
9582 decoded latents and then finds the most relevant latent to assist the
9583 distribution estimating of the current latent. A by-product of this work is the
9584 innovation of a mean-shifting GDN module that further improves the performance.
9585 Experimental results demonstrate that the proposed model outperforms the
9586 rate-distortion performance of most of the state-of-the-art methods in the
9587 industry.
9588 </p>
9589 </description>
9590 </item>
9591 <item>
9592 <title>A Grid-based Representation for Human Action Recognition. (arXiv:2010.08841v2 [cs.CV] UPDATED)</title>
9593 <link>http://fr.arxiv.org/abs/2010.08841</link>
9594 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lamghari_S/0/1/0/all/0/1">Soufiane Lamghari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bilodeau_G/0/1/0/all/0/1">Guillaume-Alexandre Bilodeau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Saunier_N/0/1/0/all/0/1">Nicolas Saunier</a></p>
9595
9596 <p>Human action recognition (HAR) in videos is a fundamental research topic in
9597 computer vision. It consists mainly in understanding actions performed by
9598 humans based on a sequence of visual observations. In recent years, HAR have
9599 witnessed significant progress, especially with the emergence of deep learning
9600 models. However, most of existing approaches for action recognition rely on
9601 information that is not always relevant for this task, and are limited in the
9602 way they fuse the temporal information. In this paper, we propose a novel
9603 method for human action recognition that encodes efficiently the most
9604 discriminative appearance information of an action with explicit attention on
9605 representative pose features, into a new compact grid representation. Our GRAR
9606 (Grid-based Representation for Action Recognition) method is tested on several
9607 benchmark datasets demonstrating that our model can accurately recognize human
9608 actions, despite intra-class appearance variations and occlusion challenges.
9609 </p>
9610 </description>
9611 </item>
9612 <item>
9613 <title>What breach? Measuring online awareness of security incidents by studying real-world browsing behavior. (arXiv:2010.09843v2 [cs.CR] UPDATED)</title>
9614 <link>http://fr.arxiv.org/abs/2010.09843</link>
9615 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bhagavatula_S/0/1/0/all/0/1">Sruti Bhagavatula</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bauer_L/0/1/0/all/0/1">Lujo Bauer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kapadia_A/0/1/0/all/0/1">Apu Kapadia</a></p>
9616
9617 <p>Awareness about security and privacy risks is important for developing good
9618 security habits. Learning about real-world security incidents and data breaches
9619 can alert people to the ways in which their information is vulnerable online,
9620 thus playing a significant role in encouraging safe security behavior. This
9621 paper examines 1) how often people read about security incidents online, 2) of
9622 those people, whether and to what extent they follow up with an action, e.g.,
9623 by trying to read more about the incident, and 3) what influences the
9624 likelihood that they will read about an incident and take some action. We study
9625 this by quantitatively examining real-world internet-browsing data from 303
9626 participants.
9627 </p>
9628 <p>Our findings present a bleak view of awareness of security incidents. Only
9629 17% of participants visited any web pages related to six widely publicized
9630 large-scale security incidents; few read about one even when an incident was
9631 likely to have affected them (e.g., the Equifax breach almost universally
9632 affected people with Equifax credit reports). We further found that more severe
9633 incidents as well as articles that constructively spoke about the incident
9634 inspired more action. We conclude with recommendations for specific future
9635 research and for enabling useful security incident information to reach more
9636 people.
9637 </p>
9638 </description>
9639 </item>
9640 <item>
9641 <title>VarGrad: A Low-Variance Gradient Estimator for Variational Inference. (arXiv:2010.10436v2 [stat.ML] UPDATED)</title>
9642 <link>http://fr.arxiv.org/abs/2010.10436</link>
9643 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Richter_L/0/1/0/all/0/1">Lorenz Richter</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Boustati_A/0/1/0/all/0/1">Ayman Boustati</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Nusken_N/0/1/0/all/0/1">Nikolas N&#xfc;sken</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Ruiz_F/0/1/0/all/0/1">Francisco J. R. Ruiz</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Akyildiz_O/0/1/0/all/0/1">&#xd6;mer Deniz Akyildiz</a></p>
9644
9645 <p>We analyse the properties of an unbiased gradient estimator of the ELBO for
9646 variational inference, based on the score function method with leave-one-out
9647 control variates. We show that this gradient estimator can be obtained using a
9648 new loss, defined as the variance of the log-ratio between the exact posterior
9649 and the variational approximation, which we call the $\textit{log-variance
9650 loss}$. Under certain conditions, the gradient of the log-variance loss equals
9651 the gradient of the (negative) ELBO. We show theoretically that this gradient
9652 estimator, which we call $\textit{VarGrad}$ due to its connection to the
9653 log-variance loss, exhibits lower variance than the score function method in
9654 certain settings, and that the leave-one-out control variate coefficients are
9655 close to the optimal ones. We empirically demonstrate that VarGrad offers a
9656 favourable variance versus computation trade-off compared to other
9657 state-of-the-art estimators on a discrete VAE.
9658 </p>
9659 </description>
9660 </item>
9661 <item>
9662 <title>A Coarse-To-Fine (C2F) Representation for End-To-End 6-DoF Grasp Detection. (arXiv:2010.10695v2 [cs.RO] UPDATED)</title>
9663 <link>http://fr.arxiv.org/abs/2010.10695</link>
9664 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jeng_K/0/1/0/all/0/1">Kuang-Yu Jeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1">Yueh-Cheng Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Z/0/1/0/all/0/1">Zhe Yu Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_J/0/1/0/all/0/1">Jen-Wei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chang_Y/0/1/0/all/0/1">Ya-Liang Chang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Su_H/0/1/0/all/0/1">Hung-Ting Su</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hsu_W/0/1/0/all/0/1">Winston Hsu</a></p>
9665
9666 <p>We proposed an end-to-end grasp detection network, Grasp Detection Network
9667 (GDN), cooperated with a novel coarse-to-fine (C2F) grasp representation design
9668 to detect diverse and accurate 6-DoF grasps based on point clouds. Compared to
9669 previous two-stage approaches which sample and evaluate multiple grasp
9670 candidates, our architecture is at least 20 times faster. It is also 8% and 40%
9671 more accurate in terms of the success rate in single object scenes and the
9672 complete rate in clutter scenes, respectively. Our method shows superior
9673 results among settings with different number of views and input points.
9674 Moreover, we propose a new AP-based metric which considers both rotation and
9675 transition errors, making it a more comprehensive evaluation tool for grasp
9676 detection models.
9677 </p>
9678 </description>
9679 </item>
9680 <item>
9681 <title>Model selection in reconciling hierarchical time series. (arXiv:2010.10742v2 [cs.LG] UPDATED)</title>
9682 <link>http://fr.arxiv.org/abs/2010.10742</link>
9683 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Abolghasemi_M/0/1/0/all/0/1">Mahdi Abolghasemi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hyndman_R/0/1/0/all/0/1">Rob J Hyndman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Spiliotis_E/0/1/0/all/0/1">Evangelos Spiliotis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bergmeir_C/0/1/0/all/0/1">Christoph Bergmeir</a></p>
9684
9685 <p>Model selection has been proven an effective strategy for improving accuracy
9686 in time series forecasting applications. However, when dealing with
9687 hierarchical time series, apart from selecting the most appropriate forecasting
9688 model, forecasters have also to select a suitable method for reconciling the
9689 base forecasts produced for each series to make sure they are coherent.
9690 Although some hierarchical forecasting methods like minimum trace are strongly
9691 supported both theoretically and empirically for reconciling the base
9692 forecasts, there are still circumstances under which they might not produce the
9693 most accurate results, being outperformed by other methods. In this paper we
9694 propose an approach for dynamically selecting the most appropriate hierarchical
9695 forecasting method and succeeding better forecasting accuracy along with
9696 coherence. The approach, to be called conditional hierarchical forecasting, is
9697 based on Machine Learning classification methods and uses time series features
9698 as leading indicators for performing the selection for each hierarchy examined
9699 considering a variety of alternatives. Our results suggest that conditional
9700 hierarchical forecasting leads to significantly more accurate forecasts than
9701 standard approaches, especially at lower hierarchical levels.
9702 </p>
9703 </description>
9704 </item>
9705 <item>
9706 <title>Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition. (arXiv:2010.10759v3 [cs.SD] UPDATED)</title>
9707 <link>http://fr.arxiv.org/abs/2010.10759</link>
9708 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shi_Y/0/1/0/all/0/1">Yangyang Shi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yongqiang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_C/0/1/0/all/0/1">Chunyang Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yeh_C/0/1/0/all/0/1">Ching-Feng Yeh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chan_J/0/1/0/all/0/1">Julian Chan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_F/0/1/0/all/0/1">Frank Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Le_D/0/1/0/all/0/1">Duc Le</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Seltzer_M/0/1/0/all/0/1">Mike Seltzer</a></p>
9709
9710 <p>This paper proposes an efficient memory transformer Emformer for low latency
9711 streaming speech recognition. In Emformer, the long-range history context is
9712 distilled into an augmented memory bank to reduce self-attention's computation
9713 complexity. A cache mechanism saves the computation for the key and value in
9714 self-attention for the left context. Emformer applies a parallelized block
9715 processing in training to support low latency models. We carry out experiments
9716 on benchmark LibriSpeech data. Under average latency of 960 ms, Emformer gets
9717 WER $2.50\%$ on test-clean and $5.62\%$ on test-other. Comparing with a strong
9718 baseline augmented memory transformer (AM-TRF), Emformer gets $4.6$ folds
9719 training speedup and $18\%$ relative real-time factor (RTF) reduction in
9720 decoding with relative WER reduction $17\%$ on test-clean and $9\%$ on
9721 test-other. For a low latency scenario with an average latency of 80 ms,
9722 Emformer achieves WER $3.01\%$ on test-clean and $7.09\%$ on test-other.
9723 Comparing with the LSTM baseline with the same latency and model size, Emformer
9724 gets relative WER reduction $9\%$ and $16\%$ on test-clean and test-other,
9725 respectively.
9726 </p>
9727 </description>
9728 </item>
9729 <item>
9730 <title>Large-Scale High PV Power Grid Dynamic Model Development -- A Case Study on the U.S. Eastern Interconnection. (arXiv:2010.11150v2 [eess.SY] UPDATED)</title>
9731 <link>http://fr.arxiv.org/abs/2010.11150</link>
9732 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+You_S/0/1/0/all/0/1">Shutang You</a></p>
9733
9734 <p>Power systems are undergoing a transformation toward a low-carbon
9735 non-synchronous generation portfolio. A major concern for system planners and
9736 operators is the system dynamics in the high renewable penetration future.
9737 Because of the scale of the system and numerous components involved, it is
9738 extremely difficult to develop high PV dynamic models based upon actual power
9739 system models. The main contribution of this paper is providing an example of
9740 developing high PV penetration models based on the validated dynamic model of
9741 an actual large-scale power grid - the U.S. Eastern Interconnection system. The
9742 displacement of conventional generators by PV is realized by optimization.
9743 Combining the PV distribution optimization and the validated dynamic model
9744 information, this approach avoids the uncertainties brought about by
9745 transmission planning. As the existing dynamic models can be validated by
9746 measurements, this approach improves the credibility of the high PV models in
9747 representing future power grids. This generic approach can be applied to
9748 develop high PV dynamic models for other actual large-scale systems.
9749 </p>
9750 </description>
9751 </item>
9752 <item>
9753 <title>Build Smart Grids on Artificial Intelligence -- A Real-world Example. (arXiv:2010.11175v2 [eess.SY] UPDATED)</title>
9754 <link>http://fr.arxiv.org/abs/2010.11175</link>
9755 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+You_S/0/1/0/all/0/1">Shutang You</a></p>
9756
9757 <p>Power grid data are going big with the deployment of various sensors. The big
9758 data in power grids creates huge opportunities for applying artificial
9759 intelligence technologies to improve resilience and reliability. This paper
9760 introduces multiple real-world applications based on artificial intelligence to
9761 improve power grid situational awareness and resilience. These applications
9762 include event identification, inertia estimation, event location and magnitude
9763 estimation, data authentication, control, and stability assessment. These
9764 applications are operating on a real-world system called FNET-GridEye, which is
9765 a wide-area measurement network and arguably the world-largest cyber-physical
9766 system that collects power grid big data. These applications showed much better
9767 performance compared with conventional approaches and accomplished new tasks
9768 that are impossible to realized using conventional technologies. These
9769 encouraging results demonstrate that combining power grid big data and
9770 artificial intelligence can uncover and capture the non-linear correlation
9771 between power grid data and its stabilities indices and will potentially enable
9772 many advanced applications that can significantly improve power grid
9773 resilience.
9774 </p>
9775 </description>
9776 </item>
9777 <item>
9778 <title>NightOwl: Robotic Platform for Wheeled Service Robot. (arXiv:2010.11505v2 [cs.RO] UPDATED)</title>
9779 <link>http://fr.arxiv.org/abs/2010.11505</link>
9780 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Al_Fahsi_R/0/1/0/all/0/1">Resha Dwika Hefni Al-Fahsi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Winanta_K/0/1/0/all/0/1">Kevin Aldian Winanta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pradana_F/0/1/0/all/0/1">Fauzan Pradana</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ardiyanto_I/0/1/0/all/0/1">Igi Ardiyanto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cahyadi_A/0/1/0/all/0/1">Adha Imam Cahyadi</a></p>
9781
9782 <p>NightOwl is a robotic platform designed exclusively for a wheeled service
9783 robot. The robot navigates autonomously in omnidirectional fashion movement and
9784 equipped with LIDAR to sense the surrounding area. The platform itself was
9785 built using the Robot Operating System (ROS) and written in two different
9786 programming languages (C++ and Python). NightOwl is composed of several modular
9787 programs, namely hardware controller, light detection and ranging (LIDAR),
9788 simultaneous localization and mapping (SLAM), world model, path planning, robot
9789 control, communication, and behaviour. The programs run in parallel and
9790 communicate reciprocally to share various information. This paper explains the
9791 role of modular programs in the term of input, process, and output. In
9792 addition, NightOwl provides simulation visualized in both Gazebo and RViz. The
9793 robot in its environment is visualized by Gazebo. Sensor data from LIDAR and
9794 results from SLAM will be visualized by RViz.
9795 </p>
9796 </description>
9797 </item>
9798 <item>
9799 <title>Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity. (arXiv:2010.11775v2 [cs.LG] UPDATED)</title>
9800 <link>http://fr.arxiv.org/abs/2010.11775</link>
9801 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_S/0/1/0/all/0/1">Shuxiao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_H/0/1/0/all/0/1">Hangfeng He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Su_W/0/1/0/all/0/1">Weijie J. Su</a></p>
9802
9803 <p>As a popular approach to modeling the dynamics of training overparametrized
9804 neural networks (NNs), the neural tangent kernels (NTK) are known to fall
9805 behind real-world NNs in generalization ability. This performance gap is in
9806 part due to the \textit{label agnostic} nature of the NTK, which renders the
9807 resulting kernel not as \textit{locally elastic} as NNs~\citep{he2019local}. In
9808 this paper, we introduce a novel approach from the perspective of
9809 \emph{label-awareness} to reduce this gap for the NTK. Specifically, we propose
9810 two label-aware kernels that are each a superimposition of a label-agnostic
9811 part and a hierarchy of label-aware parts with increasing complexity of label
9812 dependence, using the Hoeffding decomposition. Through both theoretical and
9813 empirical evidence, we show that the models trained with the proposed kernels
9814 better simulate NNs in terms of generalization ability and local elasticity.
9815 </p>
9816 </description>
9817 </item>
9818 <item>
9819 <title>The Polynomial Method is Universal for Distribution-Free Correlational SQ Learning. (arXiv:2010.11925v2 [cs.DS] UPDATED)</title>
9820 <link>http://fr.arxiv.org/abs/2010.11925</link>
9821 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gollakota_A/0/1/0/all/0/1">Aravind Gollakota</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Karmalkar_S/0/1/0/all/0/1">Sushrut Karmalkar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Klivans_A/0/1/0/all/0/1">Adam Klivans</a></p>
9822
9823 <p>We consider the problem of distribution-free learning for Boolean function
9824 classes in the PAC and agnostic models. Generalizing a recent beautiful work of
9825 Malach and Shalev-Shwartz (2020) who gave the first tight correlational SQ
9826 (CSQ) lower bounds for learning DNF formulas, we show that lower bounds on the
9827 threshold or approximate degree of any function class directly imply CSQ lower
9828 bounds for PAC or agnostic learning respectively. These match corresponding
9829 positive results using upper bounds on the threshold or approximate degree in
9830 the SQ model for PAC or agnostic learning. Many of these results were implicit
9831 in earlier works of Feldman and Sherstov.
9832 </p>
9833 </description>
9834 </item>
9835 <item>
9836 <title>Escape saddle points faster on manifolds via perturbed Riemannian stochastic recursive gradient. (arXiv:2010.12191v2 [math.OC] UPDATED)</title>
9837 <link>http://fr.arxiv.org/abs/2010.12191</link>
9838 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Han_A/0/1/0/all/0/1">Andi Han</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Gao_J/0/1/0/all/0/1">Junbin Gao</a></p>
9839
9840 <p>In this paper, we propose a variant of Riemannian stochastic recursive
9841 gradient method that can achieve second-order convergence guarantee and escape
9842 saddle points using simple perturbation. The idea is to perturb the iterates
9843 when gradient is small and carry out stochastic recursive gradient updates over
9844 tangent space. This avoids the complication of exploiting Riemannian geometry.
9845 We show that under finite-sum setting, our algorithm requires
9846 $\widetilde{\mathcal{O}}\big( \frac{ \sqrt{n}}{\epsilon^2} + \frac{\sqrt{n}
9847 }{\delta^4} + \frac{n}{\delta^3}\big)$ stochastic gradient queries to find a
9848 $(\epsilon, \delta)$-second-order critical point. This strictly improves the
9849 complexity of perturbed Riemannian gradient descent and is superior to
9850 perturbed Riemannian accelerated gradient descent under large-sample settings.
9851 We also provide a complexity of $\widetilde{\mathcal{O}} \big(
9852 \frac{1}{\epsilon^3} + \frac{1}{\delta^3 \epsilon^2} + \frac{1}{\delta^4
9853 \epsilon} \big)$ for online optimization, which is novel on Riemannian manifold
9854 in terms of second-order convergence using only first-order information.
9855 </p>
9856 </description>
9857 </item>
9858 <item>
9859 <title>On the mechanical contribution of head stabilization to passive dynamics of anthropometric walkers. (arXiv:2010.12234v2 [cs.RO] UPDATED)</title>
9860 <link>http://fr.arxiv.org/abs/2010.12234</link>
9861 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Benallegue_M/0/1/0/all/0/1">Mehdi Benallegue</a> (AIST), <a href="http://fr.arxiv.org/find/cs/1/au:+Laumond_J/0/1/0/all/0/1">Jean-Paul Laumond</a> (DI-ENS), <a href="http://fr.arxiv.org/find/cs/1/au:+Berthoz_A/0/1/0/all/0/1">Alain Berthoz</a> (CdF (institution))</p>
9862
9863 <p>During the steady gait, humans stabilize their head around the vertical
9864 orientation. While there are sensori-cognitive explanations for this
9865 phenomenon, its mechanical e fect on the body dynamics remains un-explored. In
9866 this study, we take profit from the similarities that human steady gait share
9867 with the locomotion of passive dynamics robots. We introduce a simplified
9868 anthropometric D model to reproduce a broad walking dynamics. In a previous
9869 study, we showed heuristically that the presence of a stabilized head-neck
9870 system significantly influences the dynamics of walking. This paper gives new
9871 insights that lead to understanding this mechanical e fect. In particular, we
9872 introduce an original cart upper-body model that allows to better understand
9873 the mechanical interest of head stabilization when walking, and we study how
9874 this e fect is sensitive to the choice of control parameters.
9875 </p>
9876 </description>
9877 </item>
9878 <item>
9879 <title>Exploring task-based query expansion at the TREC-COVID track. (arXiv:2010.12674v2 [cs.IR] UPDATED)</title>
9880 <link>http://fr.arxiv.org/abs/2010.12674</link>
9881 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Schoegje_T/0/1/0/all/0/1">Thomas Schoegje</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kamphuis_C/0/1/0/all/0/1">Chris Kamphuis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dercksen_K/0/1/0/all/0/1">Koen Dercksen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hiemstra_D/0/1/0/all/0/1">Djoerd Hiemstra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pieters_T/0/1/0/all/0/1">Toine Pieters</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vries_A/0/1/0/all/0/1">Arjen de Vries</a></p>
9882
9883 <p>We explore how to generate effective queries based on search tasks. Our
9884 approach has three main steps: 1) identify search tasks based on research
9885 goals, 2) manually classify search queries according to those tasks, and 3)
9886 compare three methods to improve search rankings based on the task context. The
9887 most promising approach is based on expanding the user's query terms using task
9888 terms, which slightly improved the NDCG@20 scores over a BM25 baseline. Further
9889 improvements might be gained if we can identify more specific search tasks.
9890 </p>
9891 </description>
9892 </item>
9893 <item>
9894 <title>Adaptive In-network Collaborative Caching for Enhanced Ensemble Deep Learning at Edge. (arXiv:2010.12899v3 [cs.NI] UPDATED)</title>
9895 <link>http://fr.arxiv.org/abs/2010.12899</link>
9896 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_Y/0/1/0/all/0/1">Yana Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_D/0/1/0/all/0/1">Danye Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_Z/0/1/0/all/0/1">Zhiwei Xu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_J/0/1/0/all/0/1">Jie Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yujun Zhang</a></p>
9897
9898 <p>To enhance the quality and speed of data processing and protect the privacy
9899 and security of the data, edge computing has been extensively applied to
9900 support data-intensive intelligent processing services at edge. Among these
9901 data-intensive services, ensemble learning-based services can in natural
9902 leverage the distributed computation and storage resources at edge devices to
9903 achieve efficient data collection, processing, analysis.
9904 </p>
9905 <p>Collaborative caching has been applied in edge computing to support services
9906 close to the data source, in order to take the limited resources at edge
9907 devices to support high-performance ensemble learning solutions. To achieve
9908 this goal, we propose an adaptive in-network collaborative caching scheme for
9909 ensemble learning at edge. First, an efficient data representation structure is
9910 proposed to record cached data among different nodes. In addition, we design a
9911 collaboration scheme to facilitate edge nodes to cache valuable data for local
9912 ensemble learning, by scheduling local caching according to a summarization of
9913 data representations from different edge nodes. Our extensive simulations
9914 demonstrate the high performance of the proposed collaborative caching scheme,
9915 which significantly reduces the learning latency and the transmission overhead.
9916 </p>
9917 </description>
9918 </item>
9919 <item>
9920 <title>Lightning-Fast Gravitational Wave Parameter Inference through Neural Amortization. (arXiv:2010.12931v2 [astro-ph.IM] UPDATED)</title>
9921 <link>http://fr.arxiv.org/abs/2010.12931</link>
9922 <description><p>Authors: <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Delaunoy_A/0/1/0/all/0/1">Arnaud Delaunoy</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Wehenkel_A/0/1/0/all/0/1">Antoine Wehenkel</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Hinderer_T/0/1/0/all/0/1">Tanja Hinderer</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Nissanke_S/0/1/0/all/0/1">Samaya Nissanke</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Weniger_C/0/1/0/all/0/1">Christoph Weniger</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Williamson_A/0/1/0/all/0/1">Andrew R. Williamson</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Louppe_G/0/1/0/all/0/1">Gilles Louppe</a></p>
9923
9924 <p>Gravitational waves from compact binaries measured by the LIGO and Virgo
9925 detectors are routinely analyzed using Markov Chain Monte Carlo sampling
9926 algorithms. Because the evaluation of the likelihood function requires
9927 evaluating millions of waveform models that link between signal shapes and the
9928 source parameters, running Markov chains until convergence is typically
9929 expensive and requires days of computation. In this extended abstract, we
9930 provide a proof of concept that demonstrates how the latest advances in neural
9931 simulation-based inference can speed up the inference time by up to three
9932 orders of magnitude -- from days to minutes -- without impairing the
9933 performance. Our approach is based on a convolutional neural network modeling
9934 the likelihood-to-evidence ratio and entirely amortizes the computation of the
9935 posterior. We find that our model correctly estimates credible intervals for
9936 the parameters of simulated gravitational waves.
9937 </p>
9938 </description>
9939 </item>
9940 <item>
9941 <title>A Survey on Churn Analysis. (arXiv:2010.13119v2 [cs.LG] UPDATED)</title>
9942 <link>http://fr.arxiv.org/abs/2010.13119</link>
9943 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ahn_J/0/1/0/all/0/1">Jaehuyn Ahn</a></p>
9944
9945 <p>In this paper, I present churn prediction techniques that have been released
9946 so far. Churn prediction is used in the fields of Internet services, games,
9947 insurance, and management. However, since it has been used intensively to
9948 increase the predictability of various industry/academic fields, there is a big
9949 difference in its definition and utilization. In this paper, I collected the
9950 definitions of churn used in the fields of business administration, marketing,
9951 IT, telecommunications, newspapers, insurance and psychology, and described
9952 their differences. Based on this, I classified and explained churn loss,
9953 feature engineering, and prediction models. Our study can be used to select the
9954 definition of churn and its associated models suitable for the service field
9955 that researchers are most interested in by integrating fragmented churn studies
9956 in industry/academic fields.
9957 </p>
9958 </description>
9959 </item>
9960 <item>
9961 <title>Geometric Exploration for Online Control. (arXiv:2010.13178v2 [cs.LG] UPDATED)</title>
9962 <link>http://fr.arxiv.org/abs/2010.13178</link>
9963 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Plevrakis_O/0/1/0/all/0/1">Orestis Plevrakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hazan_E/0/1/0/all/0/1">Elad Hazan</a></p>
9964
9965 <p>We study the control of an \emph{unknown} linear dynamical system under
9966 general convex costs. The objective is minimizing regret vs. the class of
9967 disturbance-feedback-controllers, which encompasses all stabilizing
9968 linear-dynamical-controllers. In this work, we first consider the case of known
9969 cost functions, for which we design the first polynomial-time algorithm with
9970 $n^3\sqrt{T}$-regret, where $n$ is the dimension of the state plus the
9971 dimension of control input. The $\sqrt{T}$-horizon dependence is optimal, and
9972 improves upon the previous best known bound of $T^{2/3}$. The main component of
9973 our algorithm is a novel geometric exploration strategy: we adaptively
9974 construct a sequence of barycentric spanners in the policy space. Second, we
9975 consider the case of bandit feedback, for which we give the first
9976 polynomial-time algorithm with $poly(n)\sqrt{T}$-regret, building on Stochastic
9977 Bandit Convex Optimization.
9978 </p>
9979 </description>
9980 </item>
9981 <item>
9982 <title>Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach. (arXiv:2010.13273v2 [cs.IR] UPDATED)</title>
9983 <link>http://fr.arxiv.org/abs/2010.13273</link>
9984 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_Y/0/1/0/all/0/1">Yuyang Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Takeoka_K/0/1/0/all/0/1">Kunihiro Takeoka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xiao_C/0/1/0/all/0/1">Chuan Xiao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Oyamada_M/0/1/0/all/0/1">Masafumi Oyamada</a></p>
9985
9986 <p>Finding joinable tables in data lakes is key procedure in many applications
9987 such as data integration, data augmentation, data analysis, and data market.
9988 Traditional approaches that find equi-joinable tables are unable to deal with
9989 misspellings and different formats, nor do they capture any semantic joins. In
9990 this paper, we propose PEXESO, a framework for joinable table discovery in data
9991 lakes. We embed textual values as high-dimensional vectors and join columns
9992 under similarity predicates on high-dimensional vectors, hence to address the
9993 limitations of equi-join approaches and identify more meaningful results. To
9994 efficiently find joinable tables with similarity, we propose a block-and-verify
9995 method that utilizes pivot-based filtering. A partitioning technique is
9996 developed to cope with the case when the data lake is large and the index
9997 cannot fit in main memory. An experimental evaluation on real datasets shows
9998 that our solution identifies substantially more tables than equi-joins and
9999 outperforms other similarity-based options, and the join results are useful in
10000 data enrichment for machine learning tasks. The experiments also demonstrate
10001 the efficiency of the proposed method.
10002 </p>
10003 </description>
10004 </item>
10005 <item>
10006 <title>Malicious Requests Detection with Improved Bidirectional Long Short-term Memory Neural Networks. (arXiv:2010.13285v2 [cs.LG] UPDATED)</title>
10007 <link>http://fr.arxiv.org/abs/2010.13285</link>
10008 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_W/0/1/0/all/0/1">Wenhao Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_B/0/1/0/all/0/1">Bincheng Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jiajie Zhang</a></p>
10009
10010 <p>Detecting and intercepting malicious requests are one of the most widely used
10011 ways against attacks in the network security. Most existing detecting
10012 approaches, including matching blacklist characters and machine learning
10013 algorithms have all shown to be vulnerable to sophisticated attacks. To address
10014 the above issues, a more general and rigorous detection method is required. In
10015 this paper, we formulate the problem of detecting malicious requests as a
10016 temporal sequence classification problem, and propose a novel deep learning
10017 model namely Convolutional Neural Network-Bidirectional Long Short-term
10018 Memory-Convolutional Neural Network (CNN-BiLSTM-CNN). By connecting the shadow
10019 and deep feature maps of the convolutional layers, the malicious feature
10020 extracting ability is improved on more detailed functionality. Experimental
10021 results on HTTP dataset CSIC 2010 have demonstrated the effectiveness of the
10022 proposed method when compared with the state-of-the-arts.
10023 </p>
10024 </description>
10025 </item>
10026 <item>
10027 <title>Recent Developments on ESPnet Toolkit Boosted by Conformer. (arXiv:2010.13956v2 [eess.AS] UPDATED)</title>
10028 <link>http://fr.arxiv.org/abs/2010.13956</link>
10029 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Guo_P/0/1/0/all/0/1">Pengcheng Guo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Boyer_F/0/1/0/all/0/1">Florian Boyer</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chang_X/0/1/0/all/0/1">Xuankai Chang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Hayashi_T/0/1/0/all/0/1">Tomoki Hayashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Higuchi_Y/0/1/0/all/0/1">Yosuke Higuchi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Inaguma_H/0/1/0/all/0/1">Hirofumi Inaguma</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kamo_N/0/1/0/all/0/1">Naoyuki Kamo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_C/0/1/0/all/0/1">Chenda Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Garcia_Romero_D/0/1/0/all/0/1">Daniel Garcia-Romero</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shi_J/0/1/0/all/0/1">Jiatong Shi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shi_J/0/1/0/all/0/1">Jing Shi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Watanabe_S/0/1/0/all/0/1">Shinji Watanabe</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wei_K/0/1/0/all/0/1">Kun Wei</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_W/0/1/0/all/0/1">Wangyou Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_Y/0/1/0/all/0/1">Yuekai Zhang</a></p>
10030
10031 <p>In this study, we present recent developments on ESPnet: End-to-End Speech
10032 Processing toolkit, which mainly involves a recently proposed architecture
10033 called Conformer, Convolution-augmented Transformer. This paper shows the
10034 results for a wide range of end-to-end speech processing applications, such as
10035 automatic speech recognition (ASR), speech translations (ST), speech separation
10036 (SS) and text-to-speech (TTS). Our experiments reveal various training tips and
10037 significant performance benefits obtained with the Conformer on different
10038 tasks. These results are competitive or even outperform the current
10039 state-of-art Transformer models. We are preparing to release all-in-one recipes
10040 using open source and publicly available corpora for all the above tasks with
10041 pre-trained models. Our aim for this work is to contribute to our research
10042 community by reducing the burden of preparing state-of-the-art research
10043 environments usually requiring high resources.
10044 </p>
10045 </description>
10046 </item>
10047 <item>
10048 <title>Simultaenous Sieves: A Deterministic Streaming Algorithm for Non-Monotone Submodular Maximization. (arXiv:2010.14367v2 [cs.DS] UPDATED)</title>
10049 <link>http://fr.arxiv.org/abs/2010.14367</link>
10050 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kuhnle_A/0/1/0/all/0/1">Alan Kuhnle</a></p>
10051
10052 <p>In this work, we present a combinatorial, deterministic single-pass streaming
10053 algorithm for the problem of maximizing a submodular function, not necessarily
10054 monotone, with respect to a cardinality constraint (SMCC). In the case the
10055 function is monotone, our algorithm reduces to the optimal streaming algorithm
10056 of Badanidiyuru et al. (2014). In general, our algorithm achieves ratio $\alpha
10057 / (1 + \alpha) - \varepsilon$, for any $\varepsilon &gt; 0$, where $\alpha$ is the
10058 ratio of an offline (deterministic) algorithm for SMCC used for
10059 post-processing. Thus, if exponential computation time is allowed, our
10060 algorithm deterministically achieves nearly the optimal $1/2$ ratio. These
10061 results nearly match those of a recently proposed, randomized streaming
10062 algorithm that achieves the same ratios in expectation. For a deterministic,
10063 single-pass streaming algorithm, our algorithm achieves in polynomial time an
10064 improvement of the best approximation factor from $1/9$ of previous literature
10065 to $\approx 0.2689$.
10066 </p>
10067 </description>
10068 </item>
10069 <item>
10070 <title>Memory Optimization for Deep Networks. (arXiv:2010.14501v2 [cs.LG] UPDATED)</title>
10071 <link>http://fr.arxiv.org/abs/2010.14501</link>
10072 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shah_A/0/1/0/all/0/1">Aashaka Shah</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_C/0/1/0/all/0/1">Chao-Yuan Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohan_J/0/1/0/all/0/1">Jayashree Mohan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chidambaram_V/0/1/0/all/0/1">Vijay Chidambaram</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Krahenbuhl_P/0/1/0/all/0/1">Philipp Kr&#xe4;henb&#xfc;hl</a></p>
10073
10074 <p>Deep learning is slowly, but steadily, hitting a memory bottleneck. While the
10075 tensor computation in top-of-the-line GPUs increased by 32x over the last five
10076 years, the total available memory only grew by 2.5x. This prevents researchers
10077 from exploring larger architectures, as training large networks requires more
10078 memory for storing intermediate outputs. In this paper, we present MONeT, an
10079 automatic framework that minimizes both the memory footprint and computational
10080 overhead of deep networks. MONeT jointly optimizes the checkpointing schedule
10081 and the implementation of various operators. MONeT is able to outperform all
10082 prior hand-tuned operations as well as automated checkpointing. MONeT reduces
10083 the overall memory requirement by 3x for various PyTorch models, with a 9-16%
10084 overhead in computation. For the same computation cost, MONeT requires 1.2-1.8x
10085 less memory than current state-of-the-art automated checkpointing frameworks.
10086 Our code is available at https://github.com/utsaslab/MONeT.
10087 </p>
10088 </description>
10089 </item>
10090 <item>
10091 <title>Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus. (arXiv:2010.14571v2 [cs.CL] UPDATED)</title>
10092 <link>http://fr.arxiv.org/abs/2010.14571</link>
10093 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Caswell_I/0/1/0/all/0/1">Isaac Caswell</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Breiner_T/0/1/0/all/0/1">Theresa Breiner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Esch_D/0/1/0/all/0/1">Daan van Esch</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bapna_A/0/1/0/all/0/1">Ankur Bapna</a></p>
10094
10095 <p>Large text corpora are increasingly important for a wide variety of Natural
10096 Language Processing (NLP) tasks, and automatic language identification (LangID)
10097 is a core technology needed to collect such datasets in a multilingual context.
10098 LangID is largely treated as solved in the literature, with models reported
10099 that achieve over 90% average F1 on as many as 1,366 languages. We train LangID
10100 models on up to 1,629 languages with comparable quality on held-out test sets,
10101 but find that human-judged LangID accuracy for web-crawl text corpora created
10102 using these models is only around 5% for many lower-resource languages,
10103 suggesting a need for more robust evaluation. Further analysis revealed a
10104 variety of error modes, arising from domain mismatch, class imbalance, language
10105 similarity, and insufficiently expressive models. We propose two classes of
10106 techniques to mitigate these errors: wordlist-based tunable-precision filters
10107 (for which we release curated lists in about 500 languages) and
10108 transformer-based semi-supervised LangID models, which increase median dataset
10109 precision from 5.5% to 71.2%. These techniques enable us to create an initial
10110 data set covering 100K or more relatively clean sentences in each of 500+
10111 languages, paving the way towards a 1,000-language web text corpus.
10112 </p>
10113 </description>
10114 </item>
10115 <item>
10116 <title>Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports. (arXiv:2010.14584v2 [cs.CL] UPDATED)</title>
10117 <link>http://fr.arxiv.org/abs/2010.14584</link>
10118 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Edwards_A/0/1/0/all/0/1">Aleksandra Edwards</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rogers_D/0/1/0/all/0/1">David Rogers</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Camacho_Collados_J/0/1/0/all/0/1">Jose Camacho-Collados</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ribaupierre_H/0/1/0/all/0/1">H&#xe9;l&#xe8;ne de Ribaupierre</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Preece_A/0/1/0/all/0/1">Alun Preece</a></p>
10119
10120 <p>The task of text and sentence classification is associated with the need for
10121 large amounts of labelled training data. The acquisition of high volumes of
10122 labelled datasets can be expensive or unfeasible, especially for
10123 highly-specialised domains for which documents are hard to obtain. Research on
10124 the application of supervised classification based on small amounts of training
10125 data is limited. In this paper, we address the combination of state-of-the-art
10126 deep learning and classification methods and provide an insight into what
10127 combination of methods fit the needs of small, domain-specific, and
10128 terminologically-rich corpora. We focus on a real-world scenario related to a
10129 collection of safeguarding reports comprising learning experiences and
10130 reflections on tackling serious incidents involving children and vulnerable
10131 adults. The relatively small volume of available reports and their use of
10132 highly domain-specific terminology makes the application of automated
10133 approaches difficult. We focus on the problem of automatically identifying the
10134 main themes in a safeguarding report using supervised classification
10135 approaches. Our results show the potential of deep learning models to simulate
10136 subject-expert behaviour even for complex tasks with limited labelled data.
10137 </p>
10138 </description>
10139 </item>
10140 <item>
10141 <title>Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient. (arXiv:2010.14771v2 [cs.LG] UPDATED)</title>
10142 <link>http://fr.arxiv.org/abs/2010.14771</link>
10143 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tosatto_S/0/1/0/all/0/1">Samuele Tosatto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Carvalho_J/0/1/0/all/0/1">Jo&#xe3;o Carvalho</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Peters_J/0/1/0/all/0/1">Jan Peters</a></p>
10144
10145 <p>Off-policy Reinforcement Learning (RL) holds the promise of better data
10146 efficiency as it allows sample reuse and potentially enables safe interaction
10147 with the environment. Current off-policy policy gradient methods either suffer
10148 from high bias or high variance, delivering often unreliable estimates. The
10149 price of inefficiency becomes evident in real-world scenarios such as
10150 interaction-driven robot learning, where the success of RL has been rather
10151 limited, and a very high sample cost hinders straightforward application. In
10152 this paper, we propose a nonparametric Bellman equation, which can be solved in
10153 closed form. The solution is differentiable w.r.t the policy parameters and
10154 gives access to an estimation of the policy gradient. In this way, we avoid the
10155 high variance of importance sampling approaches, and the high bias of
10156 semi-gradient methods. We empirically analyze the quality of our gradient
10157 estimate against state-of-the-art methods, and show that it outperforms the
10158 baselines in terms of sample efficiency on classical control tasks.
10159 </p>
10160 </description>
10161 </item>
10162 <item>
10163 <title>Transferable Universal Adversarial Perturbations Using Generative Models. (arXiv:2010.14919v2 [cs.CV] UPDATED)</title>
10164 <link>http://fr.arxiv.org/abs/2010.14919</link>
10165 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hashemi_A/0/1/0/all/0/1">Atiye Sadat Hashemi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bar_A/0/1/0/all/0/1">Andreas B&#xe4;r</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mozaffari_S/0/1/0/all/0/1">Saeed Mozaffari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fingscheidt_T/0/1/0/all/0/1">Tim Fingscheidt</a></p>
10166
10167 <p>Deep neural networks tend to be vulnerable to adversarial perturbations,
10168 which by adding to a natural image can fool a respective model with high
10169 confidence. Recently, the existence of image-agnostic perturbations, also known
10170 as universal adversarial perturbations (UAPs), were discovered. However,
10171 existing UAPs still lack a sufficiently high fooling rate, when being applied
10172 to an unknown target model. In this paper, we propose a novel deep learning
10173 technique for generating more transferable UAPs. We utilize a perturbation
10174 generator and some given pretrained networks so-called source models to
10175 generate UAPs using the ImageNet dataset. Due to the similar feature
10176 representation of various model architectures in the first layer, we propose a
10177 loss formulation that focuses on the adversarial energy only in the respective
10178 first layer of the source models. This supports the transferability of our
10179 generated UAPs to any other target model. We further empirically analyze our
10180 generated UAPs and demonstrate that these perturbations generalize very well
10181 towards different target models. Surpassing the current state of the art in
10182 both, fooling rate and model-transferability, we can show the superiority of
10183 our proposed approach. Using our generated non-targeted UAPs, we obtain an
10184 average fooling rate of 93.36% on the source models (state of the art: 82.16%).
10185 Generating our UAPs on the deep ResNet-152, we obtain about a 12% absolute
10186 fooling rate advantage vs. cutting-edge methods on VGG-16 and VGG-19 target
10187 models.
10188 </p>
10189 </description>
10190 </item>
10191 <item>
10192 <title>Estimating Multiplicative Relations in Neural Networks. (arXiv:2010.15003v2 [cs.LG] UPDATED)</title>
10193 <link>http://fr.arxiv.org/abs/2010.15003</link>
10194 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goel_B/0/1/0/all/0/1">Bhaavan Goel</a></p>
10195
10196 <p>Universal approximation theorem suggests that a shallow neural network can
10197 approximate any function. The input to neurons at each layer is a weighted sum
10198 of previous layer neurons and then an activation is applied. These activation
10199 functions perform very well when the output is a linear combination of input
10200 data. When trying to learn a function which involves product of input data, the
10201 neural networks tend to overfit the data to approximate the function. In this
10202 paper we will use properties of logarithmic functions to propose a pair of
10203 activation functions which can translate products into linear expression and
10204 learn using backpropagation. We will try to generalize this approach for some
10205 complex arithmetic functions and test the accuracy on a disjoint distribution
10206 with the training set.
10207 </p>
10208 </description>
10209 </item>
10210 <item>
10211 <title>Benchmarking Parallelism in FaaS Platforms. (arXiv:2010.15032v2 [cs.DC] UPDATED)</title>
10212 <link>http://fr.arxiv.org/abs/2010.15032</link>
10213 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Barcelona_Pons_D/0/1/0/all/0/1">Daniel Barcelona-Pons</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Garcia_Lopez_P/0/1/0/all/0/1">Pedro Garc&#xed;a-L&#xf3;pez</a></p>
10214
10215 <p>Serverless computing has seen a myriad of work exploring its potential. Some
10216 systems tackle Function-as-a-Service (FaaS) properties on automatic elasticity
10217 and scale to run highly-parallel computing jobs. However, they focus on
10218 specific platforms and convey that their ideas can be extrapolated to any FaaS
10219 runtime.
10220 </p>
10221 <p>An important question arises: do all FaaS platforms fit parallel
10222 computations? In this paper, we argue that not all of them provide the
10223 necessary means to host highly-parallel applications. To validate our
10224 hypothesis, we create a comparative framework and categorize the architectures
10225 of four cloud FaaS offerings, with emphasis on parallel performance. We attest
10226 and extend this description with an empirical experiment that consists in
10227 plotting in deep detail the evolution of a parallel computing job on each
10228 service.
10229 </p>
10230 <p>The analysis of our results evinces that FaaS is not inherently good for
10231 parallel computations and architectural differences across platforms are
10232 decisive to categorize their performance. A key insight is the importance of
10233 virtualization technologies and the scheduling approach of FaaS platforms.
10234 Parallelism improves with lighter virtualization and proactive scheduling due
10235 to finer resource allocation and faster elasticity. This causes some platforms
10236 like AWS and IBM to perform well for highly-parallel computations, while others
10237 such as Azure present difficulties to achieve the required parallelism degree.
10238 Consequently, the information in this paper becomes of special interest to help
10239 users choose the most adequate infrastructure for their parallel applications.
10240 </p>
10241 </description>
10242 </item>
10243 <item>
10244 <title>Measuring non-trivial compositionality in emergent communication. (arXiv:2010.15058v2 [cs.NE] UPDATED)</title>
10245 <link>http://fr.arxiv.org/abs/2010.15058</link>
10246 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Korbak_T/0/1/0/all/0/1">Tomasz Korbak</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zubek_J/0/1/0/all/0/1">Julian Zubek</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Raczaszek_Leonardi_J/0/1/0/all/0/1">Joanna R&#x105;czaszek-Leonardi</a></p>
10247
10248 <p>Compositionality is an important explanatory target in emergent communication
10249 and language evolution. The vast majority of computational models of
10250 communication account for the emergence of only a very basic form of
10251 compositionality: trivial compositionality. A compositional protocol is
10252 trivially compositional if the meaning of a complex signal (e.g. blue circle)
10253 boils down to the intersection of meanings of its constituents (e.g. the
10254 intersection of the set of blue objects and the set of circles). A protocol is
10255 non-trivially compositional (NTC) if the meaning of a complex signal (e.g.
10256 biggest apple) is a more complex function of the meanings of their
10257 constituents. In this paper, we review several metrics of compositionality used
10258 in emergent communication and experimentally show that most of them fail to
10259 detect NTC - i.e. they treat non-trivial compositionality as a failure of
10260 compositionality. The one exception is tree reconstruction error, a metric
10261 motivated by formal accounts of compositionality. These results emphasise
10262 important limitations of emergent communication research that could hamper
10263 progress on modelling the emergence of NTC.
10264 </p>
10265 </description>
10266 </item>
10267 <item>
10268 <title>The fundamental equations of change in statistical ensembles and biological populations. (arXiv:2010.14544v1 [q-bio.PE] CROSS LISTED)</title>
10269 <link>http://fr.arxiv.org/abs/2010.14544</link>
10270 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-bio/1/au:+Frank_S/0/1/0/all/0/1">Steven A. Frank</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Bruggeman_F/0/1/0/all/0/1">Frank J. Bruggeman</a></p>
10271
10272 <p>A recent article in Nature Physics unified key results from thermodynamics,
10273 statistics, and information theory. The unification arose from a general
10274 equation for the rate of change in the information content of a system. The
10275 general equation describes the change in the moments of an observable quantity
10276 over a probability distribution. One term in the equation describes the change
10277 in the probability distribution. The other term describes the change in the
10278 observable values for a given state. We show the equivalence of this general
10279 equation for moment dynamics with the widely known Price equation from
10280 evolutionary theory, named after George Price. We introduce the Price equation
10281 from its biological roots, review a mathematically abstract form of the
10282 equation, and discuss the potential for this equation to unify diverse
10283 mathematical theories from different disciplines. The new work in Nature
10284 Physics and many applications in biology show that this equation also provides
10285 the basis for deriving many novel theoretical results within each discipline.
10286 </p>
10287 </description>
10288 </item>
10289 <item>
10290 <title>Generalized eigen, singular value, and partial least squares decompositions: The GSVD package. (arXiv:2010.14734v2 [cs.MS] CROSS LISTED)</title>
10291 <link>http://fr.arxiv.org/abs/2010.14734</link>
10292 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Beaton_D/0/1/0/all/0/1">Derek Beaton</a> (1) ((1) Rotman Research Institute, Baycrest Health Sciences)</p>
10293
10294 <p>The generalized singular value decomposition (GSVD, a.k.a. "SVD triplet",
10295 "duality diagram" approach) provides a unified strategy and basis to perform
10296 nearly all of the most common multivariate analyses (e.g., principal
10297 components, correspondence analysis, multidimensional scaling, canonical
10298 correlation, partial least squares). Though the GSVD is ubiquitous, powerful,
10299 and flexible, it has very few implementations. Here I introduce the GSVD
10300 package for R. The general goal of GSVD is to provide a small set of accessible
10301 functions to perform the GSVD and two other related decompositions (generalized
10302 eigenvalue decomposition, generalized partial least squares-singular value
10303 decomposition). Furthermore, GSVD helps provide a more unified conceptual
10304 approach and nomenclature to many techniques. I first introduce the concept of
10305 the GSVD, followed by a formal definition of the generalized decompositions.
10306 Next I provide some key decisions made during development, and then a number of
10307 examples of how to use GSVD to implement various statistical techniques. These
10308 examples also illustrate one of the goals of GSVD: how others can (or should)
10309 build analysis packages that depend on GSVD. Finally, I discuss the possible
10310 future of GSVD.
10311 </p>
10312 </description>
10313 </item>
10314 <item>
10315 <title>Continuous Chaotic Nonlinear System and Lyapunov controller Optimization using Deep Learning. (arXiv:2010.14746v1 [eess.SY] CROSS LISTED)</title>
10316 <link>http://fr.arxiv.org/abs/2010.14746</link>
10317 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Mahmoud_A/0/1/0/all/0/1">Amr Mahmoud</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ismaeil_Y/0/1/0/all/0/1">Youmna Ismaeil</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zohdy_M/0/1/0/all/0/1">Mohamed Zohdy</a></p>
10318
10319 <p>The introduction of unexpected system disturbances and new system dynamics
10320 does not allow initially selected static system and controller parameters to
10321 guarantee continued system stability and performance. In this research we
10322 present a novel approach for detecting early failure indicators of non-linear
10323 highly chaotic system and accordingly predict the best parameter calibrations
10324 to offset such instability using deep machine learning regression model. The
10325 approach proposed continuously monitors the system and controller signals. The
10326 Re-calibration of the system and controller parameters is triggered according
10327 to a set of conditions designed to maintain system stability without compromise
10328 to the system speed, intended outcome or required processing power. The deep
10329 neural model predicts the parameter values that would best counteract the
10330 expected system in-stability. To demonstrate the effectiveness of the proposed
10331 approach, it is applied to the non-linear complex combination of Duffing Van
10332 der pol oscillators. The approach is also tested under different scenarios the
10333 system and controller parameters are initially chosen incorrectly or the system
10334 parameters are changed while running or new system dynamics are introduced
10335 while running to measure effectiveness and reaction time.
10336 </p>
10337 </description>
10338 </item>
10339 </channel>
10340