arxiv.org.rss.20.xml - sfeed_tests - sfeed tests and RSS and Atom files
(HTM) git clone git://git.codemadness.org/sfeed_tests
(DIR) Log
(DIR) Files
(DIR) Refs
(DIR) README
(DIR) LICENSE
---
arxiv.org.rss.20.xml (832069B)
---
1 <?xml version="1.0" encoding="UTF-8"?>
2
3 <rss version="2.0"
4 xmlns:content="http://purl.org/rss/1.0/modules/content/"
5 xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
6 xmlns:dc="http://purl.org/dc/elements/1.1/"
7 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
8 xmlns:admin="http://webns.net/mvcb/"
9 >
10
11 <channel>
12 <title>cs updates on arXiv.org</title>
13 <link>http://fr.arxiv.org/</link>
14 <description>Computer Science (cs) updates on the arXiv.org e-print archive</description>
15 <language>en-us</language>
16 <pubDate>Fri, 30 Oct 2020 00:30:00 GMT</pubDate>
17 <lastBuildDate>Thu, 29 Oct 2020 20:30:00 -0500</lastBuildDate>
18 <managingEditor>www-admin@arxiv.org</managingEditor>
19
20 <image>
21 <title>arXiv.org</title>
22 <url>http://fr.arxiv.org/icons/sfx.gif</url>
23 <link>http://fr.arxiv.org/</link>
24 </image>
25 <item>
26 <title>Raw Audio for Depression Detection Can Be More Robust Against Gender Imbalance than Mel-Spectrogram Features. (arXiv:2010.15120v1 [cs.SD])</title>
27 <link>http://fr.arxiv.org/abs/2010.15120</link>
28 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bailey_A/0/1/0/all/0/1">Andrew Bailey</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plumbley_M/0/1/0/all/0/1">Mark D. Plumbley</a></p>
29
30 <p>Depression is a large-scale mental health problem and a challenging area for
31 machine learning researchers in terms of the detection of depression. Datasets
32 such as the Distress Analysis Interview Corpus - Wizard of Oz have been created
33 to aid research in this area. However, on top of the challenges inherent in
34 accurately detecting depression, biases in datasets may result in skewed
35 classification performance. In this paper we examine gender bias in the
36 DAIC-WOZ dataset using audio-based deep neural networks. We show that gender
37 biases in DAIC-WOZ can lead to an overreporting of performance, which has been
38 overlooked in the past due to the same gender biases being present in the test
39 set. By using raw audio and different concepts from Fair Machine Learning, such
40 as data re-distribution, we can mitigate against the harmful effects of bias.
41 </p>
42 </description>
43 <guid isPermaLink="false">oai:arXiv.org:2010.15120</guid>
44 </item>
45 <item>
46 <title>papaya2: 2D Irreducible Minkowski Tensor computation. (arXiv:2010.15138v1 [cs.GR])</title>
47 <link>http://fr.arxiv.org/abs/2010.15138</link>
48 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Schaller_F/0/1/0/all/0/1">Fabian M. Schaller</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wagner_J/0/1/0/all/0/1">Jenny Wagner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kapfer_S/0/1/0/all/0/1">Sebastian C. Kapfer</a></p>
49
50 <p>A common challenge in scientific and technical domains is the quantitative
51 description of geometries and shapes, e.g. in the analysis of microscope
52 imagery or astronomical observation data. Frequently, it is desirable to go
53 beyond scalar shape metrics such as porosity and surface to volume ratios
54 because the samples are anisotropic or because direction-dependent quantities
55 such as conductances or elasticity are of interest. Minkowski Tensors are a
56 systematic family of versatile and robust higher-order shape descriptors that
57 allow for shape characterization of arbitrary order and promise a path to
58 systematic structure-function relationships for direction-dependent properties.
59 Papaya2 is a software to calculate 2D higher-order shape metrics with a library
60 interface, support for Irreducible Minkowski Tensors and interpolated marching
61 squares. Extensions to Matlab, JavaScript and Python are provided as well.
62 While the tensor of inertia is computed by many tools, we are not aware of
63 other open-source software which provides higher-rank shape characterization in
64 2D.
65 </p>
66 </description>
67 <guid isPermaLink="false">oai:arXiv.org:2010.15138</guid>
68 </item>
69 <item>
70 <title>DeSMOG: Detecting Stance in Media On Global Warming. (arXiv:2010.15149v1 [cs.CL])</title>
71 <link>http://fr.arxiv.org/abs/2010.15149</link>
72 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Luo_Y/0/1/0/all/0/1">Yiwei Luo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Card_D/0/1/0/all/0/1">Dallas Card</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jurafsky_D/0/1/0/all/0/1">Dan Jurafsky</a></p>
73
74 <p>Citing opinions is a powerful yet understudied strategy in argumentation. For
75 example, an environmental activist might say, "Leading scientists agree that
76 global warming is a serious concern," framing a clause which affirms their own
77 stance ("that global warming is serious") as an opinion endorsed ("[scientists]
78 agree") by a reputable source ("leading"). In contrast, a global warming denier
79 might frame the same clause as the opinion of an untrustworthy source with a
80 predicate connoting doubt: "Mistaken scientists claim [...]." Our work studies
81 opinion-framing in the global warming (GW) debate, an increasingly partisan
82 issue that has received little attention in NLP. We introduce DeSMOG, a dataset
83 of stance-labeled GW sentences, and train a BERT classifier to study novel
84 aspects of argumentation in how different sides of a debate represent their own
85 and each other's opinions. From 56K news articles, we find that similar
86 linguistic devices for self-affirming and opponent-doubting discourse are used
87 across GW-accepting and skeptic media, though GW-skeptical media shows more
88 opponent-doubt. We also find that authors often characterize sources as
89 hypocritical, by ascribing opinions expressing the author's own view to source
90 entities known to publicly endorse the opposing view. We release our stance
91 dataset, model, and lexicons of framing devices for future work on
92 opinion-framing and the automatic detection of GW stance.
93 </p>
94 </description>
95 <guid isPermaLink="false">oai:arXiv.org:2010.15149</guid>
96 </item>
97 <item>
98 <title>On the Optimality and Convergence Properties of the Learning Model Predictive Controller. (arXiv:2010.15153v1 [math.OC])</title>
99 <link>http://fr.arxiv.org/abs/2010.15153</link>
100 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Rosolia_U/0/1/0/all/0/1">Ugo Rosolia</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lian_Y/0/1/0/all/0/1">Yingzhao Lian</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Maddalena_E/0/1/0/all/0/1">Emilio T. Maddalena</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ferrari_Trecate_G/0/1/0/all/0/1">Giancarlo Ferrari-Trecate</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Jones_C/0/1/0/all/0/1">Colin N. Jones</a></p>
101
102 <p>In this technical note we analyse the performance improvement and optimality
103 properties of the Learning Model Predictive Control (LMPC) strategy for linear
104 deterministic systems. The LMPC framework is a policy iteration scheme where
105 closed-loop trajectories are used to update the control policy for the next
106 execution of the control task. We show that, when a Linear Independence
107 Constraint Qualification (LICQ) condition holds, the LMPC scheme guarantees
108 strict iterative performance improvement and optimality, meaning that the
109 closed-loop cost evaluated over the entire task converges asymptotically to the
110 optimal cost of the infinite-horizon control problem. Compared to previous
111 works this sufficient LICQ condition can be easily checked, it holds for a
112 larger class of systems and it can be used to adaptively select the prediction
113 horizon of the controller, as demonstrated by a numerical example.
114 </p>
115 </description>
116 <guid isPermaLink="false">oai:arXiv.org:2010.15153</guid>
117 </item>
118 <item>
119 <title>Kernel Aggregated Fast Multipole Method: Efficient summation of Laplace and Stokes kernel functions. (arXiv:2010.15155v1 [math.NA])</title>
120 <link>http://fr.arxiv.org/abs/2010.15155</link>
121 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Yan_W/0/1/0/all/0/1">Wen Yan</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Blackwell_R/0/1/0/all/0/1">Robert Blackwell</a></p>
122
123 <p>Many different simulation methods for Stokes flow problems involve a common
124 computationally intense task---the summation of a kernel function over $O(N^2)$
125 pairs of points. One popular technique is the Kernel Independent Fast Multipole
126 Method (KIFMM), which constructs a spatial adaptive octree and places a small
127 number of equivalent multipole and local points around each octree box, and
128 completes the kernel sum with $O(N)$ performance. However, the KIFMM cannot be
129 used directly with nonlinear kernels, can be inefficient for complicated linear
130 kernels, and in general is difficult to implement compared to less-efficient
131 alternatives such as Ewald-type methods. Here we present the Kernel Aggregated
132 Fast Multipole Method (KAFMM), which overcomes these drawbacks by allowing
133 different kernel functions to be used for specific stages of octree traversal.
134 In many cases a simpler linear kernel suffices during the most extensive stage
135 of octree traversal, even for nonlinear kernel summation problems. The KAFMM
136 thereby improves computational efficiency in general and also allows efficient
137 evaluation of some nonlinear kernel functions such as the regularized
138 Stokeslet. We have implemented our method as an open-source software library
139 STKFMM with support for Laplace kernels, the Stokeslet, regularized Stokeslet,
140 Rotne-Prager-Yamakawa (RPY) tensor, and the Stokes double-layer and traction
141 operators. Open and periodic boundary conditions are supported for all kernels,
142 and the no-slip wall boundary condition is supported for the Stokeslet and RPY
143 tensor. The package is designed to be ready-to-use as well as being readily
144 extensible to additional kernels. Massive parallelism is supported with mixed
145 OpenMP and MPI.
146 </p>
147 </description>
148 <guid isPermaLink="false">oai:arXiv.org:2010.15155</guid>
149 </item>
150 <item>
151 <title>Diagnostic data integration using deep neural networks for real-time plasma analysis. (arXiv:2010.15156v1 [physics.comp-ph])</title>
152 <link>http://fr.arxiv.org/abs/2010.15156</link>
153 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Garola_A/0/1/0/all/0/1">A. Rigoni Garola</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Cavazzana_R/0/1/0/all/0/1">R. Cavazzana</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Gobbin_M/0/1/0/all/0/1">M. Gobbin</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Delogu_R/0/1/0/all/0/1">R.S. Delogu</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Manduchi_G/0/1/0/all/0/1">G. Manduchi</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Taliercio_C/0/1/0/all/0/1">C. Taliercio</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Luchetta_A/0/1/0/all/0/1">A. Luchetta</a></p>
154
155 <p>Recent advances in acquisition equipment is providing experiments with
156 growing amounts of precise yet affordable sensors. At the same time an improved
157 computational power, coming from new hardware resources (GPU, FPGA, ACAP), has
158 been made available at relatively low costs. This led us to explore the
159 possibility of completely renewing the chain of acquisition for a fusion
160 experiment, where many high-rate sources of data, coming from different
161 diagnostics, can be combined in a wide framework of algorithms. If on one hand
162 adding new data sources with different diagnostics enriches our knowledge about
163 physical aspects, on the other hand the dimensions of the overall model grow,
164 making relations among variables more and more opaque. A new approach for the
165 integration of such heterogeneous diagnostics, based on composition of deep
166 \textit{variational autoencoders}, could ease this problem, acting as a
167 structural sparse regularizer. This has been applied to RFX-mod experiment
168 data, integrating the soft X-ray linear images of plasma temperature with the
169 magnetic state.
170 </p>
171 <p>However to ensure a real-time signal analysis, those algorithmic techniques
172 must be adapted to run in well suited hardware. In particular it is shown that,
173 attempting a quantization of neurons transfer functions, such models can be
174 modified to create an embedded firmware. This firmware, approximating the deep
175 inference model to a set of simple operations, fits well with the simple logic
176 units that are largely abundant in FPGAs. This is the key factor that permits
177 the use of affordable hardware with complex deep neural topology and operates
178 them in real-time.
179 </p>
180 </description>
181 <guid isPermaLink="false">oai:arXiv.org:2010.15156</guid>
182 </item>
183 <item>
184 <title>Panoster: End-to-end Panoptic Segmentation of LiDAR Point Clouds. (arXiv:2010.15157v1 [cs.CV])</title>
185 <link>http://fr.arxiv.org/abs/2010.15157</link>
186 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gasperini_S/0/1/0/all/0/1">Stefano Gasperini</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mahani_M/0/1/0/all/0/1">Mohammad-Ali Nikouei Mahani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marcos_Ramiro_A/0/1/0/all/0/1">Alvaro Marcos-Ramiro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Navab_N/0/1/0/all/0/1">Nassir Navab</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tombari_F/0/1/0/all/0/1">Federico Tombari</a></p>
187
188 <p>Panoptic segmentation has recently unified semantic and instance
189 segmentation, previously addressed separately, thus taking a step further
190 towards creating more comprehensive and efficient perception systems. In this
191 paper, we present Panoster, a novel proposal-free panoptic segmentation method
192 for point clouds. Unlike previous approaches relying on several steps to group
193 pixels or points into objects, Panoster proposes a simplified framework
194 incorporating a learning-based clustering solution to identify instances. At
195 inference time, this acts as a class-agnostic semantic segmentation, allowing
196 Panoster to be fast, while outperforming prior methods in terms of accuracy.
197 Additionally, we showcase how our approach can be flexibly and effectively
198 applied on diverse existing semantic architectures to deliver panoptic
199 predictions.
200 </p>
201 </description>
202 <guid isPermaLink="false">oai:arXiv.org:2010.15157</guid>
203 </item>
204 <item>
205 <title>CNN Profiler on Polar Coordinate Images for Tropical Cyclone Structure Analysis. (arXiv:2010.15158v1 [cs.CV])</title>
206 <link>http://fr.arxiv.org/abs/2010.15158</link>
207 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1">Boyo Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1">Buo-Fu Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hsiao_C/0/1/0/all/0/1">Chun-Min Hsiao</a></p>
208
209 <p>Convolutional neural networks (CNN) have achieved great success in analyzing
210 tropical cyclones (TC) with satellite images in several tasks, such as TC
211 intensity estimation. In contrast, TC structure, which is conventionally
212 described by a few parameters estimated subjectively by meteorology
213 specialists, is still hard to be profiled objectively and routinely. This study
214 applies CNN on satellite images to create the entire TC structure profiles,
215 covering all the structural parameters. By utilizing the meteorological domain
216 knowledge to construct TC wind profiles based on historical structure
217 parameters, we provide valuable labels for training in our newly released
218 benchmark dataset. With such a dataset, we hope to attract more attention to
219 this crucial issue among data scientists. Meanwhile, a baseline is established
220 with a specialized convolutional model operating on polar-coordinates. We
221 discovered that it is more feasible and physically reasonable to extract
222 structural information on polar-coordinates, instead of Cartesian coordinates,
223 according to a TC's rotational and spiral natures. Experimental results on the
224 released benchmark dataset verified the robustness of the proposed model and
225 demonstrated the potential for applying deep learning techniques for this
226 barely developed yet important topic.
227 </p>
228 </description>
229 <guid isPermaLink="false">oai:arXiv.org:2010.15158</guid>
230 </item>
231 <item>
232 <title>Sizeless: Predicting the optimal size of serverless functions. (arXiv:2010.15162v1 [cs.DC])</title>
233 <link>http://fr.arxiv.org/abs/2010.15162</link>
234 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Eismann_S/0/1/0/all/0/1">Simon Eismann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bui_L/0/1/0/all/0/1">Long Bui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Grohmann_J/0/1/0/all/0/1">Johannes Grohmann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Abad_C/0/1/0/all/0/1">Cristina L. Abad</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Herbst_N/0/1/0/all/0/1">Nikolas Herbst</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kounev_S/0/1/0/all/0/1">Samuel Kounev</a></p>
235
236 <p>Serverless functions are a cloud computing paradigm that reduces operational
237 overheads for developers, because the cloud provider takes care of resource
238 management tasks such as resource provisioning, deployment, and auto-scaling.
239 The only resource management task that developers are still in charge of is
240 resource sizing, that is, selecting how much resources are allocated to each
241 worker instance. However, due to the challenging nature of resource sizing,
242 developers often neglect it despite its significant cost and performance
243 benefits. Existing approaches aiming to automate serverless functions resource
244 sizing require dedicated performance tests, which are time consuming to
245 implement and maintain.
246 </p>
247 <p>In this paper, we introduce Sizeless -- an approach to predict the optimal
248 resource size of a serverless function using monitoring data from a single
249 resource size. As our approach requires only production monitoring data,
250 developers no longer need to implement and maintain representative performance
251 tests. Furthermore, it enables cloud providers, which cannot engage in testing
252 the performance of user functions, to implement resource sizing on a platform
253 level and automate the last resource management task associated with serverless
254 functions. In our evaluation, Sizeless was able to predict the execution time
255 of the serverless functions of a realistic server-less application with a
256 median prediction accuracy of 93.1%. Using Sizeless to optimize the memory size
257 of this application results in a speedup of 16.7% while simultaneously
258 decreasing costs by 2.5%.
259 </p>
260 </description>
261 <guid isPermaLink="false">oai:arXiv.org:2010.15162</guid>
262 </item>
263 <item>
264 <title>Polymer Informatics with Multi-Task Learning. (arXiv:2010.15166v1 [cond-mat.mtrl-sci])</title>
265 <link>http://fr.arxiv.org/abs/2010.15166</link>
266 <description><p>Authors: <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Kunneth_C/0/1/0/all/0/1">Christopher K&#xfc;nneth</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Rajan_A/0/1/0/all/0/1">Arunkumar Chitteth Rajan</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Tran_H/0/1/0/all/0/1">Huan Tran</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Chen_L/0/1/0/all/0/1">Lihua Chen</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Kim_C/0/1/0/all/0/1">Chiho Kim</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Ramprasad_R/0/1/0/all/0/1">Rampi Ramprasad</a></p>
267
268 <p>Modern data-driven tools are transforming application-specific polymer
269 development cycles. Surrogate models that can be trained to predict the
270 properties of new polymers are becoming commonplace. Nevertheless, these models
271 do not utilize the full breadth of the knowledge available in datasets, which
272 are oftentimes sparse; inherent correlations between different property
273 datasets are disregarded. Here, we demonstrate the potency of multi-task
274 learning approaches that exploit such inherent correlations effectively,
275 particularly when some property dataset sizes are small. Data pertaining to 36
276 different properties of over $13, 000$ polymers (corresponding to over $23,000$
277 data points) are coalesced and supplied to deep-learning multi-task
278 architectures. Compared to conventional single-task learning models (that are
279 trained on individual property datasets independently), the multi-task approach
280 is accurate, efficient, scalable, and amenable to transfer learning as more
281 data on the same or different properties become available. Moreover, these
282 models are interpretable. Chemical rules, that explain how certain features
283 control trends in specific property values, emerge from the present work,
284 paving the way for the rational design of application specific polymers meeting
285 desired property or performance objectives.
286 </p>
287 </description>
288 <guid isPermaLink="false">oai:arXiv.org:2010.15166</guid>
289 </item>
290 <item>
291 <title>Semi-Grant-Free NOMA: Ergodic Rates Analysis with Random Deployed Users. (arXiv:2010.15169v1 [cs.IT])</title>
292 <link>http://fr.arxiv.org/abs/2010.15169</link>
293 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1">Chao Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1">Yuanwei Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yi_W/0/1/0/all/0/1">Wenqiang Yi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_Z/0/1/0/all/0/1">Zhijin Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_Z/0/1/0/all/0/1">Zhiguo Ding</a></p>
294
295 <p>Semi-grant-free (Semi-GF) non-orthogonal multiple access (NOMA) enables
296 grant-free (GF) and grant-based (GB) users to share the same resource blocks,
297 thereby balancing the connectivity and stability of communications. This letter
298 analyzes ergodic rates of Semi-GF NOMA systems. First, this paper exploits a
299 Semi-GF protocol, denoted as dynamic protocol, for selecting GF users into the
300 occupied GB channels via the GB user's instantaneous received power. Under this
301 protocol, the closed-form analytical and approximated expressions for ergodic
302 rates are derived. The numerical results illustrate that the GF user (weak NOMA
303 user) has a performance upper limit, while the ergodic rate of the GB user
304 (strong NOMA user) increases linearly versus the transmit signal-to-noise
305 ratio.
306 </p>
307 </description>
308 <guid isPermaLink="false">oai:arXiv.org:2010.15169</guid>
309 </item>
310 <item>
311 <title>Slicing a single wireless collision channel among throughput- and timeliness-sensitive services. (arXiv:2010.15171v1 [cs.IT])</title>
312 <link>http://fr.arxiv.org/abs/2010.15171</link>
313 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Leyva_Mayorga_I/0/1/0/all/0/1">Israel Leyva-Mayorga</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chiariotti_F/0/1/0/all/0/1">Federico Chiariotti</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stefanovic_C/0/1/0/all/0/1">&#x10c;edomir Stefanovi&#x107;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kalor_A/0/1/0/all/0/1">Anders E. Kal&#xf8;r</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Popovski_P/0/1/0/all/0/1">Petar Popovski</a></p>
314
315 <p>The fifth generation (5G) wireless system has a platform-driven approach,
316 aiming to support heterogeneous connections with very diverse requirements. The
317 shared wireless resources should be sliced in a way that each user perceives
318 that its requirement has been met. Heterogeneity challenges the traditional
319 notion of resource efficiency, as the resource usage has cater for, e.g. rate
320 maximization for one user and timeliness requirement for another user. This
321 paper treats a model for radio access network (RAN) uplink, where a
322 throughput-demanding broadband user shares wireless resources with an
323 intermittently active user that wants to optimize the timeliness, expressed in
324 terms of latency-reliability or Age of Information (AoI). We evaluate the
325 trade-offs between throughput and timeliness for Orthogonal Multiple Access
326 (OMA) as well as Non-Orthogonal Multiple Access (NOMA) with successive
327 interference cancellation (SIC). We observe that NOMA with SIC, in a
328 conservative scenario with destructive collisions, is just slightly inferior to
329 that of OMA, which indicates that it may offer significant benefits in
330 practical deployments where the capture effect is frequently encountered. On
331 the other hand, finding the optimal configuration of NOMA with SIC depends on
332 the activity pattern of the intermittent user, to which OMA is insensitive.
333 </p>
334 </description>
335 <guid isPermaLink="false">oai:arXiv.org:2010.15171</guid>
336 </item>
337 <item>
338 <title>Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement. (arXiv:2010.15174v1 [cs.SD])</title>
339 <link>http://fr.arxiv.org/abs/2010.15174</link>
340 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hsieh_T/0/1/0/all/0/1">Tsun-An Hsieh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_C/0/1/0/all/0/1">Cheng Yu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_S/0/1/0/all/0/1">Szu-Wei Fu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lu_X/0/1/0/all/0/1">Xugang Lu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tsao_Y/0/1/0/all/0/1">Yu Tsao</a></p>
341
342 <p>Speech enhancement (SE) aims to improve speech quality and intelligibility,
343 which are both related to a smooth transition in speech segments that may carry
344 linguistic information, e.g. phones and syllables. In this study, we took
345 phonetic characteristics into account in the SE training process. Hence, we
346 designed a phone-fortified perceptual (PFP) loss, and the training of our SE
347 model was guided by PFP loss. In PFP loss, phonetic characteristics are
348 extracted by wav2vec, an unsupervised learning model based on the contrastive
349 predictive coding (CPC) criterion. Different from previous deep-feature-based
350 approaches, the proposed approach explicitly uses the phonetic information in
351 the deep feature extraction process to guide the SE model training. To test the
352 proposed approach, we first confirmed that the wav2vec representations carried
353 clear phonetic information using a t-distributed stochastic neighbor embedding
354 (t-SNE) analysis. Next, we observed that the proposed PFP loss was more
355 strongly correlated with the perceptual evaluation metrics than point-wise and
356 signal-level losses, thus achieving higher scores for standardized quality and
357 intelligibility evaluation metrics in the Voice Bank--DEMAND dataset.
358 </p>
359 </description>
360 <guid isPermaLink="false">oai:arXiv.org:2010.15174</guid>
361 </item>
362 <item>
363 <title>A Study on Efficiency in Continual Learning Inspired by Human Learning. (arXiv:2010.15187v1 [cs.LG])</title>
364 <link>http://fr.arxiv.org/abs/2010.15187</link>
365 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ball_P/0/1/0/all/0/1">Philip J. Ball</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yingzhen Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lamb_A/0/1/0/all/0/1">Angus Lamb</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1">Cheng Zhang</a></p>
366
367 <p>Humans are efficient continual learning systems; we continually learn new
368 skills from birth with finite cells and resources. Our learning is highly
369 optimized both in terms of capacity and time while not suffering from
370 catastrophic forgetting. In this work we study the efficiency of continual
371 learning systems, taking inspiration from human learning. In particular,
372 inspired by the mechanisms of sleep, we evaluate popular pruning-based
373 continual learning algorithms, using PackNet as a case study. First, we
374 identify that weight freezing, which is used in continual learning without
375 biological justification, can result in over $2\times$ as many weights being
376 used for a given level of performance. Secondly, we note the similarity in
377 human day and night time behaviors to the training and pruning phases
378 respectively of PackNet. We study a setting where the pruning phase is given a
379 time budget, and identify connections between iterative pruning and multiple
380 sleep cycles in humans. We show there exists an optimal choice of iteration
381 v.s. epochs given different tasks.
382 </p>
383 </description>
384 <guid isPermaLink="false">oai:arXiv.org:2010.15187</guid>
385 </item>
386 <item>
387 <title>Explicit stabilized multirate method for stiff stochastic differential equations. (arXiv:2010.15193v1 [math.NA])</title>
388 <link>http://fr.arxiv.org/abs/2010.15193</link>
389 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Abdulle_A/0/1/0/all/0/1">Assyr Abdulle</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Souza_G/0/1/0/all/0/1">Giacomo Rosilho de Souza</a></p>
390
391 <p>Stabilized explicit methods are particularly efficient for large systems of
392 stiff stochastic differential equations (SDEs) due to their extended stability
393 domain. However, they loose their efficiency when a severe stiffness is induced
394 by very few "fast" degrees of freedom, as the stiff and nonstiff terms are
395 evaluated concurrently. Therefore, inspired by [A. Abdulle, M. J. Grote, and G.
396 Rosilho de Souza, Preprint (2020), <a href="/abs/2006.00744">arXiv:2006.00744</a>] we introduce a stochastic
397 modified equation whose stiffness depends solely on the "slow" terms. By
398 integrating this modified equation with a stabilized explicit scheme we devise
399 a multirate method which overcomes the bottleneck caused by a few severely
400 stiff terms and recovers the efficiency of stabilized schemes for large systems
401 of nonlinear SDEs. The scheme is not based on any scale separation assumption
402 of the SDE and therefore it is employable for problems stemming from the
403 spatial discretization of stochastic parabolic partial differential equations
404 on locally refined grids. The multirate scheme has strong order 1/2, weak order
405 1 and its stability is proved on a model problem. Numerical experiments confirm
406 the efficiency and accuracy of the scheme.
407 </p>
408 </description>
409 <guid isPermaLink="false">oai:arXiv.org:2010.15193</guid>
410 </item>
411 <item>
412 <title>Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments. (arXiv:2010.15195v1 [cs.LG])</title>
413 <link>http://fr.arxiv.org/abs/2010.15195</link>
414 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Carvalho_W/0/1/0/all/0/1">Wilka Carvalho</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liang_A/0/1/0/all/0/1">Anthony Liang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_K/0/1/0/all/0/1">Kimin Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sohn_S/0/1/0/all/0/1">Sungryull Sohn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_H/0/1/0/all/0/1">Honglak Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lewis_R/0/1/0/all/0/1">Richard L. Lewis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_S/0/1/0/all/0/1">Satinder Singh</a></p>
415
416 <p>First-person object-interaction tasks in high-fidelity, 3D, simulated
417 environments such as the AI2Thor virtual home-environment pose significant
418 sample-efficiency challenges for reinforcement learning (RL) agents learning
419 from sparse task rewards. To alleviate these challenges, prior work has
420 provided extensive supervision via a combination of reward-shaping,
421 ground-truth object-information, and expert demonstrations. In this work, we
422 show that one can learn object-interaction tasks from scratch without
423 supervision by learning an attentive object-model as an auxiliary task during
424 task learning with an object-centric relational RL agent. Our key insight is
425 that learning an object-model that incorporates object-attention into forward
426 prediction provides a dense learning signal for unsupervised representation
427 learning of both objects and their relationships. This, in turn, enables faster
428 policy learning for an object-centric relational RL agent. We demonstrate our
429 agent by introducing a set of challenging object-interaction tasks in the
430 AI2Thor environment where learning with our attentive object-model is key to
431 strong performance. Specifically, we compare our agent and relational RL agents
432 with alternative auxiliary tasks to a relational RL agent equipped with
433 ground-truth object-information, and show that learning with our object-model
434 best closes the performance gap in terms of both learning speed and maximum
435 success rate. Additionally, we find that incorporating object-attention into an
436 object-model's forward predictions is key to learning representations which
437 capture object-category and object-state.
438 </p>
439 </description>
440 <guid isPermaLink="false">oai:arXiv.org:2010.15195</guid>
441 </item>
442 <item>
443 <title>A fast and scalable computational framework for large-scale and high-dimensional Bayesian optimal experimental design. (arXiv:2010.15196v1 [math.NA])</title>
444 <link>http://fr.arxiv.org/abs/2010.15196</link>
445 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Wu_K/0/1/0/all/0/1">Keyi Wu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Chen_P/0/1/0/all/0/1">Peng Chen</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ghattas_O/0/1/0/all/0/1">Omar Ghattas</a></p>
446
447 <p>We develop a fast and scalable computational framework to solve large-scale
448 and high-dimensional Bayesian optimal experimental design problems. In
449 particular, we consider the problem of optimal observation sensor placement for
450 Bayesian inference of high-dimensional parameters governed by partial
451 differential equations (PDEs), which is formulated as an optimization problem
452 that seeks to maximize an expected information gain (EIG). Such optimization
453 problems are particularly challenging due to the curse of dimensionality for
454 high-dimensional parameters and the expensive solution of large-scale PDEs. To
455 address these challenges, we exploit two essential properties of such problems:
456 the low-rank structure of the Jacobian of the parameter-to-observable map to
457 extract the intrinsically low-dimensional data-informed subspace, and the high
458 correlation of the approximate EIGs by a series of approximations to reduce the
459 number of PDE solves. We propose an efficient offline-online decomposition for
460 the optimization problem: an offline stage of computing all the quantities that
461 require a limited number of PDE solves independent of parameter and data
462 dimensions, and an online stage of optimizing sensor placement that does not
463 require any PDE solve. For the online optimization, we propose a swapping
464 greedy algorithm that first construct an initial set of sensors using leverage
465 scores and then swap the chosen sensors with other candidates until certain
466 convergence criteria are met. We demonstrate the efficiency and scalability of
467 the proposed computational framework by a linear inverse problem of inferring
468 the initial condition for an advection-diffusion equation, and a nonlinear
469 inverse problem of inferring the diffusion coefficient of a log-normal
470 diffusion equation, with both the parameter and data dimensions ranging from a
471 few tens to a few thousands.
472 </p>
473 </description>
474 <guid isPermaLink="false">oai:arXiv.org:2010.15196</guid>
475 </item>
476 <item>
477 <title>Forecasting Hamiltonian dynamics without canonical coordinates. (arXiv:2010.15201v1 [cs.LG])</title>
478 <link>http://fr.arxiv.org/abs/2010.15201</link>
479 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Choudhary_A/0/1/0/all/0/1">Anshul Choudhary</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lindner_J/0/1/0/all/0/1">John F. Lindner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Holliday_E/0/1/0/all/0/1">Elliott G. Holliday</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miller_S/0/1/0/all/0/1">Scott T. Miller</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sinha_S/0/1/0/all/0/1">Sudeshna Sinha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ditto_W/0/1/0/all/0/1">William L. Ditto</a></p>
480
481 <p>Conventional neural networks are universal function approximators, but
482 because they are unaware of underlying symmetries or physical laws, they may
483 need impractically many training data to approximate nonlinear dynamics.
484 Recently introduced Hamiltonian neural networks can efficiently learn and
485 forecast dynamical systems that conserve energy, but they require special
486 inputs called canonical coordinates, which may be hard to infer from data. Here
487 we significantly expand the scope of such networks by demonstrating a simple
488 way to train them with any set of generalised coordinates, including easily
489 observable ones.
490 </p>
491 </description>
492 <guid isPermaLink="false">oai:arXiv.org:2010.15201</guid>
493 </item>
494 <item>
495 <title>Micromobility in Smart Cities: A Closer Look at Shared Dockless E-Scooters via Big Social Data. (arXiv:2010.15203v1 [cs.SI])</title>
496 <link>http://fr.arxiv.org/abs/2010.15203</link>
497 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_Y/0/1/0/all/0/1">Yunhe Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhong_D/0/1/0/all/0/1">Dong Zhong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sun_P/0/1/0/all/0/1">Peng Sun</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zheng_W/0/1/0/all/0/1">Weijian Zheng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_Q/0/1/0/all/0/1">Qinglei Cao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Luo_X/0/1/0/all/0/1">Xi Luo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lu_Z/0/1/0/all/0/1">Zheng Lu</a></p>
498
499 <p>The micromobility is shaping first- and last-mile travels in urban areas.
500 Recently, shared dockless electric scooters (e-scooters) have emerged as a
501 daily alternative to driving for short-distance commuters in large cities due
502 to the affordability, easy accessibility via an app, and zero emissions.
503 Meanwhile, e-scooters come with challenges in city management, such as traffic
504 rules, public safety, parking regulations, and liability issues. In this paper,
505 we collected and investigated 5.8 million scooter-tagged tweets and 144,197
506 images, generated by 2.7 million users from October 2018 to March 2020, to take
507 a closer look at shared e-scooters via crowdsourcing data analytics. We
508 profiled e-scooter usages from spatial-temporal perspectives, explored
509 different business roles (i.e., riders, gig workers, and ridesharing
510 companies), examined operation patterns (e.g., injury types, and parking
511 behaviors), and conducted sentiment analysis. To our best knowledge, this paper
512 is the first large-scale systematic study on shared e-scooters using big social
513 data.
514 </p>
515 </description>
516 <guid isPermaLink="false">oai:arXiv.org:2010.15203</guid>
517 </item>
518 <item>
519 <title>Rosella: A Self-Driving Distributed Scheduler for Heterogeneous Clusters. (arXiv:2010.15206v1 [cs.DC])</title>
520 <link>http://fr.arxiv.org/abs/2010.15206</link>
521 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Q/0/1/0/all/0/1">Qiong Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Manandhar_S/0/1/0/all/0/1">Sunil Manandhar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Z/0/1/0/all/0/1">Zhenming Liu</a></p>
522
523 <p>Large-scale interactive web services and advanced AI applications make
524 sophisticated decisions in real-time, based on executing a massive amount of
525 computation tasks on thousands of servers. Task schedulers, which often operate
526 in heterogeneous and volatile environments, require high throughput, i.e.,
527 scheduling millions of tasks per second, and low latency, i.e., incurring
528 minimal scheduling delays for millisecond-level tasks. Scheduling is further
529 complicated by other users' workloads in a shared system, other background
530 activities, and the diverse hardware configurations inside datacenters.
531 </p>
532 <p>We present Rosella, a new self-driving, distributed approach for task
533 scheduling in heterogeneous clusters. Our system automatically learns the
534 compute environment and adjust its scheduling policy in real-time. The solution
535 provides high throughput and low latency simultaneously, because it runs in
536 parallel on multiple machines with minimum coordination and only performs
537 simple operations for each scheduling decision. Our learning module monitors
538 total system load, and uses the information to dynamically determine optimal
539 estimation strategy for the backends' compute-power. Our scheduling policy
540 generalizes power-of-two-choice algorithms to handle heterogeneous workers,
541 reducing the max queue length of $O(\log n)$ obtained by prior algorithms to
542 $O(\log \log n)$. We implement a Rosella prototype and evaluate it with a
543 variety of workloads. Experimental results show that Rosella significantly
544 reduces task response times, and adapts to environment changes quickly.
545 </p>
546 </description>
547 <guid isPermaLink="false">oai:arXiv.org:2010.15206</guid>
548 </item>
549 <item>
550 <title>Ground Roll Suppression using Convolutional Neural Networks. (arXiv:2010.15209v1 [eess.IV])</title>
551 <link>http://fr.arxiv.org/abs/2010.15209</link>
552 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Oliveira_D/0/1/0/all/0/1">Dario Augusto Borges Oliveira</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Semin_D/0/1/0/all/0/1">Daniil Semin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zaytsev_S/0/1/0/all/0/1">Semen Zaytsev</a></p>
553
554 <p>Seismic data processing plays a major role in seismic exploration as it
555 conditions much of the seismic interpretation performance. In this context,
556 generating reliable post-stack seismic data depends also on disposing of an
557 efficient pre-stack noise attenuation tool. Here we tackle ground roll noise,
558 one of the most challenging and common noises observed in pre-stack seismic
559 data. Since ground roll is characterized by relative low frequencies and high
560 amplitudes, most commonly used approaches for its suppression are based on
561 frequency-amplitude filters for ground roll characteristic bands. However, when
562 signal and noise share the same frequency ranges, these methods usually deliver
563 also signal suppression or residual noise. In this paper we take advantage of
564 the highly non-linear features of convolutional neural networks, and propose to
565 use different architectures to detect ground roll in shot gathers and
566 ultimately to suppress them using conditional generative adversarial networks.
567 Additionally, we propose metrics to evaluate ground roll suppression, and
568 report strong results compared to expert filtering. Finally, we discuss
569 generalization of trained models for similar and different geologies to better
570 understand the feasibility of our proposal in real applications.
571 </p>
572 </description>
573 <guid isPermaLink="false">oai:arXiv.org:2010.15209</guid>
574 </item>
575 <item>
576 <title>On Linearizability and the Termination of Randomized Algorithms. (arXiv:2010.15210v1 [cs.DC])</title>
577 <link>http://fr.arxiv.org/abs/2010.15210</link>
578 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hadzilacos_V/0/1/0/all/0/1">Vassos Hadzilacos</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_X/0/1/0/all/0/1">Xing Hu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Toueg_S/0/1/0/all/0/1">Sam Toueg</a></p>
579
580 <p>We study the question of whether the "termination with probability 1"
581 property of a randomized algorithm is preserved when one replaces the atomic
582 registers that the algorithm uses with linearizable (implementations of)
583 registers. We show that in general this is not so: roughly speaking, every
584 randomized algorithm A has a corresponding algorithm A' that solves the same
585 problem if the registers that it uses are atomic or strongly-linearizable, but
586 does not terminate if these registers are replaced with "merely" linearizable
587 ones. Together with a previous result shown in [15], this implies that one
588 cannot use the well-known ABD implementation of registers in message-passing
589 systems to automatically transform any randomized algorithm that works in
590 shared-memory systems into a randomized algorithm that works in message-passing
591 systems: with a strong adversary the resulting algorithm may not terminate.
592 </p>
593 </description>
594 <guid isPermaLink="false">oai:arXiv.org:2010.15210</guid>
595 </item>
596 <item>
597 <title>Safety-Aware Cascade Controller Tuning Using Constrained Bayesian Optimization. (arXiv:2010.15211v1 [eess.SY])</title>
598 <link>http://fr.arxiv.org/abs/2010.15211</link>
599 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Konig_C/0/1/0/all/0/1">Christopher K&#xf6;nig</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Khosravi_M/0/1/0/all/0/1">Mohammad Khosravi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Maier_M/0/1/0/all/0/1">Markus Maier</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Smith_R/0/1/0/all/0/1">Roy S. Smith</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Rupenyan_A/0/1/0/all/0/1">Alisa Rupenyan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lygeros_J/0/1/0/all/0/1">John Lygeros</a></p>
600
601 <p>This paper presents an automated, model-free, data-driven method for the safe
602 tuning of PID cascade controller gains based on Bayesian optimization. The
603 optimization objective is composed of data-driven performance metrics and
604 modeled using Gaussian processes. We further introduce a data-driven constraint
605 that captures the stability requirements from system data. Numerical evaluation
606 shows that the proposed approach outperforms relay feedback autotuning and
607 quickly converges to the global optimum, thanks to a tailored stopping
608 criterion. We demonstrate the performance of the method in simulations and
609 experiments on a linear axis drive of a grinding machine. For experimental
610 implementation, in addition to the introduced safety constraint, we integrate a
611 method for automatic detection of the critical gains and extend the
612 optimization objective with a penalty depending on the proximity of the current
613 candidate points to the critical gains. The resulting automated tuning method
614 optimizes system performance while ensuring stability and standardization.
615 </p>
616 </description>
617 <guid isPermaLink="false">oai:arXiv.org:2010.15211</guid>
618 </item>
619 <item>
620 <title>Away from Trolley Problems and Toward Risk Management. (arXiv:2010.15217v1 [cs.CY])</title>
621 <link>http://fr.arxiv.org/abs/2010.15217</link>
622 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goodall_N/0/1/0/all/0/1">Noah J. Goodall</a></p>
623
624 <p>As automated vehicles receive more attention from the media, there has been
625 an equivalent increase in the coverage of the ethical choices a vehicle may be
626 forced to make in certain crash situations with no clear safe outcome. Much of
627 this coverage has focused on a philosophical thought experiment known as the
628 "trolley problem," and substituting an automated vehicle for the trolley and
629 the car's software for the bystander. While this is a stark and straightforward
630 example of ethical decision making for an automated vehicle, it risks
631 marginalizing the entire field if it is to become the only ethical problem in
632 the public's mind. In this chapter, I discuss the shortcomings of the trolley
633 problem, and introduce more nuanced examples that involve crash risk and
634 uncertainty. Risk management is introduced as an alternative approach, and its
635 ethical dimensions are discussed.
636 </p>
637 </description>
638 <guid isPermaLink="false">oai:arXiv.org:2010.15217</guid>
639 </item>
640 <item>
641 <title>StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems. (arXiv:2010.15218v1 [cs.DC])</title>
642 <link>http://fr.arxiv.org/abs/2010.15218</link>
643 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Licht_J/0/1/0/all/0/1">Johannes de Fine Licht</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kuster_A/0/1/0/all/0/1">Andreas Kuster</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Matteis_T/0/1/0/all/0/1">Tiziano De Matteis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ben_Nun_T/0/1/0/all/0/1">Tal Ben-Nun</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hofer_D/0/1/0/all/0/1">Dominic Hofer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hoefler_T/0/1/0/all/0/1">Torsten Hoefler</a></p>
644
645 <p>Spatial computing devices have been shown to significantly accelerate stencil
646 computations, but have so far relied on unrolling the iterative dimension of a
647 single stencil operation to increase temporal locality. This work considers the
648 general case of mapping directed acyclic graphs of heterogeneous stencil
649 computations to spatial computing systems, assuming large input programs
650 without an iterative component. StencilFlow maximizes temporal locality and
651 ensures deadlock freedom in this setting, providing end-to-end analysis and
652 mapping from a high-level program description to distributed hardware. We
653 evaluate the generated architectures on an FPGA testbed, demonstrating the
654 highest single-device and multi-device performance recorded for stencil
655 programs on FPGAs to date, then leverage the framework to study a complex
656 stencil program from a production weather simulation application. Our work
657 enables productively targeting distributed spatial computing systems with large
658 stencil programs, and offers insight into architecture characteristics required
659 for their efficient execution in practice.
660 </p>
661 </description>
662 <guid isPermaLink="false">oai:arXiv.org:2010.15218</guid>
663 </item>
664 <item>
665 <title>Geometric Sampling of Networks. (arXiv:2010.15221v1 [math.DG])</title>
666 <link>http://fr.arxiv.org/abs/2010.15221</link>
667 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Barkanass_V/0/1/0/all/0/1">Vladislav Barkanass</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Jost_J/0/1/0/all/0/1">J&#xfc;rgen Jost</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Saucan_E/0/1/0/all/0/1">Emil Saucan</a></p>
668
669 <p>Motivated by the methods and results of manifold sampling based on Ricci
670 curvature, we propose a similar approach for networks. To this end we make
671 appeal to three types of discrete curvature, namely the graph Forman-, full
672 Forman- and Haantjes-Ricci curvatures for edge-based and node-based sampling.
673 We present the results of experiments on real life networks, as well as for
674 square grids arising in Image Processing. Moreover, we consider fitting Ricci
675 flows and we employ them for the detection of networks' backbone. We also
676 develop embedding kernels related to the Forman-Ricci curvatures and employ
677 them for the detection of the coarse structure of networks, as well as for
678 network visualization with applications to SVM. The relation between the Ricci
679 curvature of the original manifold and that of a Ricci curvature driven
680 discretization is also studied.
681 </p>
682 </description>
683 <guid isPermaLink="false">oai:arXiv.org:2010.15221</guid>
684 </item>
685 <item>
686 <title>Exploring complex networks with the ICON R package. (arXiv:2010.15222v1 [cs.SI])</title>
687 <link>http://fr.arxiv.org/abs/2010.15222</link>
688 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wadhwa_R/0/1/0/all/0/1">Raoul R. Wadhwa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Scott_J/0/1/0/all/0/1">Jacob G. Scott</a></p>
689
690 <p>We introduce ICON, an R package that contains 1075 complex network datasets
691 in a standard edgelist format. All provided datasets have associated citations
692 and have been indexed by the Colorado Index of Complex Networks - also referred
693 to as ICON. In addition to supplying a large and diverse corpus of useful
694 real-world networks, ICON also implements an S3 generic to work with the
695 network and ggnetwork R packages for network analysis and visualization,
696 respectively. Sample code in this report also demonstrates how ICON can be used
697 in conjunction with the igraph package. Currently, the Comprehensive R Archive
698 Network hosts ICON v0.4.0. We hope that ICON will serve as a standard corpus
699 for complex network research and prevent redundant work that would be otherwise
700 necessary by individual research groups. The open source code for ICON and for
701 this reproducible report can be found at https://github.com/rrrlw/ICON.
702 </p>
703 </description>
704 <guid isPermaLink="false">oai:arXiv.org:2010.15222</guid>
705 </item>
706 <item>
707 <title>A Visuospatial Dataset for Naturalistic Verb Learning. (arXiv:2010.15225v1 [cs.CL])</title>
708 <link>http://fr.arxiv.org/abs/2010.15225</link>
709 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ebert_D/0/1/0/all/0/1">Dylan Ebert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pavlick_E/0/1/0/all/0/1">Ellie Pavlick</a></p>
710
711 <p>We introduce a new dataset for training and evaluating grounded language
712 models. Our data is collected within a virtual reality environment and is
713 designed to emulate the quality of language data to which a pre-verbal child is
714 likely to have access: That is, naturalistic, spontaneous speech paired with
715 richly grounded visuospatial context. We use the collected data to compare
716 several distributional semantics models for verb learning. We evaluate neural
717 models based on 2D (pixel) features as well as feature-engineered models based
718 on 3D (symbolic, spatial) features, and show that neither modeling approach
719 achieves satisfactory performance. Our results are consistent with evidence
720 from child language acquisition that emphasizes the difficulty of learning
721 verbs from naive distributional data. We discuss avenues for future work on
722 cognitively-inspired grounded language learning, and release our corpus with
723 the intent of facilitating research on the topic.
724 </p>
725 </description>
726 <guid isPermaLink="false">oai:arXiv.org:2010.15225</guid>
727 </item>
728 <item>
729 <title>Speech-Based Emotion Recognition using Neural Networks and Information Visualization. (arXiv:2010.15229v1 [cs.HC])</title>
730 <link>http://fr.arxiv.org/abs/2010.15229</link>
731 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Almahmoud_J/0/1/0/all/0/1">Jumana Almahmoud</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kikkeri_K/0/1/0/all/0/1">Kruthika Kikkeri</a></p>
732
733 <p>Emotions recognition is commonly employed for health assessment. However, the
734 typical metric for evaluation in therapy is based on patient-doctor appraisal.
735 This process can fall into the issue of subjectivity, while also requiring
736 healthcare professionals to deal with copious amounts of information. Thus,
737 machine learning algorithms can be a useful tool for the classification of
738 emotions. While several models have been developed in this domain, there is a
739 lack of userfriendly representations of the emotion classification systems for
740 therapy. We propose a tool which enables users to take speech samples and
741 identify a range of emotions (happy, sad, angry, surprised, neutral, clam,
742 disgust, and fear) from audio elements through a machine learning model. The
743 dashboard is designed based on local therapists' needs for intuitive
744 representations of speech data in order to gain insights and informative
745 analyses of their sessions with their patients.
746 </p>
747 </description>
748 <guid isPermaLink="false">oai:arXiv.org:2010.15229</guid>
749 </item>
750 <item>
751 <title>Construction Payment Automation Using Blockchain-Enabled Smart Contracts and Reality Capture Technologies. (arXiv:2010.15232v1 [cs.CR])</title>
752 <link>http://fr.arxiv.org/abs/2010.15232</link>
753 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hamledari_H/0/1/0/all/0/1">Hesam Hamledari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fischer_M/0/1/0/all/0/1">Martin Fischer</a></p>
754
755 <p>This paper presents a smart contract-based solution for autonomous
756 administration of construction progress payments. It bridges the gap between
757 payments (cash flow) and the progress assessments at job sites (product flow)
758 enabled by reality capture technologies and building information modeling
759 (BIM). The approach eliminates the reliance on the centralized and heavily
760 intermediated mechanisms of existing payment applications. The construction
761 progress is stored in a distributed manner using content addressable file
762 sharing; it is broadcasted to a smart contract which automates the on-chain
763 payment settlements and the transfer of lien rights. The method was
764 successfully used for processing payments to 7 subcontractors in two commercial
765 construction projects where progress monitoring was performed using a
766 camera-equipped unmanned aerial vehicle (UAV) and an unmanned ground vehicle
767 (UGV) equipped with a laser scanner. The results show promise for the method's
768 potential for increasing the frequency, granularity, and transparency of
769 payments. The paper is concluded with a discussion of implications for project
770 management, introducing a new model of project as a singleton state machine.
771 </p>
772 </description>
773 <guid isPermaLink="false">oai:arXiv.org:2010.15232</guid>
774 </item>
775 <item>
776 <title>Accurate Prostate Cancer Detection and Segmentation on Biparametric MRI using Non-local Mask R-CNN with Histopathological Ground Truth. (arXiv:2010.15233v1 [eess.IV])</title>
777 <link>http://fr.arxiv.org/abs/2010.15233</link>
778 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Dai_Z/0/1/0/all/0/1">Zhenzhen Dai</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jambor_I/0/1/0/all/0/1">Ivan Jambor</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Taimen_P/0/1/0/all/0/1">Pekka Taimen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pantelic_M/0/1/0/all/0/1">Milan Pantelic</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Elshaikh_M/0/1/0/all/0/1">Mohamed Elshaikh</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Rogers_C/0/1/0/all/0/1">Craig Rogers</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ettala_O/0/1/0/all/0/1">Otto Ettala</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bostrom_P/0/1/0/all/0/1">Peter Bostr&#xf6;m</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Aronen_H/0/1/0/all/0/1">Hannu Aronen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Merisaari_H/0/1/0/all/0/1">Harri Merisaari</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wen_N/0/1/0/all/0/1">Ning Wen</a></p>
779
780 <p>Purpose: We aimed to develop deep machine learning (DL) models to improve the
781 detection and segmentation of intraprostatic lesions (IL) on bp-MRI by using
782 whole amount prostatectomy specimen-based delineations. We also aimed to
783 investigate whether transfer learning and self-training would improve results
784 with small amount labelled data.
785 </p>
786 <p>Methods: 158 patients had suspicious lesions delineated on MRI based on
787 bp-MRI, 64 patients had ILs delineated on MRI based on whole mount
788 prostatectomy specimen sections, 40 patients were unlabelled. A non-local Mask
789 R-CNN was proposed to improve the segmentation accuracy. Transfer learning was
790 investigated by fine-tuning a model trained using MRI-based delineations with
791 prostatectomy-based delineations. Two label selection strategies were
792 investigated in self-training. The performance of models was evaluated by 3D
793 detection rate, dice similarity coefficient (DSC), 95 percentile Hausdrauff (95
794 HD, mm) and true positive ratio (TPR).
795 </p>
796 <p>Results: With prostatectomy-based delineations, the non-local Mask R-CNN with
797 fine-tuning and self-training significantly improved all evaluation metrics.
798 For the model with the highest detection rate and DSC, 80.5% (33/41) of lesions
799 in all Gleason Grade Groups (GGG) were detected with DSC of 0.548[0.165], 95 HD
800 of 5.72[3.17] and TPR of 0.613[0.193]. Among them, 94.7% (18/19) of lesions
801 with GGG &gt; 2 were detected with DSC of 0.604[0.135], 95 HD of 6.26[3.44] and
802 TPR of 0.580[0.190].
803 </p>
804 <p>Conclusion: DL models can achieve high prostate cancer detection and
805 segmentation accuracy on bp-MRI based on annotations from histologic images. To
806 further improve the performance, more data with annotations of both MRI and
807 whole amount prostatectomy specimens are required.
808 </p>
809 </description>
810 <guid isPermaLink="false">oai:arXiv.org:2010.15233</guid>
811 </item>
812 <item>
813 <title>Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions. (arXiv:2010.15234v1 [cs.LG])</title>
814 <link>http://fr.arxiv.org/abs/2010.15234</link>
815 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ahuja_K/0/1/0/all/0/1">Kartik Ahuja</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shanmugam_K/0/1/0/all/0/1">Karthikeyan Shanmugam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dhurandhar_A/0/1/0/all/0/1">Amit Dhurandhar</a></p>
816
817 <p>Recently, invariant risk minimization (IRM) (Arjovsky et al.) was proposed as
818 a promising solution to address out-of-distribution (OOD) generalization. In
819 Ahuja et al., it was shown that solving for the Nash equilibria of a new class
820 of "ensemble-games" is equivalent to solving IRM. In this work, we extend the
821 framework in Ahuja et al. for linear regressions by projecting the
822 ensemble-game on an $\ell_{\infty}$ ball. We show that such projections help
823 achieve non-trivial OOD guarantees despite not achieving perfect invariance.
824 For linear models with confounders, we prove that Nash equilibria of these
825 games are closer to the ideal OOD solutions than the standard empirical risk
826 minimization (ERM) and we also provide learning algorithms that provably
827 converge to these Nash Equilibria. Empirical comparisons of the proposed
828 approach with the state-of-the-art show consistent gains in achieving OOD
829 solutions in several settings involving anti-causal variables and confounders.
830 </p>
831 </description>
832 <guid isPermaLink="false">oai:arXiv.org:2010.15234</guid>
833 </item>
834 <item>
835 <title>SD-Access: Practical Experiences in Designing and Deploying Software Defined Enterprise Networks. (arXiv:2010.15236v1 [cs.NI])</title>
836 <link>http://fr.arxiv.org/abs/2010.15236</link>
837 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Paillisse_J/0/1/0/all/0/1">Jordi Paillisse</a> (1 and 2), <a href="http://fr.arxiv.org/find/cs/1/au:+Portoles_M/0/1/0/all/0/1">Marc Portoles</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Lopez_A/0/1/0/all/0/1">Albert Lopez</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Rodriguez_Natal_A/0/1/0/all/0/1">Alberto Rodriguez-Natal</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Iacobacci_D/0/1/0/all/0/1">David Iacobacci</a> (3), <a href="http://fr.arxiv.org/find/cs/1/au:+Leong_J/0/1/0/all/0/1">Johnson Leong</a> (4), <a href="http://fr.arxiv.org/find/cs/1/au:+Moreno_V/0/1/0/all/0/1">Victor Moreno</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Cabellos_A/0/1/0/all/0/1">Albert Cabellos</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Maino_F/0/1/0/all/0/1">Fabio Maino</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Hooda_S/0/1/0/all/0/1">Sanjay Hooda</a> (2) ((1) UPC-BarcelonaTech, Barcelona, Spain, (2) Cisco, San Jose, USA, (3) BMP LLP, (4) Uber Technologies Inc., San Francisco, USA)</p>
838
839 <p>Enterprise Networks, over the years, have become more and more complex trying
840 to keep up with new requirements that challenge traditional solutions. Just to
841 mention one out of many possible examples, technologies such as Virtual LANs
842 (VLANs) struggle to address the scalability and operational requirements
843 introduced by Internet of Things (IoT) use cases. To keep up with these
844 challenges we have identified four main requirements that are common across
845 modern enterprise networks: (i) scalable mobility, (ii) endpoint segmentation,
846 (iii) simplified administration, and (iv) resource optimization. To address
847 these challenges we designed SDA (Software Defined Access), a solution for
848 modern enterprise networks that leverages Software-Defined Networking (SDN) and
849 other state of the art techniques. In this paper we present the design,
850 implementation and evaluation of SDA. Specifically, SDA: (i) leverages a
851 combination of an overlay approach with an event-driven protocol (LISP) to
852 dynamically adapt to traffic and mobility patterns while preserving resources,
853 and (ii) enforces dynamic endpoint groups for scalable segmentation with low
854 operational burden. We present our experience with deploying SDA in two
855 real-life scenarios: an enterprise campus, and a large warehouse with mobile
856 robots. Our evaluation shows that SDA, when compared with traditional
857 enterprise networks, can (i) reduce overall data plane forwarding state up to
858 70% thanks to a reactive protocol using a centralized routing server, and (ii)
859 reduce by an order of magnitude the handover delays in scenarios of massive
860 mobility with respect to other approaches. Finally, we discuss lessons learned
861 while deploying and operating SDA, and possible optimizations regarding the use
862 of an event-driven protocol and group-based segmentation.
863 </p>
864 </description>
865 <guid isPermaLink="false">oai:arXiv.org:2010.15236</guid>
866 </item>
867 <item>
868 <title>Bandit Policies for Reliable Cellular Network Handovers in Extreme Mobility. (arXiv:2010.15237v1 [cs.LG])</title>
869 <link>http://fr.arxiv.org/abs/2010.15237</link>
870 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yuanjie Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Datta_E/0/1/0/all/0/1">Esha Datta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_J/0/1/0/all/0/1">Jiaxin Ding</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shroff_N/0/1/0/all/0/1">Ness Shroff</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_X/0/1/0/all/0/1">Xin Liu</a></p>
871
872 <p>The demand for seamless Internet access under extreme user mobility, such as
873 on high-speed trains and vehicles, has become a norm rather than an exception.
874 However, the 4G/5G mobile network is not always reliable to meet this demand,
875 with non-negligible failures during the handover between base stations. A
876 fundamental challenge of reliability is to balance the exploration of more
877 measurements for satisfactory handover, and exploitation for timely handover
878 (before the fast-moving user leaves the serving base station's radio coverage).
879 This paper formulates this trade-off in extreme mobility as a composition of
880 two distinct multi-armed bandit problems. We propose Bandit and Threshold
881 Tuning (BATT) to minimize the regret of handover failures in extreme mobility.
882 BATT uses $\epsilon$-binary-search to optimize the threshold of the serving
883 cell's signal strength to initiate the handover procedure with
884 $\mathcal{O}(\log J \log T)$ regret.It further devises opportunistic Thompson
885 sampling, which optimizes the sequence of the target cells to measure for
886 reliable handover with $\mathcal{O}(\log T)$ regret.Our experiment over a real
887 LTE dataset from Chinese high-speed rails validates significant regret
888 reduction and a 29.1% handover failure reduction.
889 </p>
890 </description>
891 <guid isPermaLink="false">oai:arXiv.org:2010.15237</guid>
892 </item>
893 <item>
894 <title>Cloud-Based Dynamic Programming for an Electric City Bus Energy Management Considering Real-Time Passenger Load Prediction. (arXiv:2010.15239v1 [eess.SY])</title>
895 <link>http://fr.arxiv.org/abs/2010.15239</link>
896 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Shi_J/0/1/0/all/0/1">Junzhe Shi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_B/0/1/0/all/0/1">Bin Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhou_X/0/1/0/all/0/1">Xingyu Zhou</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Hou_J/0/1/0/all/0/1">Jun Hou</a></p>
897
898 <p>Electric city bus gains popularity in recent years for its low greenhouse gas
899 emission, low noise level, etc. Different from a passenger car, the weight of a
900 city bus varies significantly with different amounts of onboard passengers,
901 which is not well studied in existing literature. This study proposes a
902 passenger load prediction model using day-of-week, time-of-day, weather,
903 temperatures, wind levels, and holiday information as inputs. The average
904 model, Regression Tree, Gradient Boost Decision Tree, and Neural Networks
905 models are compared in the passenger load prediction. The Gradient Boost
906 Decision Tree model is selected due to its best accuracy and high stability.
907 Given the predicted passenger load, dynamic programming algorithm determines
908 the optimal power demand for supercapacitor and battery by optimizing the
909 battery aging and energy usage in the cloud. Then rule extraction is conducted
910 on dynamic programming results, and the rule is real-time loaded to onboard
911 controllers of vehicles. The proposed cloud-based dynamic programming and rule
912 extraction framework with the passenger load prediction shows 4% and 11% fewer
913 bus operating costs in off-peak and peak hours, respectively. The operating
914 cost by the proposed framework is less than 1% shy of the dynamic programming
915 with the true passenger load information.
916 </p>
917 </description>
918 <guid isPermaLink="false">oai:arXiv.org:2010.15239</guid>
919 </item>
920 <item>
921 <title>Test Set Optimization by Machine Learning Algorithms. (arXiv:2010.15240v1 [cs.LG])</title>
922 <link>http://fr.arxiv.org/abs/2010.15240</link>
923 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_K/0/1/0/all/0/1">Kaiming Fu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jin_Y/0/1/0/all/0/1">Yulu Jin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Z/0/1/0/all/0/1">Zhousheng Chen</a></p>
924
925 <p>Diagnosis results are highly dependent on the volume of test set. To derive
926 the most efficient test set, we propose several machine learning based methods
927 to predict the minimum amount of test data that produces relatively accurate
928 diagnosis. By collecting outputs from failing circuits, the feature matrix and
929 label vector are generated, which involves the inference information of the
930 test termination point. Thus we develop a prediction model to fit the data and
931 determine when to terminate testing. The considered methods include LASSO and
932 Support Vector Machine(SVM) where the relationship between goals(label) and
933 predictors(feature matrix) are considered to be linear in LASSO and nonlinear
934 in SVM. Numerical results show that SVM reaches a diagnosis accuracy of 90.4%
935 while deducting the volume of test set by 35.24%.
936 </p>
937 </description>
938 <guid isPermaLink="false">oai:arXiv.org:2010.15240</guid>
939 </item>
940 <item>
941 <title>A marine radioisotope gamma-ray spectrum analysis method based on Monte Carlo simulation and MLP neural network. (arXiv:2010.15245v1 [physics.ins-det])</title>
942 <link>http://fr.arxiv.org/abs/2010.15245</link>
943 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Dai_W/0/1/0/all/0/1">Wenhan Dai</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Zeng_Z/0/1/0/all/0/1">Zhi Zeng</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Dou_D/0/1/0/all/0/1">Daowei Dou</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Ma_H/0/1/0/all/0/1">Hao Ma</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Chen_J/0/1/0/all/0/1">Jianping Chen</a> (1 and 2), <a href="http://fr.arxiv.org/find/physics/1/au:+Li_J/0/1/0/all/0/1">Junli Li</a> (1), <a href="http://fr.arxiv.org/find/physics/1/au:+Zhang_H/0/1/0/all/0/1">Hui Zhang</a> (1) ((1) Department of Engineering Physics, Tsinghua University, Beijing, China, (2) College of Nuclear Science and Technology, Beijing Normal University, Beijing, China)</p>
944
945 <p>A multilayer perceptron (MLP) neural network is built to analyze the Cs-137
946 concentration in seawater via gamma-ray spectrums measured by a LaBr3 detector.
947 The MLP is trained and tested by a large data set generated by combining
948 measured and Monte Carlo simulated spectrums under the assumption that all the
949 measured spectrums have 0 Cs-137 concentration. And the performance of MLP is
950 evaluated and compared with the traditional net-peak area method. The results
951 show an improvement of 7% in accuracy and 0.036 in the ROC-curve area compared
952 to those of the net peak area method. And the influence of the assumption of
953 Cs-137 concentration in the training data set on the classifying performance of
954 MLP is evaluated.
955 </p>
956 </description>
957 <guid isPermaLink="false">oai:arXiv.org:2010.15245</guid>
958 </item>
959 <item>
960 <title>Semantic video segmentation for autonomous driving. (arXiv:2010.15250v1 [cs.CV])</title>
961 <link>http://fr.arxiv.org/abs/2010.15250</link>
962 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chau_M/0/1/0/all/0/1">Minh Triet Chau</a></p>
963
964 <p>We aim to solve semantic video segmentation in autonomous driving, namely
965 road detection in real time video, using techniques discussed in (Shelhamer et
966 al., 2016a). While fully convolutional network gives good result, we show that
967 the speed can be halved while preserving the accuracy. The test dataset being
968 used is KITTI, which consists of real footage from Germany's streets.
969 </p>
970 </description>
971 <guid isPermaLink="false">oai:arXiv.org:2010.15250</guid>
972 </item>
973 <item>
974 <title>Fusion Models for Improved Visual Captioning. (arXiv:2010.15251v1 [cs.CV])</title>
975 <link>http://fr.arxiv.org/abs/2010.15251</link>
976 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kalimuthu_M/0/1/0/all/0/1">Marimuthu Kalimuthu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mogadala_A/0/1/0/all/0/1">Aditya Mogadala</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mosbach_M/0/1/0/all/0/1">Marius Mosbach</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Klakow_D/0/1/0/all/0/1">Dietrich Klakow</a></p>
977
978 <p>Visual captioning aims to generate textual descriptions given images.
979 Traditionally, the captioning models are trained on human annotated datasets
980 such as Flickr30k and MS-COCO, which are limited in size and diversity. This
981 limitation hinders the generalization capabilities of these models while also
982 rendering them to often make mistakes. Language models can, however, be trained
983 on vast amounts of freely available unlabelled data and have recently emerged
984 as successful language encoders and coherent text generators. Meanwhile,
985 several unimodal and multimodal fusion techniques have been proven to work well
986 for natural language generation and automatic speech recognition. Building on
987 these recent developments, and with an aim of improving the quality of
988 generated captions, the contribution of our work in this paper is two-fold:
989 First, we propose a generic multimodal model fusion framework for caption
990 generation as well as emendation where we utilize different fusion strategies
991 to integrate a pretrained Auxiliary Language Model (AuxLM) within the
992 traditional encoder-decoder visual captioning frameworks. Next, we employ the
993 same fusion strategies to integrate a pretrained Masked Language Model (MLM),
994 namely BERT, with a visual captioning model, viz. Show, Attend, and Tell, for
995 emending both syntactic and semantic errors in captions. Our caption emendation
996 experiments on three benchmark image captioning datasets, viz. Flickr8k,
997 Flickr30k, and MSCOCO, show improvements over the baseline, indicating the
998 usefulness of our proposed multimodal fusion strategies. Further, we perform a
999 preliminary qualitative analysis on the emended captions and identify error
1000 categories based on the type of corrections.
1001 </p>
1002 </description>
1003 <guid isPermaLink="false">oai:arXiv.org:2010.15251</guid>
1004 </item>
1005 <item>
1006 <title>Model Minimization For Online Predictability. (arXiv:2010.15255v1 [cs.AI])</title>
1007 <link>http://fr.arxiv.org/abs/2010.15255</link>
1008 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gopalakrishnan_S/0/1/0/all/0/1">Sriram Gopalakrishnan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kambhampati_S/0/1/0/all/0/1">Subbarao Kambhampati</a></p>
1009
1010 <p>For humans in a teaming scenario, context switching between reasoning about a
1011 teammate's behavior and thinking about thier own task can slow us down,
1012 especially if the cognitive cost of predicting the teammate's actions is high.
1013 So if we can make the prediction of a robot-teammate's actions quicker, then
1014 the human can be more productive. In this paper we present an approach to
1015 constrain the actions of a robot so as to increase predictability (specifically
1016 online predictability) while keeping the plan costs of the robot within
1017 acceptable limits. Existing works on human-robot interaction do not consider
1018 the computational cost for predictability, which we consider in our approach.
1019 We approach this problem from the perspective of directed graph minimization,
1020 and we connect the concept of predictability to the out-degree of vertices. We
1021 present an algorithm to minimize graphs for predictability, and contrast this
1022 with minimization for legibility (goal inference) and optimality.
1023 </p>
1024 </description>
1025 <guid isPermaLink="false">oai:arXiv.org:2010.15255</guid>
1026 </item>
1027 <item>
1028 <title>DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors. (arXiv:2010.15258v1 [cs.SD])</title>
1029 <link>http://fr.arxiv.org/abs/2010.15258</link>
1030 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Reddy_C/0/1/0/all/0/1">Chandan K A Reddy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gopal_V/0/1/0/all/0/1">Vishak Gopal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cutler_R/0/1/0/all/0/1">Ross Cutler</a></p>
1031
1032 <p>Human subjective evaluation is the gold standard to evaluate speech quality
1033 optimized for human perception. Perceptual objective metrics serve as a proxy
1034 for subjective scores. The conventional and widely used metrics require a
1035 reference clean speech signal, which is unavailable in real recordings. The
1036 no-reference approaches correlate poorly with human ratings and are not widely
1037 adopted in the research community. One of the biggest use cases of these
1038 perceptual objective metrics is to evaluate noise suppression algorithms. This
1039 paper introduces a multi-stage self-teaching based perceptual objective metric
1040 that is designed to evaluate noise suppressors. The proposed method generalizes
1041 well in challenging test conditions with a high correlation to human ratings.
1042 </p>
1043 </description>
1044 <guid isPermaLink="false">oai:arXiv.org:2010.15258</guid>
1045 </item>
1046 <item>
1047 <title>Object sieving and morphological closing to reduce false detections in wide-area aerial imagery. (arXiv:2010.15260v1 [cs.CV])</title>
1048 <link>http://fr.arxiv.org/abs/2010.15260</link>
1049 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gao_X/0/1/0/all/0/1">Xin Gao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ram_S/0/1/0/all/0/1">Sundaresh Ram</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rodriguez_J/0/1/0/all/0/1">Jeffrey J. Rodriguez</a></p>
1050
1051 <p>For object detection in wide-area aerial imagery, post-processing is usually
1052 needed to reduce false detections. We propose a two-stage post-processing
1053 scheme which comprises an area-thresholding sieving process and a morphological
1054 closing operation. We use two wide-area aerial videos to compare the
1055 performance of five object detection algorithms in the absence and in the
1056 presence of our post-processing scheme. The automatic detection results are
1057 compared with the ground-truth objects. Several metrics are used for
1058 performance comparison.
1059 </p>
1060 </description>
1061 <guid isPermaLink="false">oai:arXiv.org:2010.15260</guid>
1062 </item>
1063 <item>
1064 <title>Deep Shells: Unsupervised Shape Correspondence with Optimal Transport. (arXiv:2010.15261v1 [cs.CV])</title>
1065 <link>http://fr.arxiv.org/abs/2010.15261</link>
1066 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Eisenberger_M/0/1/0/all/0/1">Marvin Eisenberger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Toker_A/0/1/0/all/0/1">Aysim Toker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leal_Taixe_L/0/1/0/all/0/1">Laura Leal-Taix&#xe9;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cremers_D/0/1/0/all/0/1">Daniel Cremers</a></p>
1067
1068 <p>We propose a novel unsupervised learning approach to 3D shape correspondence
1069 that builds a multiscale matching pipeline into a deep neural network. This
1070 approach is based on smooth shells, the current state-of-the-art axiomatic
1071 correspondence method, which requires an a priori stochastic search over the
1072 space of initial poses. Our goal is to replace this costly preprocessing step
1073 by directly learning good initializations from the input surfaces. To that end,
1074 we systematically derive a fully differentiable, hierarchical matching pipeline
1075 from entropy regularized optimal transport. This allows us to combine it with a
1076 local feature extractor based on smooth, truncated spectral convolution
1077 filters. Finally, we show that the proposed unsupervised method significantly
1078 improves over the state-of-the-art on multiple datasets, even in comparison to
1079 the most recent supervised methods. Moreover, we demonstrate compelling
1080 generalization results by applying our learned filters to examples that
1081 significantly deviate from the training set.
1082 </p>
1083 </description>
1084 <guid isPermaLink="false">oai:arXiv.org:2010.15261</guid>
1085 </item>
1086 <item>
1087 <title>CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models. (arXiv:2010.15266v1 [cs.CL])</title>
1088 <link>http://fr.arxiv.org/abs/2010.15266</link>
1089 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_A/0/1/0/all/0/1">Abhinav Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xia_P/0/1/0/all/0/1">Patrick Xia</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_G/0/1/0/all/0/1">Guanghui Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yarmohammadi_M/0/1/0/all/0/1">Mahsa Yarmohammadi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Durme_B/0/1/0/all/0/1">Benjamin Van Durme</a></p>
1090
1091 <p>Copy mechanisms are employed in sequence to sequence models (seq2seq) to
1092 generate reproductions of words from the input to the output. These frameworks,
1093 operating at the lexical type level, fail to provide an explicit alignment that
1094 records where each token was copied from. Further, they require contiguous
1095 token sequences from the input (spans) to be copied individually. We present a
1096 model with an explicit token-level copy operation and extend it to copying
1097 entire spans. Our model provides hard alignments between spans in the input and
1098 output, allowing for nontraditional applications of seq2seq, like information
1099 extraction. We demonstrate the approach on Nested Named Entity Recognition,
1100 achieving near state-of-the-art accuracy with an order of magnitude increase in
1101 decoding speed.
1102 </p>
1103 </description>
1104 <guid isPermaLink="false">oai:arXiv.org:2010.15266</guid>
1105 </item>
1106 <item>
1107 <title>Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning. (arXiv:2010.15268v1 [cs.LG])</title>
1108 <link>http://fr.arxiv.org/abs/2010.15268</link>
1109 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Young_K/0/1/0/all/0/1">Kenny Young</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sutton_R/0/1/0/all/0/1">Richard S. Sutton</a></p>
1110
1111 <p>Despite empirical success, the theory of reinforcement learning (RL) with
1112 value function approximation remains fundamentally incomplete. Prior work has
1113 identified a variety of pathological behaviours that arise in RL algorithms
1114 that combine approximate on-policy evaluation and greedification. One prominent
1115 example is policy oscillation, wherein an algorithm may cycle indefinitely
1116 between policies, rather than converging to a fixed point. What is not well
1117 understood however is the quality of the policies in the region of oscillation.
1118 In this paper we present simple examples illustrating that in addition to
1119 policy oscillation and multiple fixed points -- the same basic issue can lead
1120 to convergence to the worst possible policy for a given approximation. Such
1121 behaviours can arise when algorithms optimize evaluation accuracy weighted by
1122 the distribution of states that occur under the current policy, but greedify
1123 based on the value of states which are rare or nonexistent under this
1124 distribution. This means the values used for greedification are unreliable and
1125 can steer the policy in undesirable directions. Our observation that this can
1126 lead to the worst possible policy shows that in a general sense such algorithms
1127 are unreliable. The existence of such examples helps to narrow the kind of
1128 theoretical guarantees that are possible and the kind of algorithmic ideas that
1129 are likely to be helpful. We demonstrate analytically and experimentally that
1130 such pathological behaviours can impact a wide range of RL and dynamic
1131 programming algorithms; such behaviours can arise both with and without
1132 bootstrapping, and with linear function approximation as well as with more
1133 complex parameterized functions like neural networks.
1134 </p>
1135 </description>
1136 <guid isPermaLink="false">oai:arXiv.org:2010.15268</guid>
1137 </item>
1138 <item>
1139 <title>GloFlow: Global Image Alignment for Creation of Whole Slide Images for Pathology from Video. (arXiv:2010.15269v1 [eess.IV])</title>
1140 <link>http://fr.arxiv.org/abs/2010.15269</link>
1141 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Krishna_V/0/1/0/all/0/1">Viswesh Krishna</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Joshi_A/0/1/0/all/0/1">Anirudh Joshi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bulterys_P/0/1/0/all/0/1">Philip L. Bulterys</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Yang_E/0/1/0/all/0/1">Eric Yang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ng_A/0/1/0/all/0/1">Andrew Y. Ng</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Rajpurkar_P/0/1/0/all/0/1">Pranav Rajpurkar</a></p>
1142
1143 <p>The application of deep learning to pathology assumes the existence of
1144 digital whole slide images of pathology slides. However, slide digitization is
1145 bottlenecked by the high cost of precise motor stages in slide scanners that
1146 are needed for position information used for slide stitching. We propose
1147 GloFlow, a two-stage method for creating a whole slide image using optical
1148 flow-based image registration with global alignment using a computationally
1149 tractable graph-pruning approach. In the first stage, we train an optical flow
1150 predictor to predict pairwise translations between successive video frames to
1151 approximate a stitch. In the second stage, this approximate stitch is used to
1152 create a neighborhood graph to produce a corrected stitch. On a simulated
1153 dataset of video scans of WSIs, we find that our method outperforms known
1154 approaches to slide-stitching, and stitches WSIs resembling those produced by
1155 slide scanners.
1156 </p>
1157 </description>
1158 <guid isPermaLink="false">oai:arXiv.org:2010.15269</guid>
1159 </item>
1160 <item>
1161 <title>A globally convergent modified Newton method for the direct minimization of the Ohta-Kawasaki energy with application to the directed self-assembly of diblock copolymers. (arXiv:2010.15271v1 [physics.comp-ph])</title>
1162 <link>http://fr.arxiv.org/abs/2010.15271</link>
1163 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Cao_L/0/1/0/all/0/1">Lianghao Cao</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Ghattas_O/0/1/0/all/0/1">Omar Ghattas</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Oden_J/0/1/0/all/0/1">J. Tinsley Oden</a></p>
1164
1165 <p>We propose a fast and robust scheme for the direct minimization of the
1166 Ohta-Kawasaki energy that characterizes the microphase separation of diblock
1167 copolymer melts. The scheme employs a globally convergent modified Newton
1168 method with line search which is shown to be mass-conservative,
1169 energy-descending, asymptotically quadratically convergent, and three orders of
1170 magnitude more efficient than the commonly-used gradient flow approach. The
1171 regularity and the first-order condition of minimizers are analyzed. A
1172 numerical study of the chemical substrate guided directed self-assembly of
1173 diblock copolymer melts, based on a novel polymer-substrate interaction model
1174 and the proposed scheme, is provided.
1175 </p>
1176 </description>
1177 <guid isPermaLink="false">oai:arXiv.org:2010.15271</guid>
1178 </item>
1179 <item>
1180 <title>The distribution of inhibitory neurons in the C. elegans connectome facilitates self-optimization of coordinated neural activity. (arXiv:2010.15272v1 [q-bio.NC])</title>
1181 <link>http://fr.arxiv.org/abs/2010.15272</link>
1182 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-bio/1/au:+Morales_A/0/1/0/all/0/1">Alejandro Morales</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Froese_T/0/1/0/all/0/1">Tom Froese</a></p>
1183
1184 <p>The nervous system of the nematode soil worm Caenorhabditis elegans exhibits
1185 remarkable complexity despite the worm's small size. A general challenge is to
1186 better understand the relationship between neural organization and neural
1187 activity at the system level, including the functional roles of inhibitory
1188 connections. Here we implemented an abstract simulation model of the C. elegans
1189 connectome that approximates the neurotransmitter identity of each neuron, and
1190 we explored the functional role of these physiological differences for neural
1191 activity. In particular, we created a Hopfield neural network in which all of
1192 the worm's neurons characterized by inhibitory neurotransmitters are assigned
1193 inhibitory outgoing connections. Then, we created a control condition in which
1194 the same number of inhibitory connections are arbitrarily distributed across
1195 the network. A comparison of these two conditions revealed that the biological
1196 distribution of inhibitory connections facilitates the self-optimization of
1197 coordinated neural activity compared with an arbitrary distribution of
1198 inhibitory connections.
1199 </p>
1200 </description>
1201 <guid isPermaLink="false">oai:arXiv.org:2010.15272</guid>
1202 </item>
1203 <item>
1204 <title>Representation learning for improved interpretability and classification accuracy of clinical factors from EEG. (arXiv:2010.15274v1 [cs.LG])</title>
1205 <link>http://fr.arxiv.org/abs/2010.15274</link>
1206 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Honke_G/0/1/0/all/0/1">Garrett Honke</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Higgins_I/0/1/0/all/0/1">Irina Higgins</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Thigpen_N/0/1/0/all/0/1">Nina Thigpen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miskovic_V/0/1/0/all/0/1">Vladimir Miskovic</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Link_K/0/1/0/all/0/1">Katie Link</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_P/0/1/0/all/0/1">Pramod Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Klawohn_J/0/1/0/all/0/1">Julia Klawohn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hajcak_G/0/1/0/all/0/1">Greg Hajcak</a></p>
1207
1208 <p>Despite extensive standardization, diagnostic interviews for mental health
1209 disorders encompass substantial subjective judgment. Previous studies have
1210 demonstrated that EEG-based neural measures can function as reliable objective
1211 correlates of depression, or even predictors of depression and its course.
1212 However, their clinical utility has not been fully realized because of 1) the
1213 lack of automated ways to deal with the inherent noise associated with EEG data
1214 at scale, and 2) the lack of knowledge of which aspects of the EEG signal may
1215 be markers of a clinical disorder. Here we adapt an unsupervised pipeline from
1216 the recent deep representation learning literature to address these problems by
1217 1) learning a disentangled representation using $\beta$-VAE to denoise the
1218 signal, and 2) extracting interpretable features associated with a sparse set
1219 of clinical labels using a Symbol-Concept Association Network (SCAN). We
1220 demonstrate that our method is able to outperform the canonical hand-engineered
1221 baseline classification method on a number of factors, including participant
1222 age and depression diagnosis. Furthermore, our method recovers a representation
1223 that can be used to automatically extract denoised Event Related Potentials
1224 (ERPs) from novel, single EEG trajectories, and supports fast supervised
1225 re-mapping to various clinical labels, allowing clinicians to re-use a single
1226 EEG representation regardless of updates to the standardized diagnostic system.
1227 Finally, single factors of the learned disentangled representations often
1228 correspond to meaningful markers of clinical factors, as automatically detected
1229 by SCAN, allowing for human interpretability and post-hoc expert analysis of
1230 the recommendations made by the model.
1231 </p>
1232 </description>
1233 <guid isPermaLink="false">oai:arXiv.org:2010.15274</guid>
1234 </item>
1235 <item>
1236 <title>A direct method for solving inverse Sturm-Liouville problems. (arXiv:2010.15275v1 [math.NA])</title>
1237 <link>http://fr.arxiv.org/abs/2010.15275</link>
1238 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Kravchenko_V/0/1/0/all/0/1">Vladislav V. Kravchenko</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Torba_S/0/1/0/all/0/1">Sergii M. Torba</a></p>
1239
1240 <p>We consider two main inverse Sturm-Liouville problems: the problem of
1241 recovery of the potential and the boundary conditions from two spectra or from
1242 a spectral density function. A simple method for practical solution of such
1243 problems is developed, based on the transmutation operator approach, new
1244 Neumann series of Bessel functions representations for solutions and the
1245 Gelfand-Levitan equation. The method allows one to reduce the inverse
1246 Sturm-Liouville problem directly to a system of linear algebraic equations,
1247 such that the potential is recovered from the first element of the solution
1248 vector. We prove the stability of the method and show its numerical efficiency
1249 with several numerical examples.
1250 </p>
1251 </description>
1252 <guid isPermaLink="false">oai:arXiv.org:2010.15275</guid>
1253 </item>
1254 <item>
1255 <title>Class-incremental learning: survey and performance evaluation. (arXiv:2010.15277v1 [cs.LG])</title>
1256 <link>http://fr.arxiv.org/abs/2010.15277</link>
1257 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Masana_M/0/1/0/all/0/1">Marc Masana</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_X/0/1/0/all/0/1">Xialei Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Twardowski_B/0/1/0/all/0/1">Bartlomiej Twardowski</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Menta_M/0/1/0/all/0/1">Mikel Menta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bagdanov_A/0/1/0/all/0/1">Andrew D. Bagdanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weijer_J/0/1/0/all/0/1">Joost van de Weijer</a></p>
1258
1259 <p>For future learning systems incremental learning is desirable, because it
1260 allows for: efficient resource usage by eliminating the need to retrain from
1261 scratch at the arrival of new data; reduced memory usage by preventing or
1262 limiting the amount of data required to be stored -- also important when
1263 privacy limitations are imposed; and learning that more closely resembles human
1264 learning. The main challenge for incremental learning is catastrophic
1265 forgetting, which refers to the precipitous drop in performance on previously
1266 learned tasks after learning a new one. Incremental learning of deep neural
1267 networks has seen explosive growth in recent years. Initial work focused on
1268 task incremental learning, where a task-ID is provided at inference time.
1269 Recently we have seen a shift towards class-incremental learning where the
1270 learner must classify at inference time between all classes seen in previous
1271 tasks without recourse to a task-ID. In this paper, we provide a complete
1272 survey of existing methods for incremental learning, and in particular we
1273 perform an extensive experimental evaluation on twelve class-incremental
1274 methods. We consider several new experimental scenarios, including a comparison
1275 of class-incremental methods on multiple large-scale datasets, investigation
1276 into small and large domain shifts, and comparison on various network
1277 architectures.
1278 </p>
1279 </description>
1280 <guid isPermaLink="false">oai:arXiv.org:2010.15277</guid>
1281 </item>
1282 <item>
1283 <title>Specification description and verification of multitask hybrid systems in the OTS/CafeOBJ method. (arXiv:2010.15280v1 [cs.SE])</title>
1284 <link>http://fr.arxiv.org/abs/2010.15280</link>
1285 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nakamura_M/0/1/0/all/0/1">Masaki Nakamura</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sakakibara_K/0/1/0/all/0/1">Kazutoshi Sakakibara</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ogata_K/0/1/0/all/0/1">Kazuhiro Ogata</a></p>
1286
1287 <p>To develop IoT and/or CSP systems, we need consider both continuous data from
1288 physical world and discrete data in computer systems. Such a system is called a
1289 hybrid system. Because of density of continuous data, it is not easy to do
1290 software testing to ensure reliability of hybrid systems. Moreover, the size of
1291 the state space increases exponentially for multitask systems. Formal
1292 descriptions of hybrid systems may help us to verify desired properties of a
1293 given system formally with computer supports. In this paper, we propose a way
1294 to describe a formal specification of a given multitask hybrid system as an
1295 observational transition system in CafeOBJ algebraic specification language and
1296 verify it by the proof score method based on equational reasoning implemented
1297 in CafeOBJ interpreter.
1298 </p>
1299 </description>
1300 <guid isPermaLink="false">oai:arXiv.org:2010.15280</guid>
1301 </item>
1302 <item>
1303 <title>GENs: Generative Encoding Networks. (arXiv:2010.15283v1 [cs.LG])</title>
1304 <link>http://fr.arxiv.org/abs/2010.15283</link>
1305 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Saha_S/0/1/0/all/0/1">Surojit Saha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Elhabian_S/0/1/0/all/0/1">Shireen Elhabian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Whitaker_R/0/1/0/all/0/1">Ross T. Whitaker</a></p>
1306
1307 <p>Mapping data from and/or onto a known family of distributions has become an
1308 important topic in machine learning and data analysis. Deep generative models
1309 (e.g., generative adversarial networks ) have been used effectively to match
1310 known and unknown distributions. Nonetheless, when the form of the target
1311 distribution is known, analytical methods are advantageous in providing robust
1312 results with provable properties. In this paper, we propose and analyze the use
1313 of nonparametric density methods to estimate the Jensen-Shannon divergence for
1314 matching unknown data distributions to known target distributions, such
1315 Gaussian or mixtures of Gaussians, in latent spaces. This analytical method has
1316 several advantages: better behavior when training sample quantity is low,
1317 provable convergence properties, and relatively few parameters, which can be
1318 derived analytically. Using the proposed method, we enforce the latent
1319 representation of an autoencoder to match a target distribution in a learning
1320 framework that we call a {\em generative encoding network}. Here, we present
1321 the numerical methods; derive the expected distribution of the data in the
1322 latent space; evaluate the properties of the latent space, sample
1323 reconstruction, and generated samples; show the advantages over the adversarial
1324 counterpart; and demonstrate the application of the method in real world.
1325 </p>
1326 </description>
1327 <guid isPermaLink="false">oai:arXiv.org:2010.15283</guid>
1328 </item>
1329 <item>
1330 <title>Speech-Image Semantic Alignment Does Not Depend on Any Prior Classification Tasks. (arXiv:2010.15288v1 [cs.LG])</title>
1331 <link>http://fr.arxiv.org/abs/2010.15288</link>
1332 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mortazavi_M/0/1/0/all/0/1">Masood S. Mortazavi</a></p>
1333
1334 <p>Semantically-aligned $(speech, image)$ datasets can be used to explore
1335 "visually-grounded speech". In a majority of existing investigations, features
1336 of an image signal are extracted using neural networks "pre-trained" on other
1337 tasks (e.g., classification on ImageNet). In still others, pre-trained networks
1338 are used to extract audio features prior to semantic embedding. Without
1339 "transfer learning" through pre-trained initialization or pre-trained feature
1340 extraction, previous results have tended to show low rates of recall in $speech
1341 \rightarrow image$ and $image \rightarrow speech$ queries.
1342 </p>
1343 <p>Choosing appropriate neural architectures for encoders in the speech and
1344 image branches and using large datasets, one can obtain competitive recall
1345 rates without any reliance on any pre-trained initialization or feature
1346 extraction: $(speech,image)$ semantic alignment and $speech \rightarrow image$
1347 and $image \rightarrow speech$ retrieval are canonical tasks worthy of
1348 independent investigation of their own and allow one to explore other
1349 questions---e.g., the size of the audio embedder can be reduced significantly
1350 with little loss of recall rates in $speech \rightarrow image$ and $image
1351 \rightarrow speech$ queries.
1352 </p>
1353 </description>
1354 <guid isPermaLink="false">oai:arXiv.org:2010.15288</guid>
1355 </item>
1356 <item>
1357 <title>Link inference of noisy delay-coupled networks: Machine learning and opto-electronic experimental tests. (arXiv:2010.15289v1 [nlin.AO])</title>
1358 <link>http://fr.arxiv.org/abs/2010.15289</link>
1359 <description><p>Authors: <a href="http://fr.arxiv.org/find/nlin/1/au:+Banerjee_A/0/1/0/all/0/1">Amitava Banerjee</a>, <a href="http://fr.arxiv.org/find/nlin/1/au:+Hart_J/0/1/0/all/0/1">Joseph D. Hart</a>, <a href="http://fr.arxiv.org/find/nlin/1/au:+Roy_R/0/1/0/all/0/1">Rajarshi Roy</a>, <a href="http://fr.arxiv.org/find/nlin/1/au:+Ott_E/0/1/0/all/0/1">Edward Ott</a></p>
1360
1361 <p>We devise a machine learning technique to solve the general problem of
1362 inferring network links that have time-delays. The goal is to do this purely
1363 from time-series data of the network nodal states. This task has applications
1364 in fields ranging from applied physics and engineering to neuroscience and
1365 biology. To achieve this, we first train a type of machine learning system
1366 known as reservoir computing to mimic the dynamics of the unknown network. We
1367 formulate and test a technique that uses the trained parameters of the
1368 reservoir system output layer to deduce an estimate of the unknown network
1369 structure. Our technique, by its nature, is non-invasive, but is motivated by
1370 the widely-used invasive network inference method whereby the responses to
1371 active perturbations applied to the network are observed and employed to infer
1372 network links (e.g., knocking down genes to infer gene regulatory networks). We
1373 test this technique on experimental and simulated data from delay-coupled
1374 opto-electronic oscillator networks. We show that the technique often yields
1375 very good results particularly if the system does not exhibit synchrony. We
1376 also find that the presence of dynamical noise can strikingly enhance the
1377 accuracy and ability of our technique, especially in networks that exhibit
1378 synchrony.
1379 </p>
1380 </description>
1381 <guid isPermaLink="false">oai:arXiv.org:2010.15289</guid>
1382 </item>
1383 <item>
1384 <title>Fact or Factitious? Contextualized Opinion Spam Detection. (arXiv:2010.15296v1 [cs.AI])</title>
1385 <link>http://fr.arxiv.org/abs/2010.15296</link>
1386 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kennedy_S/0/1/0/all/0/1">Stefan Kennedy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Walsh_N/0/1/0/all/0/1">Niall Walsh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sloka_K/0/1/0/all/0/1">Kirils Sloka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Foster_J/0/1/0/all/0/1">Jennifer Foster</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+McCarren_A/0/1/0/all/0/1">Andrew McCarren</a></p>
1387
1388 <p>In this paper we perform an analytic comparison of a number of techniques
1389 used to detect fake and deceptive online reviews. We apply a number machine
1390 learning approaches found to be effective, and introduce our own approach by
1391 fine-tuning state of the art contextualised embeddings. The results we obtain
1392 show the potential of contextualised embeddings for fake review detection, and
1393 lay the groundwork for future research in this area.
1394 </p>
1395 </description>
1396 <guid isPermaLink="false">oai:arXiv.org:2010.15296</guid>
1397 </item>
1398 <item>
1399 <title>Analysis of Chorin-Type Projection Methods for the Stochastic Stokes Equations with General Multiplicative Noises. (arXiv:2010.15297v1 [math.NA])</title>
1400 <link>http://fr.arxiv.org/abs/2010.15297</link>
1401 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Feng_X/0/1/0/all/0/1">Xiaobing Feng</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Vo_L/0/1/0/all/0/1">Liet Vo</a></p>
1402
1403 <p>This paper is concerned with numerical analysis of two fully discrete
1404 Chorin-type projection methods for the stochastic Stokes equations with general
1405 non-solenoidal multiplicative noise. The first scheme is the standard Chorin
1406 scheme and the second one is a modified Chorin scheme which is designed by
1407 employing the Helmholtz decomposition on the noise function at each time step
1408 to produce a projected divergence-free noise and a "pseudo pressure" after
1409 combining the original pressure and the curl-free part of the decomposition.
1410 Optimal order rates of the convergence are proved for both velocity and
1411 pressure approximations of these two (semi-discrete) Chorin schemes. It is
1412 crucial to measure the errors in appropriate norms. The fully discrete finite
1413 element methods are formulated by discretizing both semi-discrete Chorin
1414 schemes in space by the standard finite element method. Suboptimal order error
1415 estimates are derived for both fully discrete methods. It is proved that all
1416 spatial error constants contain a growth factor $k^{-1/2}$, where $k$ denotes
1417 the time step size, which explains the deteriorating performance of the
1418 standard Chorin scheme when $k\to 0$ and the space mesh size is fixed as
1419 observed earlier in the numerical tests of [9]. Numerical results are also
1420 provided to guage the performance of the proposed numerical methods and to
1421 validate the sharpness of the theoretical error estimates.
1422 </p>
1423 </description>
1424 <guid isPermaLink="false">oai:arXiv.org:2010.15297</guid>
1425 </item>
1426 <item>
1427 <title>Uncovering Latent Biases in Text: Method and Application to Peer Review. (arXiv:2010.15300v1 [cs.CL])</title>
1428 <link>http://fr.arxiv.org/abs/2010.15300</link>
1429 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Manzoor_E/0/1/0/all/0/1">Emaad Manzoor</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shah_N/0/1/0/all/0/1">Nihar B. Shah</a></p>
1430
1431 <p>Quantifying systematic disparities in numerical quantities such as employment
1432 rates and wages between population subgroups provides compelling evidence for
1433 the existence of societal biases. However, biases in the text written for
1434 members of different subgroups (such as in recommendation letters for male and
1435 non-male candidates), though widely reported anecdotally, remain challenging to
1436 quantify. In this work, we introduce a novel framework to quantify bias in text
1437 caused by the visibility of subgroup membership indicators. We develop a
1438 nonparametric estimation and inference procedure to estimate this bias. We then
1439 formalize an identification strategy to causally link the estimated bias to the
1440 visibility of subgroup membership indicators, provided observations from time
1441 periods both before and after an identity-hiding policy change. We identify an
1442 application wherein "ground truth" bias can be inferred to evaluate our
1443 framework, instead of relying on synthetic or secondary data. Specifically, we
1444 apply our framework to quantify biases in the text of peer reviews from a
1445 reputed machine learning conference before and after the conference adopted a
1446 double-blind reviewing policy. We show evidence of biases in the review ratings
1447 that serves as "ground truth", and show that our proposed framework accurately
1448 detects these biases from the review text without having access to the review
1449 ratings.
1450 </p>
1451 </description>
1452 <guid isPermaLink="false">oai:arXiv.org:2010.15300</guid>
1453 </item>
1454 <item>
1455 <title>Point Cloud Attribute Compression via Successive Subspace Graph Transform. (arXiv:2010.15302v1 [cs.CV])</title>
1456 <link>http://fr.arxiv.org/abs/2010.15302</link>
1457 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yueru Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shao_Y/0/1/0/all/0/1">Yiting Shao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_J/0/1/0/all/0/1">Jing Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_G/0/1/0/all/0/1">Ge Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kuo_C/0/1/0/all/0/1">C.-C. Jay Kuo</a></p>
1458
1459 <p>Inspired by the recently proposed successive subspace learning (SSL)
1460 principles, we develop a successive subspace graph transform (SSGT) to address
1461 point cloud attribute compression in this work. The octree geometry structure
1462 is utilized to partition the point cloud, where every node of the octree
1463 represents a point cloud subspace with a certain spatial size. We design a
1464 weighted graph with self-loop to describe the subspace and define a graph
1465 Fourier transform based on the normalized graph Laplacian. The transforms are
1466 applied to large point clouds from the leaf nodes to the root node of the
1467 octree recursively, while the represented subspace is expanded from the
1468 smallest one to the whole point cloud successively. It is shown by experimental
1469 results that the proposed SSGT method offers better R-D performances than the
1470 previous Region Adaptive Haar Transform (RAHT) method.
1471 </p>
1472 </description>
1473 <guid isPermaLink="false">oai:arXiv.org:2010.15302</guid>
1474 </item>
1475 <item>
1476 <title>Automatic joint damage quantification using computer vision and deep learning. (arXiv:2010.15303v1 [cs.CV])</title>
1477 <link>http://fr.arxiv.org/abs/2010.15303</link>
1478 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_Q/0/1/0/all/0/1">Quang Tran</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Roesler_J/0/1/0/all/0/1">Jeffery R. Roesler</a></p>
1479
1480 <p>Joint raveled or spalled damage (henceforth called joint damage) can affect
1481 the safety and long-term performance of concrete pavements. It is important to
1482 assess and quantify the joint damage over time to assist in building action
1483 plans for maintenance, predicting maintenance costs, and maximize the concrete
1484 pavement service life. A framework for the accurate, autonomous, and rapid
1485 quantification of joint damage with a low-cost camera is proposed using a
1486 computer vision technique with a deep learning (DL) algorithm. The DL model is
1487 employed to train 263 images of sawcuts with joint damage. The trained DL model
1488 is used for pixel-wise color-masking joint damage in a series of query 2D
1489 images, which are used to reconstruct a 3D image using open-source structure
1490 from motion algorithm. Another damage quantification algorithm using a color
1491 threshold is applied to detect and compute the surface area of the damage in
1492 the 3D reconstructed image. The effectiveness of the framework was validated
1493 through inspecting joint damage at four transverse contraction joints in
1494 Illinois, USA, including three acceptable joints and one unacceptable joint by
1495 visual inspection. The results show the framework achieves 76% recall and 10%
1496 error.
1497 </p>
1498 </description>
1499 <guid isPermaLink="false">oai:arXiv.org:2010.15303</guid>
1500 </item>
1501 <item>
1502 <title>ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection. (arXiv:2010.15306v1 [eess.AS])</title>
1503 <link>http://fr.arxiv.org/abs/2010.15306</link>
1504 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Shimada_K/0/1/0/all/0/1">Kazuki Shimada</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Koyama_Y/0/1/0/all/0/1">Yuichiro Koyama</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Takahashi_N/0/1/0/all/0/1">Naoya Takahashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Takahashi_S/0/1/0/all/0/1">Shusuke Takahashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Mitsufuji_Y/0/1/0/all/0/1">Yuki Mitsufuji</a></p>
1505
1506 <p>Neural-network (NN)-based methods show high performance in sound event
1507 localization and detection (SELD). Conventional NN-based methods use two
1508 branches for a sound event detection (SED) target and a direction-of-arrival
1509 (DOA) target. The two-branch representation with a single network has to decide
1510 how to balance the two objectives during optimization. Using two networks
1511 dedicated to each task increases system complexity and network size. To address
1512 these problems, we propose an activity-coupled Cartesian DOA (ACCDOA)
1513 representation, which assigns a sound event activity to the length of a
1514 corresponding Cartesian DOA vector. The ACCDOA representation enables us to
1515 solve a SELD task with a single target and has two advantages: avoiding the
1516 necessity of balancing the objectives and model size increase. In experimental
1517 evaluations with the DCASE 2020 Task 3 dataset, the ACCDOA representation
1518 outperformed the two-branch representation in SELD metrics with a smaller
1519 network size. The ACCDOA-based SELD system also performed better than
1520 state-of-the-art SELD systems in terms of localization and location-dependent
1521 detection.
1522 </p>
1523 </description>
1524 <guid isPermaLink="false">oai:arXiv.org:2010.15306</guid>
1525 </item>
1526 <item>
1527 <title>DeviceTTS: A Small-Footprint, Fast, Stable Network for On-Device Text-to-Speech. (arXiv:2010.15311v1 [eess.AS])</title>
1528 <link>http://fr.arxiv.org/abs/2010.15311</link>
1529 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Huang_Z/0/1/0/all/0/1">Zhiying Huang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_H/0/1/0/all/0/1">Hao Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lei_M/0/1/0/all/0/1">Ming Lei</a></p>
1530
1531 <p>With the number of smart devices increasing, the demand for on-device
1532 text-to-speech (TTS) increases rapidly. In recent years, many prominent
1533 End-to-End TTS methods have been proposed, and have greatly improved the
1534 quality of synthesized speech. However, to ensure the qualified speech, most
1535 TTS systems depend on large and complex neural network models, and it's hard to
1536 deploy these TTS systems on-device. In this paper, a small-footprint, fast,
1537 stable network for on-device TTS is proposed, named as DeviceTTS. DeviceTTS
1538 makes use of a duration predictor as a bridge between encoder and decoder so as
1539 to avoid the problem of words skipping and repeating in Tacotron. As we all
1540 know, model size is a key factor for on-device TTS. For DeviceTTS, Deep
1541 Feedforward Sequential Memory Network (DFSMN) is used as the basic component.
1542 Moreover, to speed up inference, mix-resolution decoder is proposed for balance
1543 the inference speed and speech quality. Experiences are done with WORLD and
1544 LPCNet vocoder. Finally, with only 1.4 million model parameters and 0.099
1545 GFLOPS, DeviceTTS achieves comparable performance with Tacotron and FastSpeech.
1546 As far as we know, the DeviceTTS can meet the needs of most of the devices in
1547 practical application.
1548 </p>
1549 </description>
1550 <guid isPermaLink="false">oai:arXiv.org:2010.15311</guid>
1551 </item>
1552 <item>
1553 <title>"where is this relationship going?": Understanding Relationship Trajectories in Narrative Text. (arXiv:2010.15313v1 [cs.CL])</title>
1554 <link>http://fr.arxiv.org/abs/2010.15313</link>
1555 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+You_K/0/1/0/all/0/1">Keen You</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Goldwasser_D/0/1/0/all/0/1">Dan Goldwasser</a></p>
1556
1557 <p>We examine a new commonsense reasoning task: given a narrative describing a
1558 social interaction that centers on two protagonists, systems make inferences
1559 about the underlying relationship trajectory. Specifically, we propose two
1560 evaluation tasks: Relationship Outlook Prediction MCQ and Resolution Prediction
1561 MCQ. In Relationship Outlook Prediction, a system maps an interaction to a
1562 relationship outlook that captures how the interaction is expected to change
1563 the relationship. In Resolution Prediction, a system attributes a given
1564 relationship outlook to a particular resolution that explains the outcome.
1565 These two tasks parallel two real-life questions that people frequently ponder
1566 upon as they navigate different social situations: "where is this relationship
1567 going?" and "how did we end up here?". To facilitate the investigation of human
1568 social relationships through these two tasks, we construct a new dataset,
1569 Social Narrative Tree, which consists of 1250 stories documenting a variety of
1570 daily social interactions. The narratives encode a multitude of social elements
1571 that interweave to give rise to rich commonsense knowledge of how relationships
1572 evolve with respect to social interactions. We establish baseline performances
1573 using language models and the accuracies are significantly lower than human
1574 performance. The results demonstrate that models need to look beyond syntactic
1575 and semantic signals to comprehend complex human relationships.
1576 </p>
1577 </description>
1578 <guid isPermaLink="false">oai:arXiv.org:2010.15313</guid>
1579 </item>
1580 <item>
1581 <title>Recurrent neural circuits for contour detection. (arXiv:2010.15314v1 [cs.CV])</title>
1582 <link>http://fr.arxiv.org/abs/2010.15314</link>
1583 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Linsley_D/0/1/0/all/0/1">Drew Linsley</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_J/0/1/0/all/0/1">Junkyung Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ashok_A/0/1/0/all/0/1">Alekh Ashok</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Serre_T/0/1/0/all/0/1">Thomas Serre</a></p>
1584
1585 <p>We introduce a deep recurrent neural network architecture that approximates
1586 visual cortical circuits. We show that this architecture, which we refer to as
1587 the gamma-net, learns to solve contour detection tasks with better sample
1588 efficiency than state-of-the-art feedforward networks, while also exhibiting a
1589 classic perceptual illusion, known as the orientation-tilt illusion. Correcting
1590 this illusion significantly reduces gamma-net contour detection accuracy by
1591 driving it to prefer low-level edges over high-level object boundary contours.
1592 Overall, our study suggests that the orientation-tilt illusion is a byproduct
1593 of neural circuits that help biological visual systems achieve robust and
1594 efficient contour detection, and that incorporating these circuits in
1595 artificial neural networks can improve computer vision.
1596 </p>
1597 </description>
1598 <guid isPermaLink="false">oai:arXiv.org:2010.15314</guid>
1599 </item>
1600 <item>
1601 <title>Exploring Generative Adversarial Networks for Image-to-Image Translation in STEM Simulation. (arXiv:2010.15315v1 [cs.CV])</title>
1602 <link>http://fr.arxiv.org/abs/2010.15315</link>
1603 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lawrence_N/0/1/0/all/0/1">Nick Lawrence</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_M/0/1/0/all/0/1">Mingren Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yin_R/0/1/0/all/0/1">Ruiqi Yin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_C/0/1/0/all/0/1">Cloris Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Morgan_D/0/1/0/all/0/1">Dane Morgan</a></p>
1604
1605 <p>The use of accurate scanning transmission electron microscopy (STEM) image
1606 simulation methods require large computation times that can make their use
1607 infeasible for the simulation of many images. Other simulation methods based on
1608 linear imaging models, such as the convolution method, are much faster but are
1609 too inaccurate to be used in application. In this paper, we explore deep
1610 learning models that attempt to translate a STEM image produced by the
1611 convolution method to a prediction of the high accuracy multislice image. We
1612 then compare our results to those of regression methods. We find that using the
1613 deep learning model Generative Adversarial Network (GAN) provides us with the
1614 best results and performs at a similar accuracy level to previous regression
1615 models on the same dataset. Codes and data for this project can be found in
1616 this GitHub repository, https://github.com/uw-cmg/GAN-STEM-Conv2MultiSlice.
1617 </p>
1618 </description>
1619 <guid isPermaLink="false">oai:arXiv.org:2010.15315</guid>
1620 </item>
1621 <item>
1622 <title>Multiple Sclerosis Severity Classification From Clinical Text. (arXiv:2010.15316v1 [cs.CL])</title>
1623 <link>http://fr.arxiv.org/abs/2010.15316</link>
1624 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Costa_A/0/1/0/all/0/1">Alister D Costa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Denkovski_S/0/1/0/all/0/1">Stefan Denkovski</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Malyska_M/0/1/0/all/0/1">Michal Malyska</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Moon_S/0/1/0/all/0/1">Sae Young Moon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rufino_B/0/1/0/all/0/1">Brandon Rufino</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_Z/0/1/0/all/0/1">Zhen Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Killian_T/0/1/0/all/0/1">Taylor Killian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ghassemi_M/0/1/0/all/0/1">Marzyeh Ghassemi</a></p>
1625
1626 <p>Multiple Sclerosis (MS) is a chronic, inflammatory and degenerative
1627 neurological disease, which is monitored by a specialist using the Expanded
1628 Disability Status Scale (EDSS) and recorded in unstructured text in the form of
1629 a neurology consult note. An EDSS measurement contains an overall "EDSS" score
1630 and several functional subscores. Typically, expert knowledge is required to
1631 interpret consult notes and generate these scores. Previous approaches used
1632 limited context length Word2Vec embeddings and keyword searches to predict
1633 scores given a consult note, but often failed when scores were not explicitly
1634 stated. In this work, we present MS-BERT, the first publicly available
1635 transformer model trained on real clinical data other than MIMIC. Next, we
1636 present MSBC, a classifier that applies MS-BERT to generate embeddings and
1637 predict EDSS and functional subscores. Lastly, we explore combining MSBC with
1638 other models through the use of Snorkel to generate scores for unlabelled
1639 consult notes. MSBC achieves state-of-the-art performance on all metrics and
1640 prediction tasks and outperforms the models generated from the Snorkel
1641 ensemble. We improve Macro-F1 by 0.12 (to 0.88) for predicting EDSS and on
1642 average by 0.29 (to 0.63) for predicting functional subscores over previous
1643 Word2Vec CNN and rule-based approaches.
1644 </p>
1645 </description>
1646 <guid isPermaLink="false">oai:arXiv.org:2010.15316</guid>
1647 </item>
1648 <item>
1649 <title>The IQIYI System for Voice Conversion Challenge 2020. (arXiv:2010.15317v1 [cs.SD])</title>
1650 <link>http://fr.arxiv.org/abs/2010.15317</link>
1651 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gan_W/0/1/0/all/0/1">Wendong Gan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_H/0/1/0/all/0/1">Haitao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yan_Y/0/1/0/all/0/1">Yin Yan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_J/0/1/0/all/0/1">Jianwei Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_B/0/1/0/all/0/1">Bolong Wen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_X/0/1/0/all/0/1">Xueping Xu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_H/0/1/0/all/0/1">Hai Li</a></p>
1652
1653 <p>This paper presents the IQIYI voice conversion system (T24) for Voice
1654 Conversion 2020. In the competition, each target speaker has 70 sentences. We
1655 have built an end-to-end voice conversion system based on PPG. First, the ASR
1656 acoustic model calculates the BN feature, which represents the content-related
1657 information in the speech. Then the Mel feature is calculated through an
1658 improved prosody tacotron model. Finally, the Mel spectrum is converted to wav
1659 through an improved LPCNet. The evaluation results show that this system can
1660 achieve better voice conversion effects. In the case of using 16k rather than
1661 24k sampling rate audio, the conversion result is relatively good in
1662 naturalness and similarity. Among them, our best results are in the similarity
1663 evaluation of the Task 2, the 2nd in the ASV-based objective evaluation and the
1664 5th in the subjective evaluation.
1665 </p>
1666 </description>
1667 <guid isPermaLink="false">oai:arXiv.org:2010.15317</guid>
1668 </item>
1669 <item>
1670 <title>Gaussian Processes Model-based Control of Underactuated Balance Robots. (arXiv:2010.15320v1 [cs.RO])</title>
1671 <link>http://fr.arxiv.org/abs/2010.15320</link>
1672 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_K/0/1/0/all/0/1">Kuo Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yi_J/0/1/0/all/0/1">Jingang Yi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Song_D/0/1/0/all/0/1">Dezhen Song</a></p>
1673
1674 <p>Ranging from cart-pole systems and autonomous bicycles to bipedal robots,
1675 control of these underactuated balance robots aims to achieve both external
1676 (actuated) subsystem trajectory tracking and internal (unactuated) subsystem
1677 balancing tasks with limited actuation authority. This paper proposes a
1678 learning model-based control framework for underactuated balance robots. The
1679 key idea to simultaneously achieve tracking and balancing tasks is to design
1680 control strategies in slow- and fast-time scales, respectively. In slow-time
1681 scale, model predictive control (MPC) is used to generate the desired internal
1682 subsystem trajectory that encodes the external subsystem tracking performance
1683 and control input. In fast-time scale, the actual internal trajectory is
1684 stabilized to the desired internal trajectory by using an inverse dynamics
1685 controller. The coupling effects between the external and internal subsystems
1686 are captured through the planned internal trajectory profile and the dual
1687 structural properties of the robotic systems. The control design is based on
1688 Gaussian processes (GPs) regression model that are learned from experiments
1689 without need of priori knowledge about the robot dynamics nor successful
1690 balance demonstration. The GPs provide estimates of modeling uncertainties of
1691 the robotic systems and these uncertainty estimations are incorporated in the
1692 MPC design to enhance the control robustness to modeling errors. The
1693 learning-based control design is analyzed with guaranteed stability and
1694 performance. The proposed design is demonstrated by experiments on a Furuta
1695 pendulum and an autonomous bikebot.
1696 </p>
1697 </description>
1698 <guid isPermaLink="false">oai:arXiv.org:2010.15320</guid>
1699 </item>
1700 <item>
1701 <title>Improvement of EAST Data Acquisition Configuration Management. (arXiv:2010.15322v1 [physics.ins-det])</title>
1702 <link>http://fr.arxiv.org/abs/2010.15322</link>
1703 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Ying_C/0/1/0/all/0/1">Chen Ying</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Shi_L/0/1/0/all/0/1">Li Shi</a></p>
1704
1705 <p>The data acquisition console is an important component of the EAST data
1706 acquisition system which provides unified data acquisition and long-term data
1707 storage for diagnostics. The data acquisition console is used to manage the
1708 data acquisition configuration information and control the data acquisition
1709 workflow. The data acquisition console has been developed many years, and with
1710 increasing of data acquisition nodes and emergence of new control nodes, the
1711 function of configuration management has become inadequate. It is going to
1712 update the configuration management function of data acquisition console. The
1713 upgraded data acquisition console based on LabVIEW should be oriented to the
1714 data acquisition administrator, with the functions of managing data acquisition
1715 nodes, managing control nodes, setting and publishing configuration parameters,
1716 batch management, database backup, monitoring the status of data acquisition
1717 nodes, controlling the data acquisition workflow, and shot simulation data
1718 acquisition test. The upgraded data acquisition console has been designed and
1719 under testing recently.
1720 </p>
1721 </description>
1722 <guid isPermaLink="false">oai:arXiv.org:2010.15322</guid>
1723 </item>
1724 <item>
1725 <title>Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth. (arXiv:2010.15327v1 [cs.LG])</title>
1726 <link>http://fr.arxiv.org/abs/2010.15327</link>
1727 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_T/0/1/0/all/0/1">Thao Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Raghu_M/0/1/0/all/0/1">Maithra Raghu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kornblith_S/0/1/0/all/0/1">Simon Kornblith</a></p>
1728
1729 <p>A key factor in the success of deep neural networks is the ability to scale
1730 models to improve performance by varying the architecture depth and width. This
1731 simple property of neural network design has resulted in highly effective
1732 architectures for a variety of tasks. Nevertheless, there is limited
1733 understanding of effects of depth and width on the learned representations. In
1734 this paper, we study this fundamental question. We begin by investigating how
1735 varying depth and width affects model hidden representations, finding a
1736 characteristic block structure in the hidden representations of larger capacity
1737 (wider or deeper) models. We demonstrate that this block structure arises when
1738 model capacity is large relative to the size of the training set, and is
1739 indicative of the underlying layers preserving and propagating the dominant
1740 principal component of their representations. This discovery has important
1741 ramifications for features learned by different models, namely, representations
1742 outside the block structure are often similar across architectures with varying
1743 widths and depths, but the block structure is unique to each model. We analyze
1744 the output predictions of different model architectures, finding that even when
1745 the overall accuracy is similar, wide and deep models exhibit distinctive error
1746 patterns and variations across classes.
1747 </p>
1748 </description>
1749 <guid isPermaLink="false">oai:arXiv.org:2010.15327</guid>
1750 </item>
1751 <item>
1752 <title>Scalable Attack-Resistant Obfuscation of Logic Circuits. (arXiv:2010.15329v1 [cs.CR])</title>
1753 <link>http://fr.arxiv.org/abs/2010.15329</link>
1754 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Alaql_A/0/1/0/all/0/1">Abdulrahman Alaql</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bhunia_S/0/1/0/all/0/1">Swarup Bhunia</a></p>
1755
1756 <p>Hardware IP protection has been one of the most critical areas of research in
1757 the past years. Recently, attacks on hardware IPs (such as reverse engineering
1758 or cloning) have evolved as attackers have developed sophisticated techniques.
1759 Therefore, hardware obfuscation has been introduced as a powerful tool to
1760 protect IPs against piracy attacks. However, many recent attempts to break
1761 existing obfuscation methods have been successful in unlocking the IP and
1762 restoring its functionality. In this paper, we propose SARO, a Scalable
1763 Attack-Resistant Obfuscation that provides a robust functional and structural
1764 design transformation process. SARO treats the target circuit as a graph, and
1765 performs a partitioning algorithm to produce a set of sub-graphs, then applies
1766 our novel Truth Table Transformation (T3) process to each partition. We also
1767 propose the $T3_{metric}$, which is developed to quantify the structural and
1768 functional design transformation level caused by the obfuscation process. We
1769 evaluate SARO on ISCAS85 and EPFL benchmarks, and provide full security and
1770 performance analysis of our proposed framework.
1771 </p>
1772 </description>
1773 <guid isPermaLink="false">oai:arXiv.org:2010.15329</guid>
1774 </item>
1775 <item>
1776 <title>Learning Sampling Distributions Using Local 3D Workspace Decompositions for Motion Planning in High Dimensions. (arXiv:2010.15335v1 [cs.RO])</title>
1777 <link>http://fr.arxiv.org/abs/2010.15335</link>
1778 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chamzas_C/0/1/0/all/0/1">Constantinos Chamzas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kingston_Z/0/1/0/all/0/1">Zachary Kingston</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Quintero_Pena_C/0/1/0/all/0/1">Carlos Quintero-Pe&#xf1;a</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shrivastava_A/0/1/0/all/0/1">Anshumali Shrivastava</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kavraki_L/0/1/0/all/0/1">Lydia E. Kavraki</a></p>
1779
1780 <p>Earlier work has shown that reusing experience from prior motion planning
1781 problems can improve the efficiency of similar, future motion planning queries.
1782 However, for robots with many degrees-of-freedom, these methods exhibit poor
1783 generalization across different environments and often require large datasets
1784 that are impractical to gather. We present SPARK and FLAME , two
1785 experience-based frameworks for sampling-based planning applicable to complex
1786 manipulators in 3 D environments. Both combine samplers associated with
1787 features from a workspace decomposition into a global biased sampling
1788 distribution. SPARK decomposes the environment based on exact geometry while
1789 FLAME is more general, and uses an octree-based decomposition obtained from
1790 sensor data. We demonstrate the effectiveness of SPARK and FLAME on a Fetch
1791 robot tasked with challenging pick-and-place manipulation problems. Our
1792 approaches can be trained incrementally and significantly improve performance
1793 with only a handful of examples, generalizing better over diverse tasks and
1794 environments as compared to prior approaches.
1795 </p>
1796 </description>
1797 <guid isPermaLink="false">oai:arXiv.org:2010.15335</guid>
1798 </item>
1799 <item>
1800 <title>SAR-NAS: Skeleton-based Action Recognition via Neural Architecture Searching. (arXiv:2010.15336v1 [cs.CV])</title>
1801 <link>http://fr.arxiv.org/abs/2010.15336</link>
1802 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_H/0/1/0/all/0/1">Haoyuan Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hou_Y/0/1/0/all/0/1">Yonghong Hou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_P/0/1/0/all/0/1">Pichao Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_Z/0/1/0/all/0/1">Zihui Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_W/0/1/0/all/0/1">Wanqing Li</a></p>
1803
1804 <p>This paper presents a study of automatic design of neural network
1805 architectures for skeleton-based action recognition. Specifically, we encode a
1806 skeleton-based action instance into a tensor and carefully define a set of
1807 operations to build two types of network cells: normal cells and reduction
1808 cells. The recently developed DARTS (Differentiable Architecture Search) is
1809 adopted to search for an effective network architecture that is built upon the
1810 two types of cells. All operations are 2D based in order to reduce the overall
1811 computation and search space. Experiments on the challenging NTU RGB+D and
1812 Kinectics datasets have verified that most of the networks developed to date
1813 for skeleton-based action recognition are likely not compact and efficient. The
1814 proposed method provides an approach to search for such a compact network that
1815 is able to achieve comparative or even better performance than the
1816 state-of-the-art methods.
1817 </p>
1818 </description>
1819 <guid isPermaLink="false">oai:arXiv.org:2010.15336</guid>
1820 </item>
1821 <item>
1822 <title>A New "Model-Free" Method Combined with Neural Network for MIMO Systems. (arXiv:2010.15338v1 [eess.SY])</title>
1823 <link>http://fr.arxiv.org/abs/2010.15338</link>
1824 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_F/0/1/0/all/0/1">Feilong Zhang</a></p>
1825
1826 <p>In this brief, a model-free adaptive predictive control (MFAPC) is proposed.
1827 It outperforms the current model-free adaptive control (MFAC) for not only
1828 solving the time delay problem in multiple-input multiple-output (MIMO) systems
1829 but also relaxing the current rigorous assumptions for sake of a wider
1830 applicable range. The most attractive merit of the proposed controller is that
1831 the controller design, performance analysis and applications are easy for
1832 engineers to realize. Furthermore, the problem of how to choose the matrix
1833 {\lambda} is finished by analyzing the function of the closed-loop poles rather
1834 than the previous contraction mapping method. Additionally, in view of the
1835 nonlinear modeling capability and adaptability of neural networks (NNs), we
1836 combine these two classes of algorithms together. The feasibility and several
1837 interesting results of the proposed method are shown in simulations.
1838 </p>
1839 </description>
1840 <guid isPermaLink="false">oai:arXiv.org:2010.15338</guid>
1841 </item>
1842 <item>
1843 <title>Identifying safe intersection design through unsupervised feature extraction from satellite imagery. (arXiv:2010.15343v1 [cs.CV])</title>
1844 <link>http://fr.arxiv.org/abs/2010.15343</link>
1845 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wijnands_J/0/1/0/all/0/1">Jasper S. Wijnands</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_H/0/1/0/all/0/1">Haifeng Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nice_K/0/1/0/all/0/1">Kerry A. Nice</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Thompson_J/0/1/0/all/0/1">Jason Thompson</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Scully_K/0/1/0/all/0/1">Katherine Scully</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_J/0/1/0/all/0/1">Jingqiu Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stevenson_M/0/1/0/all/0/1">Mark Stevenson</a></p>
1846
1847 <p>The World Health Organization has listed the design of safer intersections as
1848 a key intervention to reduce global road trauma. This article presents the
1849 first study to systematically analyze the design of all intersections in a
1850 large country, based on aerial imagery and deep learning. Approximately 900,000
1851 satellite images were downloaded for all intersections in Australia and
1852 customized computer vision techniques emphasized the road infrastructure. A
1853 deep autoencoder extracted high-level features, including the intersection's
1854 type, size, shape, lane markings, and complexity, which were used to cluster
1855 similar designs. An Australian telematics data set linked infrastructure design
1856 to driving behaviors captured during 66 million kilometers of driving. This
1857 showed more frequent hard acceleration events (per vehicle) at four- than
1858 three-way intersections, relatively low hard deceleration frequencies at
1859 T-intersections, and consistently low average speeds on roundabouts. Overall,
1860 domain-specific feature extraction enabled the identification of infrastructure
1861 improvements that could result in safer driving behaviors, potentially reducing
1862 road trauma.
1863 </p>
1864 </description>
1865 <guid isPermaLink="false">oai:arXiv.org:2010.15343</guid>
1866 </item>
1867 <item>
1868 <title>Sea-Net: Squeeze-And-Excitation Attention Net For Diabetic Retinopathy Grading. (arXiv:2010.15344v1 [cs.CV])</title>
1869 <link>http://fr.arxiv.org/abs/2010.15344</link>
1870 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_Z/0/1/0/all/0/1">Ziyuan Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chopra_K/0/1/0/all/0/1">Kartik Chopra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zeng_Z/0/1/0/all/0/1">Zeng Zeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_X/0/1/0/all/0/1">Xiaoli Li</a></p>
1871
1872 <p>Diabetes is one of the most common disease in individuals. \textit{Diabetic
1873 retinopathy} (DR) is a complication of diabetes, which could lead to blindness.
1874 Automatic DR grading based on retinal images provides a great diagnostic and
1875 prognostic value for treatment planning. However, the subtle differences among
1876 severity levels make it difficult to capture important features using
1877 conventional methods. To alleviate the problems, a new deep learning
1878 architecture for robust DR grading is proposed, referred to as SEA-Net, in
1879 which, spatial attention and channel attention are alternatively carried out
1880 and boosted with each other, improving the classification performance. In
1881 addition, a hybrid loss function is proposed to further maximize the
1882 inter-class distance and reduce the intra-class variability. Experimental
1883 results have shown the effectiveness of the proposed architecture.
1884 </p>
1885 </description>
1886 <guid isPermaLink="false">oai:arXiv.org:2010.15344</guid>
1887 </item>
1888 <item>
1889 <title>Developing Augmented Reality based Gaming Model to Teach Ethical Education in Primary Schools. (arXiv:2010.15346v1 [cs.CY])</title>
1890 <link>http://fr.arxiv.org/abs/2010.15346</link>
1891 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ali_M/0/1/0/all/0/1">Mohammad Ali</a></p>
1892
1893 <p>Education sector is adopting new technologies for both teaching and learning
1894 pedagogy. Augmented Reality (AR) is a new technology that can be used in the
1895 educational pedagogy to enhance the engagement with students. Students interact
1896 with AR-based educational material for more visualization and explanation.
1897 Therefore, the use of AR in education is becoming more popular. However, most
1898 researches narrate the use of AR technologies in the field of English, Maths,
1899 Science, Culture, Arts, and History education but the absence of ethical
1900 education is visible. In our paper, we design the system and develop an
1901 AR-based mobile game model in the field of Ethical education for pre-primary
1902 students. Students from pre-primary require more interactive lessons than
1903 theoretical concepts. So, we use AR technology to develop a game which offers
1904 interactive procedures where students can learn with fun and engage with the
1905 context. Finally, we develop a prototype that works with our research
1906 objective. We conclude our paper with future works.
1907 </p>
1908 </description>
1909 <guid isPermaLink="false">oai:arXiv.org:2010.15346</guid>
1910 </item>
1911 <item>
1912 <title>Distance Invariant Sparse Autoencoder for Wireless Signal Strength Mapping. (arXiv:2010.15347v1 [eess.SP])</title>
1913 <link>http://fr.arxiv.org/abs/2010.15347</link>
1914 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Miyagusuku_R/0/1/0/all/0/1">Renato Miyagusuku</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ozaki_K/0/1/0/all/0/1">Koichi Ozaki</a></p>
1915
1916 <p>Wireless signal strength based localization can enable robust localization
1917 for robots using inexpensive sensors. For this, a location-to-signal-strength
1918 map has to be learned for each access point in the environment. Due to the
1919 ubiquity of Wireless networks in most environments, this can result in tens or
1920 hundreds of maps. To reduce the dimensionality of this problem, we employ
1921 autoencoders, which are a popular unsupervised approach for feature extraction
1922 and data compression. In particular, we propose the use of sparse autoencoders
1923 that learn latent spaces that preserve the relative distance between inputs.
1924 Distance invariance between input and latent spaces allows our system to
1925 successfully learn compact representations that allow precise data
1926 reconstruction but also have a low impact on localization performance when
1927 using maps from the latent space rather than the input space. We demonstrate
1928 the feasibility of our approach by performing experiments in outdoor
1929 environments.
1930 </p>
1931 </description>
1932 <guid isPermaLink="false">oai:arXiv.org:2010.15347</guid>
1933 </item>
1934 <item>
1935 <title>A Hybrid Position/Force Controller for Joint Robots. (arXiv:2010.15350v1 [cs.RO])</title>
1936 <link>http://fr.arxiv.org/abs/2010.15350</link>
1937 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_S/0/1/0/all/0/1">Shengwen Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_J/0/1/0/all/0/1">Juan Ren</a></p>
1938
1939 <p>In this paper, we present a hybrid position/force controller for operating
1940 joint robots. The hybrid controller has two goals---motion tracking and force
1941 regulating. As long as these two goals are not mutually exclusive, they can be
1942 decoupled in some way. In this work, we make use of the smooth and invertible
1943 mapping from joint space to task space to decouple the two control goals and
1944 design controllers separately. The traditional motion controller in task space
1945 is used for motion control, while the force controller is designed through
1946 manipulating the desired trajectory to regulate the force indirectly. Two case
1947 studies---contour tracking/polishing surfaces and grabbing boxes with two
1948 robotic arms---are presented to show the efficacy of the hybrid controller, and
1949 simulations with physics engines are carried out to validate the efficacy of
1950 the proposed method.
1951 </p>
1952 </description>
1953 <guid isPermaLink="false">oai:arXiv.org:2010.15350</guid>
1954 </item>
1955 <item>
1956 <title>An automated and multi-parametric algorithm for objective analysis of meibography images. (arXiv:2010.15352v1 [eess.IV])</title>
1957 <link>http://fr.arxiv.org/abs/2010.15352</link>
1958 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Xiao_P/0/1/0/all/0/1">Peng Xiao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Luo_Z/0/1/0/all/0/1">Zhongzhou Luo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Deng_Y/0/1/0/all/0/1">Yuqing Deng</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_G/0/1/0/all/0/1">Gengyuan Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Yuan_J/0/1/0/all/0/1">Jin Yuan</a></p>
1959
1960 <p>Meibography is a non-contact imaging technique used by ophthalmologists to
1961 assist in the evaluation and diagnosis of meibomian gland dysfunction (MGD).
1962 While artificial qualitative analysis of meibography images could lead to low
1963 repeatability and efficiency and multi-parametric analysis is demanding to
1964 offer more comprehensive information in discovering subtle changes of meibomian
1965 glands during MGD progression, we developed an automated and multi-parametric
1966 algorithm for objective and quantitative analysis of meibography images. The
1967 full architecture of the algorithm can be divided into three steps: (1)
1968 segmentation of the tarsal conjunctiva area as the region of interest (ROI);
1969 (2) segmentation and identification of glands within the ROI; and (3)
1970 quantitative multi-parametric analysis including newly defined gland diameter
1971 deformation index (DI), gland tortuosity index (TI), and glands signal index
1972 (SI). To evaluate the performance of the automated algorithm, the similarity
1973 index (k) and the segmentation error including the false positive rate (r_P)
1974 and the false negative rate (r_N) are calculated between the manually defined
1975 ground truth and the automatic segmentations of both the ROI and meibomian
1976 glands of 15 typical meibography images. The feasibility of the algorithm is
1977 demonstrated in analyzing typical meibograhy images.
1978 </p>
1979 </description>
1980 <guid isPermaLink="false">oai:arXiv.org:2010.15352</guid>
1981 </item>
1982 <item>
1983 <title>Domain decomposition and partitioning methods for mixed finite element discretizations of the Biot system of poroelasticity. (arXiv:2010.15353v1 [math.NA])</title>
1984 <link>http://fr.arxiv.org/abs/2010.15353</link>
1985 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Jayadharan_M/0/1/0/all/0/1">Manu Jayadharan</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Khattatov_E/0/1/0/all/0/1">Eldar Khattatov</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Yotov_I/0/1/0/all/0/1">Ivan Yotov</a></p>
1986
1987 <p>We develop non-overlapping domain decomposition methods for the Biot system
1988 of poroelasticity in a mixed form. The solid deformation is modeled with a
1989 mixed three-field formulation with weak stress symmetry. The fluid flow is
1990 modeled with a mixed Darcy formulation. We introduce displacement and pressure
1991 Lagrange multipliers on the subdomain interfaces to impose weakly continuity of
1992 normal stress and normal velocity, respectively. The global problem is reduced
1993 to an interface problem for the Lagrange multipliers, which is solved by a
1994 Krylov space iterative method. We study both monolithic and split methods. In
1995 the monolithic method, a coupled displacement-pressure interface problem is
1996 solved, with each iteration requiring the solution of local Biot problems. We
1997 show that the resulting interface operator is positive definite and analyze the
1998 convergence of the iteration. We further study drained split and fixed stress
1999 Biot splittings, in which case we solve separate interface problems requiring
2000 elasticity and Darcy solves. We analyze the stability of the split
2001 formulations. Numerical experiments are presented to illustrate the convergence
2002 of the domain decomposition methods and compare their accuracy and efficiency.
2003 </p>
2004 </description>
2005 <guid isPermaLink="false">oai:arXiv.org:2010.15353</guid>
2006 </item>
2007 <item>
2008 <title>Reconfigurable Intelligent Surface Aided Secure Transmission: Outage-Constrained Energy-Efficiency Maximization. (arXiv:2010.15354v1 [cs.IT])</title>
2009 <link>http://fr.arxiv.org/abs/2010.15354</link>
2010 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Z/0/1/0/all/0/1">Zongze Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shuai Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_M/0/1/0/all/0/1">Miaowen Wen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Y/0/1/0/all/0/1">Yik-Chung Wu</a></p>
2011
2012 <p>Reconfigurable intelligent surface (RIS) has the potential to significantly
2013 enhance the network secure transmission performance by reconfiguring the
2014 wireless propagation environment. However, due to the passive nature of
2015 eavesdroppers and the cascaded channel brought by the RIS, the eavesdroppers'
2016 channel state information is imperfectly obtained at the base station. Under
2017 the channel uncertainty, the optimal phase-shift, power allocation, and
2018 transmission rate design for secure transmission is currently unknown due to
2019 the difficulty of handling the probabilistic constraint with coupled variables.
2020 To fill this gap, this paper formulates a problem of energy-efficient secure
2021 transmission design while incorporating the probabilistic constraint. By
2022 transforming the probabilistic constraint and decoupling variables, the secure
2023 energy efficiency maximization problem can be solved via alternatively
2024 executing difference-of-convex programming and semidefinite relaxation
2025 technique. To scale the solution to massive antennas and reflecting elements
2026 scenario, a fast first-order algorithm with low complexity is further proposed.
2027 Simulation results show that the proposed first-order algorithm achieves
2028 identical performance to the conventional method but saves at least two orders
2029 of magnitude in computation time. Moreover, the resultant RIS aided secure
2030 transmission significantly improves the energy efficiency compared to baseline
2031 schemes of random phase-shift, fixed phase-shift, and RIS ignoring CSI
2032 uncertainty.
2033 </p>
2034 </description>
2035 <guid isPermaLink="false">oai:arXiv.org:2010.15354</guid>
2036 </item>
2037 <item>
2038 <title>Financial ticket intelligent recognition system based on deep learning. (arXiv:2010.15356v1 [cs.LG])</title>
2039 <link>http://fr.arxiv.org/abs/2010.15356</link>
2040 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_F/0/1/0/all/0/1">Fukang Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_H/0/1/0/all/0/1">Haiyu Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_B/0/1/0/all/0/1">Bo Xu</a></p>
2041
2042 <p>Facing the rapid growth in the issuance of financial tickets (or bills,
2043 invoices etc.), traditional manual invoice reimbursement and financial
2044 accounting system are imposing an increasing burden on financial accountants
2045 and consuming excessive manpower. To solve this problem, we proposes an
2046 iterative self-learning Framework of Financial Ticket intelligent Recognition
2047 System (FFTRS), which can support the fast iterative updating and extensibility
2048 of the algorithm model, which are the fundamental requirements for a practical
2049 financial accounting system. In addition, we designed a simple yet efficient
2050 Financial Ticket Faster Detection network (FTFDNet) and an intelligent data
2051 warehouse of financial ticket are designed to strengthen its efficiency and
2052 performance. At present, the system can recognize 194 kinds of financial
2053 tickets and has an automatic iterative optimization mechanism, which means,
2054 with the increase of application time, the types of tickets supported by the
2055 system will continue to increase, and the accuracy of recognition will continue
2056 to improve. Experimental results show that the average recognition accuracy of
2057 the system is 97.07%, and the average running time for a single ticket is
2058 175.67ms. The practical value of the system has been tested in a commercial
2059 application, which makes a beneficial attempt for the deep learning technology
2060 in financial accounting work.
2061 </p>
2062 </description>
2063 <guid isPermaLink="false">oai:arXiv.org:2010.15356</guid>
2064 </item>
2065 <item>
2066 <title>A stochastic optimization algorithm for analyzing planar central and balanced configurations in the $n$-body problem. (arXiv:2010.15358v1 [math.DS])</title>
2067 <link>http://fr.arxiv.org/abs/2010.15358</link>
2068 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Doicu_A/0/1/0/all/0/1">Alexandru Doicu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Zhao_L/0/1/0/all/0/1">Lei Zhao</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Doicu_A/0/1/0/all/0/1">Adrian Doicu</a></p>
2069
2070 <p>A stochastic optimization algorithm for analyzing planar central and balanced
2071 configurations in the $n$-body problem is presented. We find a comprehensive
2072 list of equal mass central configurations satisfying the Morse equality up to
2073 $n=12$. We show some exemplary balanced configurations in the case $n=5$, as
2074 well as some balanced configurations without any axis of symmetry in the cases
2075 $n=4$ and $n=10$.
2076 </p>
2077 </description>
2078 <guid isPermaLink="false">oai:arXiv.org:2010.15358</guid>
2079 </item>
2080 <item>
2081 <title>Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection. (arXiv:2010.15360v1 [cs.CL])</title>
2082 <link>http://fr.arxiv.org/abs/2010.15360</link>
2083 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shaolei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zhongyuan Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Che_W/0/1/0/all/0/1">Wanxiang Che</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_T/0/1/0/all/0/1">Ting Liu</a></p>
2084
2085 <p>Most existing approaches to disfluency detection heavily rely on
2086 human-annotated corpora, which is expensive to obtain in practice. There have
2087 been several proposals to alleviate this issue with, for instance,
2088 self-supervised learning techniques, but they still require human-annotated
2089 corpora. In this work, we explore the unsupervised learning paradigm which can
2090 potentially work with unlabeled text corpora that are cheaper and easier to
2091 obtain. Our model builds upon the recent work on Noisy Student Training, a
2092 semi-supervised learning approach that extends the idea of self-training.
2093 Experimental results on the commonly used English Switchboard test set show
2094 that our approach achieves competitive performance compared to the previous
2095 state-of-the-art supervised systems using contextualized word embeddings (e.g.
2096 BERT and ELECTRA).
2097 </p>
2098 </description>
2099 <guid isPermaLink="false">oai:arXiv.org:2010.15360</guid>
2100 </item>
2101 <item>
2102 <title>Model-Agnostic Counterfactual Reasoning for Eliminating Popularity Bias in Recommender System. (arXiv:2010.15363v1 [cs.IR])</title>
2103 <link>http://fr.arxiv.org/abs/2010.15363</link>
2104 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_T/0/1/0/all/0/1">Tianxin Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_F/0/1/0/all/0/1">Fuli Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_J/0/1/0/all/0/1">Jiawei Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shi_C/0/1/0/all/0/1">Chufeng Shi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Z/0/1/0/all/0/1">Ziwei Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yi_J/0/1/0/all/0/1">Jinfeng Yi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_X/0/1/0/all/0/1">Xiangnan He</a></p>
2105
2106 <p>The general aim of the recommender system is to provide personalized
2107 suggestions to users, which is opposed to suggesting popular items. However,
2108 the normal training paradigm, i.e., fitting a recommender model to recover the
2109 user behavior data with pointwise or pairwise loss, makes the model biased
2110 towards popular items. This results in the terrible Matthew effect, making
2111 popular items be more frequently recommended and become even more popular.
2112 Existing work addresses this issue with Inverse Propensity Weighting (IPW),
2113 which decreases the impact of popular items on the training and increases the
2114 impact of long-tail items. Although theoretically sound, IPW methods are highly
2115 sensitive to the weighting strategy, which is notoriously difficult to tune.
2116 </p>
2117 <p>In this work, we explore the popularity bias issue from a novel and
2118 fundamental perspective -- cause-effect. We identify that popularity bias lies
2119 in the direct effect from the item node to the ranking score, such that an
2120 item's intrinsic property is the cause of mistakenly assigning it a higher
2121 ranking score. To eliminate popularity bias, it is essential to answer the
2122 counterfactual question that what the ranking score would be if the model only
2123 uses item property. To this end, we formulate a causal graph to describe the
2124 important cause-effect relations in the recommendation process. During
2125 training, we perform multi-task learning to achieve the contribution of each
2126 cause; during testing, we perform counterfactual inference to remove the effect
2127 of item popularity. Remarkably, our solution amends the learning process of
2128 recommendation which is agnostic to a wide range of models. We demonstrate it
2129 on Matrix Factorization (MF) and LightGCN, which are representative of the
2130 conventional and state-of-the-art model for collaborative filtering.
2131 Experiments on five real-world datasets demonstrate the effectiveness of our
2132 method.
2133 </p>
2134 </description>
2135 <guid isPermaLink="false">oai:arXiv.org:2010.15363</guid>
2136 </item>
2137 <item>
2138 <title>Online State-Time Trajectory Planning Using Timed-ESDF in Highly Dynamic Environments. (arXiv:2010.15364v1 [cs.RO])</title>
2139 <link>http://fr.arxiv.org/abs/2010.15364</link>
2140 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_D/0/1/0/all/0/1">Delong Zhu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_T/0/1/0/all/0/1">Tong Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_J/0/1/0/all/0/1">Jiahui Lin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fang_Y/0/1/0/all/0/1">Yuqi Fang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Meng_M/0/1/0/all/0/1">Max Q.-H. Meng</a></p>
2141
2142 <p>Online state-time trajectory planning in highly dynamic environments remains
2143 an unsolved problem due to the unpredictable motions of moving obstacles and
2144 the curse of dimensionality from the state-time space. Existing state-time
2145 planners are typically implemented based on randomized sampling approaches or
2146 path searching on discretized state graph. The smoothness, path clearance, and
2147 planning efficiency of these planners are usually not satisfying. In this work,
2148 we propose a gradient-based planner over the state-time space for online
2149 trajectory generation in highly dynamic environments. To enable the
2150 gradient-based optimization, we propose a Timed-ESDT that supports distance and
2151 gradient queries with state-time keys. Based on the Timed-ESDT, we also define
2152 a smooth prior and an obstacle likelihood function that is compatible with the
2153 state-time space. The trajectory planning is then formulated to a MAP problem
2154 and solved by an efficient numerical optimizer. Moreover, to improve the
2155 optimality of the planner, we also define a state-time graph and then conduct
2156 path searching on it to find a better initialization for the optimizer. By
2157 integrating the graph searching, the planning quality is significantly
2158 improved. Experiment results on simulated and benchmark datasets show that our
2159 planner can outperform the state-of-the-art methods, demonstrating its
2160 significant advantages over the traditional ones.
2161 </p>
2162 </description>
2163 <guid isPermaLink="false">oai:arXiv.org:2010.15364</guid>
2164 </item>
2165 <item>
2166 <title>Infinite Time Solutions of Numerical Schemes for Advection Problems. (arXiv:2010.15365v1 [math.NA])</title>
2167 <link>http://fr.arxiv.org/abs/2010.15365</link>
2168 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Biswas_A/0/1/0/all/0/1">Abhijit Biswas</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Seibold_B/0/1/0/all/0/1">Benjamin Seibold</a></p>
2169
2170 <p>This paper addresses the question whether there are numerical schemes for
2171 constant-coefficient advection problems that can yield convergent solutions for
2172 an infinite time horizon. The motivation is that such methods may serve as
2173 building blocks for long-time accurate solutions in more complex
2174 advection-dominated problems. After establishing a new notion of convergence in
2175 an infinite time limit of numerical methods, we first show that linear methods
2176 cannot meet this convergence criterion. Then we present a new numerical
2177 methodology, based on a nonlinear jet scheme framework. We show that these
2178 methods do satisfy the new convergence criterion, thus establishing that
2179 numerical methods exist that converge on an infinite time horizon, and
2180 demonstrate the long-time accuracy gains incurred by this property.
2181 </p>
2182 </description>
2183 <guid isPermaLink="false">oai:arXiv.org:2010.15365</guid>
2184 </item>
2185 <item>
2186 <title>Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation. (arXiv:2010.15366v1 [cs.SD])</title>
2187 <link>http://fr.arxiv.org/abs/2010.15366</link>
2188 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_S/0/1/0/all/0/1">Sung-Feng Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chuang_S/0/1/0/all/0/1">Shun-Po Chuang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_D/0/1/0/all/0/1">Da-Rong Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yi-Chen Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_G/0/1/0/all/0/1">Gene-Ping Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_H/0/1/0/all/0/1">Hung-yi Lee</a></p>
2189
2190 <p>Speech separation has been well-developed while there are still problems
2191 waiting to be solved. The main problem we focus on in this paper is the
2192 frequent label permutation switching of permutation invariant training (PIT).
2193 For N-speaker separation, there would be N! possible label permutations. How to
2194 stably select correct label permutations is a long-standing problem. In this
2195 paper, we utilize self-supervised pre-training to stabilize the label
2196 permutations. Among several types of self-supervised tasks, speech enhancement
2197 based pre-training tasks show significant effectiveness in our experiments.
2198 When using off-the-shelf pre-trained models, training duration could be
2199 shortened to one-third to two-thirds. Furthermore, even taking pre-training
2200 time into account, the entire training process could still be shorter without a
2201 performance drop when using a larger batch size.
2202 </p>
2203 </description>
2204 <guid isPermaLink="false">oai:arXiv.org:2010.15366</guid>
2205 </item>
2206 <item>
2207 <title>Learning Centric Wireless Resource Allocation for Edge Computing: Algorithm and Experiment. (arXiv:2010.15371v1 [cs.IT])</title>
2208 <link>http://fr.arxiv.org/abs/2010.15371</link>
2209 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_L/0/1/0/all/0/1">Liangkai Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hong_Y/0/1/0/all/0/1">Yuncong Hong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shuai Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Han_R/0/1/0/all/0/1">Ruihua Han</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_D/0/1/0/all/0/1">Dachuan Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_R/0/1/0/all/0/1">Rui Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hao_Q/0/1/0/all/0/1">Qi Hao</a></p>
2210
2211 <p>Edge intelligence is an emerging network architecture that integrates
2212 sensing, communication, computing components, and supports various machine
2213 learning applications, where a fundamental communication question is: how to
2214 allocate the limited wireless resources (such as time, energy) to the
2215 simultaneous model training of heterogeneous learning tasks? Existing methods
2216 ignore two important facts: 1) different models have heterogeneous demands on
2217 training data; 2) there is a mismatch between the simulated environment and the
2218 real-world environment. As a result, they could lead to low learning
2219 performance in practice. This paper proposes the learning centric wireless
2220 resource allocation (LCWRA) scheme that maximizes the worst learning
2221 performance of multiple classification tasks. Analysis shows that the optimal
2222 transmission time has an inverse power relationship with respect to the
2223 classification error. Finally, both simulation and experimental results are
2224 provided to verify the performance of the proposed LCWRA scheme and its
2225 robustness in real implementation.
2226 </p>
2227 </description>
2228 <guid isPermaLink="false">oai:arXiv.org:2010.15371</guid>
2229 </item>
2230 <item>
2231 <title>Learning Personalized Discretionary Lane-Change Initiation for Fully Autonomous Driving Based on Reinforcement Learning. (arXiv:2010.15372v1 [cs.HC])</title>
2232 <link>http://fr.arxiv.org/abs/2010.15372</link>
2233 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Z/0/1/0/all/0/1">Zhuoxi Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zheng Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_B/0/1/0/all/0/1">Bo Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nakano_K/0/1/0/all/0/1">Kimihiko Nakano</a></p>
2234
2235 <p>In this article, the authors present a novel method to learn the personalized
2236 tactic of discretionary lane-change initiation for fully autonomous vehicles
2237 through human-computer interactions. Instead of learning from human-driving
2238 demonstrations, a reinforcement learning technique is employed to learn how to
2239 initiate lane changes from traffic context, the action of a self-driving
2240 vehicle, and in-vehicle user feedback. The proposed offline algorithm rewards
2241 the action-selection strategy when the user gives positive feedback and
2242 penalizes it when negative feedback. Also, a multi-dimensional driving scenario
2243 is considered to represent a more realistic lane-change trade-off. The results
2244 show that the lane-change initiation model obtained by this method can
2245 reproduce the personal lane-change tactic, and the performance of the
2246 customized models (average accuracy 86.1%) is much better than that of the
2247 non-customized models (average accuracy 75.7%). This method allows continuous
2248 improvement of customization for users during fully autonomous driving even
2249 without human-driving experience, which will significantly enhance the user
2250 acceptance of high-level autonomy of self-driving vehicles.
2251 </p>
2252 </description>
2253 <guid isPermaLink="false">oai:arXiv.org:2010.15372</guid>
2254 </item>
2255 <item>
2256 <title>Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth. (arXiv:2010.15376v1 [eess.SP])</title>
2257 <link>http://fr.arxiv.org/abs/2010.15376</link>
2258 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Chen_W/0/1/0/all/0/1">Wei Chen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_B/0/1/0/all/0/1">Bowen Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jin_S/0/1/0/all/0/1">Shi Jin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ai_B/0/1/0/all/0/1">Bo Ai</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhong_Z/0/1/0/all/0/1">Zhangdui Zhong</a></p>
2259
2260 <p>Sparse signal recovery problems from noisy linear measurements appear in many
2261 areas of wireless communications. In recent years, deep learning (DL) based
2262 approaches have attracted interests of researchers to solve the sparse linear
2263 inverse problem by unfolding iterative algorithms as neural networks.
2264 Typically, research concerning DL assume a fixed number of network layers.
2265 However, it ignores a key character in traditional iterative algorithms, where
2266 the number of iterations required for convergence changes with varying sparsity
2267 levels. By investigating on the projected gradient descent, we unveil the
2268 drawbacks of the existing DL methods with fixed depth. Then we propose an
2269 end-to-end trainable DL architecture, which involves an extra halting score at
2270 each layer. Therefore, the proposed method learns how many layers to execute to
2271 emit an output, and the network depth is dynamically adjusted for each task in
2272 the inference phase. We conduct experiments using both synthetic data and
2273 applications including random access in massive MTC and massive MIMO channel
2274 estimation, and the results demonstrate the improved efficiency for the
2275 proposed approach.
2276 </p>
2277 </description>
2278 <guid isPermaLink="false">oai:arXiv.org:2010.15376</guid>
2279 </item>
2280 <item>
2281 <title>Supervised sequential pattern mining of event sequences in sport to identify important patterns of play: an application to rugby union. (arXiv:2010.15377v1 [cs.LG])</title>
2282 <link>http://fr.arxiv.org/abs/2010.15377</link>
2283 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bunker_R/0/1/0/all/0/1">Rory Bunker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fujii_K/0/1/0/all/0/1">Keisuke Fujii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hanada_H/0/1/0/all/0/1">Hiroyuki Hanada</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Takeuchi_I/0/1/0/all/0/1">Ichiro Takeuchi</a></p>
2284
2285 <p>Given a set of sequences comprised of time-ordered events, sequential pattern
2286 mining is useful to identify frequent sub-sequences from different sequences or
2287 within the same sequence. However, in sport, these techniques cannot determine
2288 the importance of particular patterns of play to good or bad outcomes, which is
2289 often of greater interest to coaches. In this study, we apply a supervised
2290 sequential pattern mining algorithm called safe pattern pruning (SPP) to 490
2291 labelled event sequences representing passages of play from one rugby team's
2292 matches from the 2018 Japan Top League, and then evaluate the importance of the
2293 obtained sub-sequences to points-scoring outcomes. Linebreaks, successful
2294 lineouts, regained kicks in play, repeated phase-breakdown play, and failed
2295 opposition exit plays were identified as important patterns of play for the
2296 team scoring. When sequences were labelled with points scoring outcomes for the
2297 opposition teams, opposition team linebreaks, errors made by the team,
2298 opposition team lineouts, and repeated phase-breakdown play by the opposition
2299 team were identified as important patterns of play for the opposition team
2300 scoring. By virtue of its supervised nature and pruning properties, SPP
2301 obtained a greater variety of generally more sophisticated patterns than the
2302 well-known unsupervised PrefixSpan algorithm.
2303 </p>
2304 </description>
2305 <guid isPermaLink="false">oai:arXiv.org:2010.15377</guid>
2306 </item>
2307 <item>
2308 <title>Collaborative Method for Incremental Learning on Classification and Generation. (arXiv:2010.15378v1 [cs.CV])</title>
2309 <link>http://fr.arxiv.org/abs/2010.15378</link>
2310 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_B/0/1/0/all/0/1">Byungju Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_J/0/1/0/all/0/1">Jaeyoung Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_K/0/1/0/all/0/1">Kyungsu Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_S/0/1/0/all/0/1">Sungjin Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_J/0/1/0/all/0/1">Junmo Kim</a></p>
2311
2312 <p>Although well-trained deep neural networks have shown remarkable performance
2313 on numerous tasks, they rapidly forget what they have learned as soon as they
2314 begin to learn with additional data with the previous data stop being provided.
2315 In this paper, we introduce a novel algorithm, Incremental Class Learning with
2316 Attribute Sharing (ICLAS), for incremental class learning with deep neural
2317 networks. As one of its component, we also introduce a generative model,
2318 incGAN, which can generate images with increased variety compared with the
2319 training data. Under challenging environment of data deficiency, ICLAS
2320 incrementally trains classification and the generation networks. Since ICLAS
2321 trains both networks, our algorithm can perform multiple times of incremental
2322 class learning. The experiments on MNIST dataset demonstrate the advantages of
2323 our algorithm.
2324 </p>
2325 </description>
2326 <guid isPermaLink="false">oai:arXiv.org:2010.15378</guid>
2327 </item>
2328 <item>
2329 <title>The Performance Analysis of Generalized Margin Maximizer (GMM) on Separable Data. (arXiv:2010.15379v1 [stat.ML])</title>
2330 <link>http://fr.arxiv.org/abs/2010.15379</link>
2331 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Salehi_F/0/1/0/all/0/1">Fariborz Salehi</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Abbasi_E/0/1/0/all/0/1">Ehsan Abbasi</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Hassibi_B/0/1/0/all/0/1">Babak Hassibi</a></p>
2332
2333 <p>Logistic models are commonly used for binary classification tasks. The
2334 success of such models has often been attributed to their connection to
2335 maximum-likelihood estimators. It has been shown that gradient descent
2336 algorithm, when applied on the logistic loss, converges to the max-margin
2337 classifier (a.k.a. hard-margin SVM). The performance of the max-margin
2338 classifier has been recently analyzed. Inspired by these results, in this
2339 paper, we present and study a more general setting, where the underlying
2340 parameters of the logistic model possess certain structures (sparse,
2341 block-sparse, low-rank, etc.) and introduce a more general framework (which is
2342 referred to as "Generalized Margin Maximizer", GMM). While classical max-margin
2343 classifiers minimize the $2$-norm of the parameter vector subject to linearly
2344 separating the data, GMM minimizes any arbitrary convex function of the
2345 parameter vector. We provide a precise analysis of the performance of GMM via
2346 the solution of a system of nonlinear equations. We also provide a detailed
2347 study for three special cases: ($1$) $\ell_2$-GMM that is the max-margin
2348 classifier, ($2$) $\ell_1$-GMM which encourages sparsity, and ($3$)
2349 $\ell_{\infty}$-GMM which is often used when the parameter vector has binary
2350 entries. Our theoretical results are validated by extensive simulation results
2351 across a range of parameter values, problem instances, and model structures.
2352 </p>
2353 </description>
2354 <guid isPermaLink="false">oai:arXiv.org:2010.15379</guid>
2355 </item>
2356 <item>
2357 <title>Learning to Actively Learn: A Robust Approach. (arXiv:2010.15382v1 [cs.LG])</title>
2358 <link>http://fr.arxiv.org/abs/2010.15382</link>
2359 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jifan Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jamieson_K/0/1/0/all/0/1">Kevin Jamieson</a></p>
2360
2361 <p>This work proposes a procedure for designing algorithms for specific adaptive
2362 data collection tasks like active learning and pure-exploration multi-armed
2363 bandits. Unlike the design of traditional adaptive algorithms that rely on
2364 concentration of measure and careful analysis to justify the correctness and
2365 sample complexity of the procedure, our adaptive algorithm is learned via
2366 adversarial training over equivalence classes of problems derived from
2367 information theoretic lower bounds. In particular, a single adaptive learning
2368 algorithm is learned that competes with the best adaptive algorithm learned for
2369 each equivalence class. Our procedure takes as input just the available
2370 queries, set of hypotheses, loss function, and total query budget. This is in
2371 contrast to existing meta-learning work that learns an adaptive algorithm
2372 relative to an explicit, user-defined subset or prior distribution over
2373 problems which can be challenging to define and be mismatched to the instance
2374 encountered at test time. This work is particularly focused on the regime when
2375 the total query budget is very small, such as a few dozen, which is much
2376 smaller than those budgets typically considered by theoretically derived
2377 algorithms. We perform synthetic experiments to justify the stability and
2378 effectiveness of the training procedure, and then evaluate the method on tasks
2379 derived from real data including a noisy 20 Questions game and a joke
2380 recommendation task.
2381 </p>
2382 </description>
2383 <guid isPermaLink="false">oai:arXiv.org:2010.15382</guid>
2384 </item>
2385 <item>
2386 <title>Prediction-Based Power Oversubscription in Cloud Platforms. (arXiv:2010.15388v1 [cs.DC])</title>
2387 <link>http://fr.arxiv.org/abs/2010.15388</link>
2388 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kumbhare_A/0/1/0/all/0/1">Alok Kumbhare</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Azimi_R/0/1/0/all/0/1">Reza Azimi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Manousakis_I/0/1/0/all/0/1">Ioannis Manousakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bonde_A/0/1/0/all/0/1">Anand Bonde</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frujeri_F/0/1/0/all/0/1">Felipe Frujeri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mahalingam_N/0/1/0/all/0/1">Nithish Mahalingam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Misra_P/0/1/0/all/0/1">Pulkit Misra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Javadi_S/0/1/0/all/0/1">Seyyed Ahmad Javadi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schroeder_B/0/1/0/all/0/1">Bianca Schroeder</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fontoura_M/0/1/0/all/0/1">Marcus Fontoura</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bianchini_R/0/1/0/all/0/1">Ricardo Bianchini</a></p>
2389
2390 <p>Datacenter designers rely on conservative estimates of IT equipment power
2391 draw to provision resources. This leaves resources underutilized and requires
2392 more datacenters to be built. Prior work has used power capping to shave the
2393 rare power peaks and add more servers to the datacenter, thereby
2394 oversubscribing its resources and lowering capital costs. This works well when
2395 the workloads and their server placements are known. Unfortunately, these
2396 factors are unknown in public clouds, forcing providers to limit the
2397 oversubscription so that performance is never impacted.
2398 </p>
2399 <p>In this paper, we argue that providers can use predictions of workload
2400 performance criticality and virtual machine (VM) resource utilization to
2401 increase oversubscription. This poses many challenges, such as identifying the
2402 performance-critical workloads from black-box VMs, creating support for
2403 criticality-aware power management, and increasing oversubscription while
2404 limiting the impact of capping. We address these challenges for the hardware
2405 and software infrastructures of Microsoft Azure. The results show that we
2406 enable a 2x increase in oversubscription with minimum impact to critical
2407 workloads.
2408 </p>
2409 </description>
2410 <guid isPermaLink="false">oai:arXiv.org:2010.15388</guid>
2411 </item>
2412 <item>
2413 <title>Learning Audio Embeddings with User Listening Data for Content-based Music Recommendation. (arXiv:2010.15389v1 [cs.SD])</title>
2414 <link>http://fr.arxiv.org/abs/2010.15389</link>
2415 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_K/0/1/0/all/0/1">Ke Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liang_B/0/1/0/all/0/1">Beici Liang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_X/0/1/0/all/0/1">Xiaoshuan Ma</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gu_M/0/1/0/all/0/1">Minwei Gu</a></p>
2416
2417 <p>Personalized recommendation on new track releases has always been a
2418 challenging problem in the music industry. To combat this problem, we first
2419 explore user listening history and demographics to construct a user embedding
2420 representing the user's music preference. With the user embedding and audio
2421 data from user's liked and disliked tracks, an audio embedding can be obtained
2422 for each track using metric learning with Siamese networks. For a new track, we
2423 can decide the best group of users to recommend by computing the similarity
2424 between the track's audio embedding and different user embeddings,
2425 respectively. The proposed system yields state-of-the-art performance on
2426 content-based music recommendation tested with millions of users and tracks.
2427 Also, we extract audio embeddings as features for music genre classification
2428 tasks. The results show the generalization ability of our audio embeddings.
2429 </p>
2430 </description>
2431 <guid isPermaLink="false">oai:arXiv.org:2010.15389</guid>
2432 </item>
2433 <item>
2434 <title>Multitask Bandit Learning through Heterogeneous Feedback Aggregation. (arXiv:2010.15390v1 [cs.LG])</title>
2435 <link>http://fr.arxiv.org/abs/2010.15390</link>
2436 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zhi Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1">Chicheng Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_M/0/1/0/all/0/1">Manish Kumar Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Riek_L/0/1/0/all/0/1">Laurel D. Riek</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chaudhuri_K/0/1/0/all/0/1">Kamalika Chaudhuri</a></p>
2437
2438 <p>In many real-world applications, multiple agents seek to learn how to perform
2439 highly related yet slightly different tasks in an online bandit learning
2440 protocol. We formulate this problem as the $\epsilon$-multi-player multi-armed
2441 bandit problem, in which a set of players concurrently interact with a set of
2442 arms, and for each arm, the reward distributions for all players are similar
2443 but not necessarily identical. We develop an upper confidence bound-based
2444 algorithm, RobustAgg$(\epsilon)$, that adaptively aggregates rewards collected
2445 by different players. In the setting where an upper bound on the pairwise
2446 similarities of reward distributions between players is known, we achieve
2447 instance-dependent regret guarantees that depend on the amenability of
2448 information sharing across players. We complement these upper bounds with
2449 nearly matching lower bounds. In the setting where pairwise similarities are
2450 unknown, we provide a lower bound, as well as an algorithm that trades off
2451 minimax regret guarantees for adaptivity to unknown similarity structure.
2452 </p>
2453 </description>
2454 <guid isPermaLink="false">oai:arXiv.org:2010.15390</guid>
2455 </item>
2456 <item>
2457 <title>Robustifying Binary Classification to Adversarial Perturbation. (arXiv:2010.15391v1 [cs.LG])</title>
2458 <link>http://fr.arxiv.org/abs/2010.15391</link>
2459 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Salehi_F/0/1/0/all/0/1">Fariborz Salehi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hassibi_B/0/1/0/all/0/1">Babak Hassibi</a></p>
2460
2461 <p>Despite the enormous success of machine learning models in various
2462 applications, most of these models lack resilience to (even small)
2463 perturbations in their input data. Hence, new methods to robustify machine
2464 learning models seem very essential. To this end, in this paper we consider the
2465 problem of binary classification with adversarial perturbations. Investigating
2466 the solution to a min-max optimization (which considers the worst-case loss in
2467 the presence of adversarial perturbations) we introduce a generalization to the
2468 max-margin classifier which takes into account the power of the adversary in
2469 manipulating the data. We refer to this classifier as the "Robust Max-margin"
2470 (RM) classifier. Under some mild assumptions on the loss function, we
2471 theoretically show that the gradient descent iterates (with sufficiently small
2472 step size) converge to the RM classifier in its direction. Therefore, the RM
2473 classifier can be studied to compute various performance measures (e.g.
2474 generalization error) of binary classification with adversarial perturbations.
2475 </p>
2476 </description>
2477 <guid isPermaLink="false">oai:arXiv.org:2010.15391</guid>
2478 </item>
2479 <item>
2480 <title>Off-Policy Interval Estimation with Lipschitz Value Iteration. (arXiv:2010.15392v1 [cs.LG])</title>
2481 <link>http://fr.arxiv.org/abs/2010.15392</link>
2482 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_Z/0/1/0/all/0/1">Ziyang Tang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_Y/0/1/0/all/0/1">Yihao Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_N/0/1/0/all/0/1">Na Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_J/0/1/0/all/0/1">Jian Peng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Q/0/1/0/all/0/1">Qiang Liu</a></p>
2483
2484 <p>Off-policy evaluation provides an essential tool for evaluating the effects
2485 of different policies or treatments using only observed data. When applied to
2486 high-stakes scenarios such as medical diagnosis or financial decision-making,
2487 it is crucial to provide provably correct upper and lower bounds of the
2488 expected reward, not just a classical single point estimate, to the end-users,
2489 as executing a poor policy can be very costly. In this work, we propose a
2490 provably correct method for obtaining interval bounds for off-policy evaluation
2491 in a general continuous setting. The idea is to search for the maximum and
2492 minimum values of the expected reward among all the Lipschitz Q-functions that
2493 are consistent with the observations, which amounts to solving a constrained
2494 optimization problem on a Lipschitz function space. We go on to introduce a
2495 Lipschitz value iteration method to monotonically tighten the interval, which
2496 is simple yet efficient and provably convergent. We demonstrate the practical
2497 efficiency of our method on a range of benchmarks.
2498 </p>
2499 </description>
2500 <guid isPermaLink="false">oai:arXiv.org:2010.15392</guid>
2501 </item>
2502 <item>
2503 <title>Discovery and classification of Twitter bots. (arXiv:2010.15393v1 [cs.SI])</title>
2504 <link>http://fr.arxiv.org/abs/2010.15393</link>
2505 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shevtsov_A/0/1/0/all/0/1">Alexander Shevtsov Alexander Shevtsov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Oikonomidou_M/0/1/0/all/0/1">Maria Oikonomidou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Antonakaki_D/0/1/0/all/0/1">Despoina Antonakaki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pratikakis_P/0/1/0/all/0/1">Polyvios Pratikakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kanterakis_A/0/1/0/all/0/1">Alexandros Kanterakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ioannidis_S/0/1/0/all/0/1">Sotiris Ioannidis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fragopoulou_P/0/1/0/all/0/1">Paraskevi Fragopoulou</a></p>
2506
2507 <p>A very large number of people use Online Social Networks daily. Such
2508 platforms thus become attractive targets for agents that seek to gain access to
2509 the attention of large audiences, and influence perceptions or opinions.
2510 Botnets, collections of automated accounts controlled by a single agent, are a
2511 common mechanism for exerting maximum influence. Botnets may be used to better
2512 infiltrate the social graph over time and to create an illusion of community
2513 behavior, amplifying their message and increasing persuasion.
2514 </p>
2515 <p>This paper investigates Twitter botnets, their behavior, their interaction
2516 with user communities and their evolution over time. We analyzed a dense crawl
2517 of a subset of Twitter traffic, amounting to nearly all interactions by
2518 Greek-speaking Twitter users for a period of 36 months. We detected over a
2519 million events where seemingly unrelated accounts tweeted nearly identical
2520 content at nearly the same time. We filtered these concurrent content injection
2521 events and detected a set of 1,850 accounts that repeatedly exhibit this
2522 pattern of behavior, suggesting that they are fully or in part controlled and
2523 orchestrated by the same software. We found botnets that appear for brief
2524 intervals and disappear, as well as botnets that evolve and grow, spanning the
2525 duration of our dataset. We analyze statistical differences between bot
2526 accounts and human users, as well as botnet interaction with user communities
2527 and Twitter trending topics.
2528 </p>
2529 </description>
2530 <guid isPermaLink="false">oai:arXiv.org:2010.15393</guid>
2531 </item>
2532 <item>
2533 <title>Smart Homes: Security Challenges and Privacy Concerns. (arXiv:2010.15394v1 [cs.CR])</title>
2534 <link>http://fr.arxiv.org/abs/2010.15394</link>
2535 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hall_F/0/1/0/all/0/1">Fraser Hall</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maglaras_L/0/1/0/all/0/1">Leandros Maglaras</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Aivaliotis_T/0/1/0/all/0/1">Theodoros Aivaliotis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xagoraris_L/0/1/0/all/0/1">Loukas Xagoraris</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kantzavelou_I/0/1/0/all/0/1">Ioanna Kantzavelou</a></p>
2536
2537 <p>Development and growth of Internet of Things (IoT) technology has
2538 exponentially increased over the course of the last 10 years since its
2539 inception, and as a result has directly influenced the popularity and size of
2540 smart homes. In this article we present the main technologies and applications
2541 that constitute a smart home, we identify the main security and privacy
2542 challenges that smart home face and we provide good practices to mitigate those
2543 threats.
2544 </p>
2545 </description>
2546 <guid isPermaLink="false">oai:arXiv.org:2010.15394</guid>
2547 </item>
2548 <item>
2549 <title>Channel Estimation and Equalization for CP-OFDM-based OTFS in Fractional Doppler Channels. (arXiv:2010.15396v1 [cs.IT])</title>
2550 <link>http://fr.arxiv.org/abs/2010.15396</link>
2551 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hashimoto_N/0/1/0/all/0/1">Noriyuki Hashimoto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Osawa_N/0/1/0/all/0/1">Noboru Osawa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yamazaki_K/0/1/0/all/0/1">Kosuke Yamazaki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ibi_S/0/1/0/all/0/1">Shinsuke Ibi</a></p>
2552
2553 <p>Orthogonal time frequency and space (OTFS) modulation is a promising
2554 technology that satisfies high Doppler requirements for future mobile systems.
2555 OTFS modulation encodes information symbols and pilot symbols into the
2556 two-dimensional (2D) delay-Doppler (DD) domain. The received symbols suffer
2557 from inter-Doppler interference (IDI) in the fading channels with fractional
2558 Doppler shifts that are sampled at noninteger indices in the DD domain. IDI has
2559 been treated as an unavoidable effect because the fractional Doppler shifts
2560 cannot be obtained directly from the received pilot symbols. In this paper, we
2561 provide a solution to channel estimation for fractional Doppler channels. The
2562 proposed estimation provides new insight into the OTFS input-output relation in
2563 the DD domain as a 2D circular convolution with a small approximation.
2564 According to the input-output relation, we also provide a low-complexity
2565 channel equalization method using the estimated channel information. We
2566 demonstrate the error performance of the proposed channel estimation and
2567 equalization in several channels by simulations. The simulation results show
2568 that in high-mobility environments, the total system utilizing the proposed
2569 methods outperforms orthogonal frequency division multiplexing (OFDM) with
2570 ideal channel estimation and a conventional channel estimation method using a
2571 pseudo sequence.
2572 </p>
2573 </description>
2574 <guid isPermaLink="false">oai:arXiv.org:2010.15396</guid>
2575 </item>
2576 <item>
2577 <title>Free-boundary conformal parameterization of point clouds. (arXiv:2010.15399v1 [cs.CG])</title>
2578 <link>http://fr.arxiv.org/abs/2010.15399</link>
2579 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1">Yechen Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Choi_G/0/1/0/all/0/1">Gary P. T. Choi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lui_L/0/1/0/all/0/1">Lok Ming Lui</a></p>
2580
2581 <p>With the advancement in 3D scanning technology, there has been a surge of
2582 interest in the use of point clouds in science and engineering. To facilitate
2583 the computations and analyses of point clouds, prior works have considered
2584 parameterizing them onto some simple planar domains with a fixed boundary shape
2585 such as a unit circle or a rectangle. However, the geometry of the fixed shape
2586 may lead to some undesirable distortion in the parameterization. It is
2587 therefore more natural to consider free-boundary conformal parameterizations of
2588 point clouds, which minimize the local geometric distortion of the mapping
2589 without constraining the overall shape. In this work, we propose a novel
2590 approximation scheme of the Laplace--Beltrami operator on point clouds and
2591 utilize it for developing a free-boundary conformal parameterization method for
2592 disk-type point clouds. With the aid of the free-boundary conformal
2593 parameterization, high-quality point cloud meshing can be easily achieved.
2594 Furthermore, we show that using the idea of conformal welding in complex
2595 analysis, the point cloud conformal parameterization can be computed in a
2596 divide-and-conquer manner. Experimental results are presented to demonstrate
2597 the effectiveness of the proposed method.
2598 </p>
2599 </description>
2600 <guid isPermaLink="false">oai:arXiv.org:2010.15399</guid>
2601 </item>
2602 <item>
2603 <title>On Efficient and Scalable Time-Continuous Spatial Crowdsourcing -- Full Version. (arXiv:2010.15404v1 [cs.DB])</title>
2604 <link>http://fr.arxiv.org/abs/2010.15404</link>
2605 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_T/0/1/0/all/0/1">Ting Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_X/0/1/0/all/0/1">Xike Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_X/0/1/0/all/0/1">Xin Cao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pedersen_T/0/1/0/all/0/1">Torben Bach Pedersen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xiao_M/0/1/0/all/0/1">Mingjun Xiao</a></p>
2606
2607 <p>The proliferation of advanced mobile terminals opened up a new crowdsourcing
2608 avenue, spatial crowdsourcing, to utilize the crowd potential to perform
2609 real-world tasks. In this work, we study a new type of spatial crowdsourcing,
2610 called time-continuous spatial crowdsourcing (TCSC in short). It supports broad
2611 applications for long-term continuous spatial data acquisition, ranging from
2612 environmental monitoring to traffic surveillance in citizen science and
2613 crowdsourcing projects. However, due to limited budgets and limited
2614 availability of workers in practice, the data collected is often incomplete,
2615 incurring data deficiency problem. To tackle that, in this work, we first
2616 propose an entropy-based quality metric, which captures the joint effects of
2617 incompletion in data acquisition and the imprecision in data interpolation.
2618 Based on that, we investigate quality-aware task assignment methods for both
2619 single- and multi-task scenarios. We show the NP-hardness of the single-task
2620 case, and design polynomial-time algorithms with guaranteed approximation
2621 ratios. We study novel indexing and pruning techniques for further enhancing
2622 the performance in practice. Then, we extend the solution to multi-task
2623 scenarios and devise a parallel framework for speeding up the process of
2624 optimization. We conduct extensive experiments on both real and synthetic
2625 datasets to show the effectiveness of our proposals.
2626 </p>
2627 </description>
2628 <guid isPermaLink="false">oai:arXiv.org:2010.15404</guid>
2629 </item>
2630 <item>
2631 <title>Conversation Graph: Data Augmentation, Training and Evaluation for Non-Deterministic Dialogue Management. (arXiv:2010.15411v1 [cs.CL])</title>
2632 <link>http://fr.arxiv.org/abs/2010.15411</link>
2633 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gritta_M/0/1/0/all/0/1">Milan Gritta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lampouras_G/0/1/0/all/0/1">Gerasimos Lampouras</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Iacobacci_I/0/1/0/all/0/1">Ignacio Iacobacci</a></p>
2634
2635 <p>Task-oriented dialogue systems typically rely on large amounts of
2636 high-quality training data or require complex handcrafted rules. However,
2637 existing datasets are often limited in size considering the complexity of the
2638 dialogues. Additionally, conventional training signal inference is not suitable
2639 for non-deterministic agent behaviour, i.e. considering multiple actions as
2640 valid in identical dialogue states. We propose the Conversation Graph
2641 (ConvGraph), a graph-based representation of dialogues that can be exploited
2642 for data augmentation, multi-reference training and evaluation of
2643 non-deterministic agents. ConvGraph generates novel dialogue paths to augment
2644 data volume and diversity. Intrinsic and extrinsic evaluation across three
2645 datasets shows that data augmentation and/or multi-reference training with
2646 ConvGraph can improve dialogue success rates by up to 6.4%.
2647 </p>
2648 </description>
2649 <guid isPermaLink="false">oai:arXiv.org:2010.15411</guid>
2650 </item>
2651 <item>
2652 <title>Measuring and Harnessing Transference in Multi-Task Learning. (arXiv:2010.15413v1 [cs.LG])</title>
2653 <link>http://fr.arxiv.org/abs/2010.15413</link>
2654 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fifty_C/0/1/0/all/0/1">Christopher Fifty</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Amid_E/0/1/0/all/0/1">Ehsan Amid</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_Z/0/1/0/all/0/1">Zhe Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_T/0/1/0/all/0/1">Tianhe Yu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Anil_R/0/1/0/all/0/1">Rohan Anil</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Finn_C/0/1/0/all/0/1">Chelsea Finn</a></p>
2655
2656 <p>Multi-task learning can leverage information learned by one task to benefit
2657 the training of other tasks. Despite this capacity, na\"ive formulations often
2658 degrade performance and in particular, identifying the tasks that would benefit
2659 from co-training remains a challenging design question. In this paper, we
2660 analyze the dynamics of information transfer, or transference, across tasks
2661 throughout training. Specifically, we develop a similarity measure that can
2662 quantify transference among tasks and use this quantity to both better
2663 understand the optimization dynamics of multi-task learning as well as improve
2664 overall learning performance. In the latter case, we propose two methods to
2665 leverage our transference metric. The first operates at a macro-level by
2666 selecting which tasks should train together while the second functions at a
2667 micro-level by determining how to combine task gradients at each training step.
2668 We find these methods can lead to significant improvement over prior work on
2669 three supervised multi-task learning benchmarks and one multi-task
2670 reinforcement learning paradigm.
2671 </p>
2672 </description>
2673 <guid isPermaLink="false">oai:arXiv.org:2010.15413</guid>
2674 </item>
2675 <item>
2676 <title>A Novel Anomaly Detection Algorithm for Hybrid Production Systems based on Deep Learning and Timed Automata. (arXiv:2010.15415v1 [cs.LG])</title>
2677 <link>http://fr.arxiv.org/abs/2010.15415</link>
2678 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hranisavljevic_N/0/1/0/all/0/1">Nemanja Hranisavljevic</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Niggemann_O/0/1/0/all/0/1">Oliver Niggemann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maier_A/0/1/0/all/0/1">Alexander Maier</a></p>
2679
2680 <p>Performing anomaly detection in hybrid systems is a challenging task since it
2681 requires analysis of timing behavior and mutual dependencies of both discrete
2682 and continuous signals. Typically, it requires modeling system behavior, which
2683 is often accomplished manually by human engineers. Using machine learning for
2684 creating a behavioral model from observations has advantages, such as lower
2685 development costs and fewer requirements for specific knowledge about the
2686 system. The paper presents DAD:DeepAnomalyDetection, a new approach for
2687 automatic model learning and anomaly detection in hybrid production systems. It
2688 combines deep learning and timed automata for creating behavioral model from
2689 observations. The ability of deep belief nets to extract binary features from
2690 real-valued inputs is used for transformation of continuous to discrete
2691 signals. These signals, together with the original discrete signals are than
2692 handled in an identical way. Anomaly detection is performed by the comparison
2693 of actual and predicted system behavior. The algorithm has been applied to few
2694 data sets including two from real systems and has shown promising results.
2695 </p>
2696 </description>
2697 <guid isPermaLink="false">oai:arXiv.org:2010.15415</guid>
2698 </item>
2699 <item>
2700 <title>ProCAN: Progressive Growing Channel Attentive Non-Local Network for Lung Nodule Classification. (arXiv:2010.15417v1 [eess.IV])</title>
2701 <link>http://fr.arxiv.org/abs/2010.15417</link>
2702 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Al_Shabi_M/0/1/0/all/0/1">Mundher Al-Shabi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shak_K/0/1/0/all/0/1">Kelvin Shak</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Tan_M/0/1/0/all/0/1">Maxine Tan</a></p>
2703
2704 <p>Lung cancer classification in screening computed tomography (CT) scans is one
2705 of the most crucial tasks for early detection of this disease. Many lives can
2706 be saved if we are able to accurately classify malignant/ cancerous lung
2707 nodules. Consequently, several deep learning based models have been proposed
2708 recently to classify lung nodules as malignant or benign. Nevertheless, the
2709 large variation in the size and heterogeneous appearance of the nodules makes
2710 this task an extremely challenging one. We propose a new Progressive Growing
2711 Channel Attentive Non-Local (ProCAN) network for lung nodule classification.
2712 The proposed method addresses this challenge from three different aspects.
2713 First, we enrich the Non-Local network by adding channel-wise attention
2714 capability to it. Second, we apply Curriculum Learning principles, whereby we
2715 first train our model on easy examples before hard/ difficult ones. Third, as
2716 the classification task gets harder during the Curriculum learning, our model
2717 is progressively grown to increase its capability of handling the task at hand.
2718 We examined our proposed method on two different public datasets and compared
2719 its performance with state-of-the-art methods in the literature. The results
2720 show that the ProCAN model outperforms state-of-the-art methods and achieves an
2721 AUC of 98.05% and accuracy of 95.28% on the LIDC-IDRI dataset. Moreover, we
2722 conducted extensive ablation studies to analyze the contribution and effects of
2723 each new component of our proposed method.
2724 </p>
2725 </description>
2726 <guid isPermaLink="false">oai:arXiv.org:2010.15417</guid>
2727 </item>
2728 <item>
2729 <title>Scalable Graph Neural Networks via Bidirectional Propagation. (arXiv:2010.15421v1 [cs.LG])</title>
2730 <link>http://fr.arxiv.org/abs/2010.15421</link>
2731 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_M/0/1/0/all/0/1">Ming Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_Z/0/1/0/all/0/1">Zhewei Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_B/0/1/0/all/0/1">Bolin Ding</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yaliang Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yuan_Y/0/1/0/all/0/1">Ye Yuan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Du_X/0/1/0/all/0/1">Xiaoyong Du</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_J/0/1/0/all/0/1">Ji-Rong Wen</a></p>
2732
2733 <p>Graph Neural Networks (GNN) is an emerging field for learning on
2734 non-Euclidean data. Recently, there has been increased interest in designing
2735 GNN that scales to large graphs. Most existing methods use "graph sampling" or
2736 "layer-wise sampling" techniques to reduce training time. However, these
2737 methods still suffer from degrading performance and scalability problems when
2738 applying to graphs with billions of edges. This paper presents GBP, a scalable
2739 GNN that utilizes a localized bidirectional propagation process from both the
2740 feature vectors and the training/testing nodes. Theoretical analysis shows that
2741 GBP is the first method that achieves sub-linear time complexity for both the
2742 precomputation and the training phases. An extensive empirical study
2743 demonstrates that GBP achieves state-of-the-art performance with significantly
2744 less training/testing time. Most notably, GBP can deliver superior performance
2745 on a graph with over 60 million nodes and 1.8 billion edges in less than half
2746 an hour on a single machine.
2747 </p>
2748 </description>
2749 <guid isPermaLink="false">oai:arXiv.org:2010.15421</guid>
2750 </item>
2751 <item>
2752 <title>Tilde at WMT 2020: News Task Systems. (arXiv:2010.15423v1 [cs.CL])</title>
2753 <link>http://fr.arxiv.org/abs/2010.15423</link>
2754 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Krislauks_R/0/1/0/all/0/1">Rihards Kri&#x161;lauks</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pinnis_M/0/1/0/all/0/1">M&#x101;rcis Pinnis</a></p>
2755
2756 <p>This paper describes Tilde's submission to the WMT2020 shared task on news
2757 translation for both directions of the English-Polish language pair in both the
2758 constrained and the unconstrained tracks. We follow our submissions from the
2759 previous years and build our baseline systems to be morphologically motivated
2760 sub-word unit-based Transformer base models that we train using the Marian
2761 machine translation toolkit. Additionally, we experiment with different
2762 parallel and monolingual data selection schemes, as well as sampled
2763 back-translation. Our final models are ensembles of Transformer base and
2764 Transformer big models that feature right-to-left re-ranking.
2765 </p>
2766 </description>
2767 <guid isPermaLink="false">oai:arXiv.org:2010.15423</guid>
2768 </item>
2769 <item>
2770 <title>Detection of asteroid trails in Hubble Space Telescope images using Deep Learning. (arXiv:2010.15425v1 [astro-ph.IM])</title>
2771 <link>http://fr.arxiv.org/abs/2010.15425</link>
2772 <description><p>Authors: <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Parfeni_A/0/1/0/all/0/1">Andrei A. Parfeni</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Caramete_L/0/1/0/all/0/1">Laurentiu I. Caramete</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Dobre_A/0/1/0/all/0/1">Andreea M. Dobre</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Bach_N/0/1/0/all/0/1">Nguyen Tran Bach</a></p>
2773
2774 <p>We present an application of Deep Learning for the image recognition of
2775 asteroid trails in single-exposure photos taken by the Hubble Space Telescope.
2776 Using algorithms based on multi-layered deep Convolutional Neural Networks, we
2777 report accuracies of above 80% on the validation set. Our project was motivated
2778 by the Hubble Asteroid Hunter project on Zooniverse, which focused on
2779 identifying these objects in order to localize and better characterize them. We
2780 aim to demonstrate that Machine Learning techniques can be very useful in
2781 trying to solve problems that are closely related to Astronomy and
2782 Astrophysics, but that they are still not developed enough for very specific
2783 tasks.
2784 </p>
2785 </description>
2786 <guid isPermaLink="false">oai:arXiv.org:2010.15425</guid>
2787 </item>
2788 <item>
2789 <title>Physics-informed deep learning for flow and deformation in poroelastic media. (arXiv:2010.15426v1 [cs.CE])</title>
2790 <link>http://fr.arxiv.org/abs/2010.15426</link>
2791 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bekele_Y/0/1/0/all/0/1">Yared W. Bekele</a></p>
2792
2793 <p>A physics-informed neural network is presented for poroelastic problems with
2794 coupled flow and deformation processes. The governing equilibrium and mass
2795 balance equations are discussed and specific derivations for two-dimensional
2796 cases are presented. A fully-connected deep neural network is used for
2797 training. Barry and Mercer's source problem with time-dependent fluid
2798 injection/extraction in an idealized poroelastic medium, which has an exact
2799 analytical solution, is used as a numerical example. A random sample from the
2800 analytical solution is used as training data and the performance of the model
2801 is tested by predicting the solution on the entire domain after training. The
2802 deep learning model predicts the horizontal and vertical deformations well
2803 while the error in the predicted pore pressure predictions is slightly higher
2804 because of the sparsity of the pore pressure values.
2805 </p>
2806 </description>
2807 <guid isPermaLink="false">oai:arXiv.org:2010.15426</guid>
2808 </item>
2809 <item>
2810 <title>Sparse Signal Reconstruction for Nonlinear Models via Piecewise Rational Optimization. (arXiv:2010.15427v1 [math.OC])</title>
2811 <link>http://fr.arxiv.org/abs/2010.15427</link>
2812 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Marmin_A/0/1/0/all/0/1">Arthur Marmin</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Castella_M/0/1/0/all/0/1">Marc Castella</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Pesquet_J/0/1/0/all/0/1">Jean-Christophe Pesquet</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Duval_L/0/1/0/all/0/1">Laurent Duval</a></p>
2813
2814 <p>We propose a method to reconstruct sparse signals degraded by a nonlinear
2815 distortion and acquired at a limited sampling rate. Our method formulates the
2816 reconstruction problem as a nonconvex minimization of the sum of a data fitting
2817 term and a penalization term. In contrast with most previous works which settle
2818 for approximated local solutions, we seek for a global solution to the obtained
2819 challenging nonconvex problem. Our global approach relies on the so-called
2820 Lasserre relaxation of polynomial optimization. We here specifically include in
2821 our approach the case of piecewise rational functions, which makes it possible
2822 to address a wide class of nonconvex exact and continuous relaxations of the
2823 $\ell_0$ penalization function. Additionally, we study the complexity of the
2824 optimization problem. It is shown how to use the structure of the problem to
2825 lighten the computational burden efficiently. Finally, numerical simulations
2826 illustrate the benefits of our method in terms of both global optimality and
2827 signal reconstruction.
2828 </p>
2829 </description>
2830 <guid isPermaLink="false">oai:arXiv.org:2010.15427</guid>
2831 </item>
2832 <item>
2833 <title>Self-paced Data Augmentation for Training Neural Networks. (arXiv:2010.15434v1 [cs.LG])</title>
2834 <link>http://fr.arxiv.org/abs/2010.15434</link>
2835 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Takase_T/0/1/0/all/0/1">Tomoumi Takase</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Karakida_R/0/1/0/all/0/1">Ryo Karakida</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Asoh_H/0/1/0/all/0/1">Hideki Asoh</a></p>
2836
2837 <p>Data augmentation is widely used for machine learning; however, an effective
2838 method to apply data augmentation has not been established even though it
2839 includes several factors that should be tuned carefully. One such factor is
2840 sample suitability, which involves selecting samples that are suitable for data
2841 augmentation. A typical method that applies data augmentation to all training
2842 samples disregards sample suitability, which may reduce classifier performance.
2843 To address this problem, we propose the self-paced augmentation (SPA) to
2844 automatically and dynamically select suitable samples for data augmentation
2845 when training a neural network. The proposed method mitigates the deterioration
2846 of generalization performance caused by ineffective data augmentation. We
2847 discuss two reasons the proposed SPA works relative to curriculum learning and
2848 desirable changes to loss function instability. Experimental results
2849 demonstrate that the proposed SPA can improve the generalization performance,
2850 particularly when the number of training samples is small. In addition, the
2851 proposed SPA outperforms the state-of-the-art RandAugment method.
2852 </p>
2853 </description>
2854 <guid isPermaLink="false">oai:arXiv.org:2010.15434</guid>
2855 </item>
2856 <item>
2857 <title>Group-Harmonic and Group-Closeness Maximization -- Approximation and Engineering. (arXiv:2010.15435v1 [cs.DS])</title>
2858 <link>http://fr.arxiv.org/abs/2010.15435</link>
2859 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Angriman_E/0/1/0/all/0/1">Eugenio Angriman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Becker_R/0/1/0/all/0/1">Ruben Becker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+DAngelo_G/0/1/0/all/0/1">Gianlorenzo D&#x27;Angelo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gilbert_H/0/1/0/all/0/1">Hugo Gilbert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Grinten_A/0/1/0/all/0/1">Alexander van der Grinten</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Meyerhenke_H/0/1/0/all/0/1">Henning Meyerhenke</a></p>
2860
2861 <p>Centrality measures characterize important nodes in networks. Efficiently
2862 computing such nodes has received a lot of attention. When considering the
2863 generalization of computing central groups of nodes, challenging optimization
2864 problems occur. In this work, we study two such problems, group-harmonic
2865 maximization and group-closeness maximization both from a theoretical and from
2866 an algorithm engineering perspective.
2867 </p>
2868 <p>On the theoretical side, we obtain the following results. For group-harmonic
2869 maximization, unless $P=NP$, there is no polynomial-time algorithm that
2870 achieves an approximation factor better than $1-1/e$ (directed) and $1-1/(4e)$
2871 (undirected), even for unweighted graphs. On the positive side, we show that a
2872 greedy algorithm achieves an approximation factor of $\lambda(1-2/e)$
2873 (directed) and $\lambda(1-1/e)/2$ (undirected), where $\lambda$ is the ratio of
2874 minimal and maximal edge weights. For group-closeness maximization, the
2875 undirected case is $NP$-hard to be approximated to within a factor better than
2876 $1-1/(e+1)$ and a constant approximation factor is achieved by a local-search
2877 algorithm. For the directed case, however, we show that, for any
2878 $\epsilon&lt;1/2$, the problem is $NP$-hard to be approximated within a factor of
2879 $4|V|^{-\epsilon}$.
2880 </p>
2881 <p>From the algorithm engineering perspective, we provide efficient
2882 implementations of the above greedy and local search algorithms. In our
2883 experimental study we show that, on small instances where an optimum solution
2884 can be computed in reasonable time, the quality of both the greedy and the
2885 local search algorithms come very close to the optimum. On larger instances,
2886 our local search algorithms yield results with superior quality compared to
2887 existing greedy and local search solutions, at the cost of additional running
2888 time. We thus advocate local search for scenarios where solution quality is of
2889 highest concern.
2890 </p>
2891 </description>
2892 <guid isPermaLink="false">oai:arXiv.org:2010.15435</guid>
2893 </item>
2894 <item>
2895 <title>Affordance-Aware Handovers with Human Arm Mobility Constraints. (arXiv:2010.15436v1 [cs.RO])</title>
2896 <link>http://fr.arxiv.org/abs/2010.15436</link>
2897 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ardon_P/0/1/0/all/0/1">Paola Ard&#xf3;n</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cabrera_M/0/1/0/all/0/1">Maria E. Cabrera</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pairet_E/0/1/0/all/0/1">&#xc8;ric Pairet</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Petrick_R/0/1/0/all/0/1">Ronald P. A. Petrick</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramamoorthy_S/0/1/0/all/0/1">Subramanian Ramamoorthy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lohan_K/0/1/0/all/0/1">Katrin S. Lohan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cakmak_M/0/1/0/all/0/1">Maya Cakmak</a></p>
2898
2899 <p>Reasoning about object handover configurations allows an assistive agent to
2900 estimate the appropriateness of handover for a receiver with different arm
2901 mobility capacities. While there are existing approaches to estimating the
2902 effectiveness of handovers, their findings are limited to users without arm
2903 mobility impairments and to specific objects. Therefore, current
2904 state-of-the-art approaches are unable to hand over novel objects to receivers
2905 with different arm mobility capacities. We propose a method that generalises
2906 handover behaviours to previously unseen objects, subject to the constraint of
2907 a user's arm mobility levels and the task context. We propose a
2908 heuristic-guided hierarchically optimised cost whose optimisation adapts object
2909 configurations for receivers with low arm mobility. This also ensures that the
2910 robot grasps consider the context of the user's upcoming task, i.e., the usage
2911 of the object. To understand preferences over handover configurations, we
2912 report on the findings of an online study, wherein we presented different
2913 handover methods, including ours, to $259$ users with different levels of arm
2914 mobility. We encapsulate these preferences in a SRL that is able to reason
2915 about the most suitable handover configuration given a receiver's arm mobility
2916 and upcoming task. We find that people's preferences over handover methods are
2917 correlated to their arm mobility capacities. In experiments with a PR2 robotic
2918 platform, we obtained an average handover accuracy of $90.8\%$ when
2919 generalising handovers to novel objects.
2920 </p>
2921 </description>
2922 <guid isPermaLink="false">oai:arXiv.org:2010.15436</guid>
2923 </item>
2924 <item>
2925 <title>Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model. (arXiv:2010.15437v1 [cs.CL])</title>
2926 <link>http://fr.arxiv.org/abs/2010.15437</link>
2927 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ihori_M/0/1/0/all/0/1">Mana Ihori</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Masumura_R/0/1/0/all/0/1">Ryo Masumura</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Makishima_N/0/1/0/all/0/1">Naoki Makishima</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tanaka_T/0/1/0/all/0/1">Tomohiro Tanaka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Takashima_A/0/1/0/all/0/1">Akihiko Takashima</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Orihashi_S/0/1/0/all/0/1">Shota Orihashi</a></p>
2928
2929 <p>This paper presents a novel fusion method for integrating an external
2930 language model (LM) into the Transformer based sequence-to-sequence (seq2seq)
2931 model. While paired data are basically required to train the seq2seq model, the
2932 external LM can be trained with only unpaired data. Thus, it is important to
2933 leverage memorized knowledge in the external LM for building the seq2seq model,
2934 since it is hard to prepare a large amount of paired data. However, the
2935 existing fusion methods assume that the LM is integrated with recurrent neural
2936 network-based seq2seq models instead of the Transformer. Therefore, this paper
2937 proposes a fusion method that can explicitly utilize network structures in the
2938 Transformer. The proposed method, called {\bf memory attentive fusion},
2939 leverages the Transformer-style attention mechanism that repeats source-target
2940 attention in a multi-hop manner for reading the memorized knowledge in the LM.
2941 Our experiments on two text-style conversion tasks demonstrate that the
2942 proposed method performs better than conventional fusion methods.
2943 </p>
2944 </description>
2945 <guid isPermaLink="false">oai:arXiv.org:2010.15437</guid>
2946 </item>
2947 <item>
2948 <title>Modeling and Control of COVID-19 Epidemic through Testing Policies. (arXiv:2010.15438v1 [math.OC])</title>
2949 <link>http://fr.arxiv.org/abs/2010.15438</link>
2950 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Niazi_M/0/1/0/all/0/1">Muhammad Umar B. Niazi</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Kibangou_A/0/1/0/all/0/1">Alain Kibangou</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Canudas_de_Wit_C/0/1/0/all/0/1">Carlos Canudas-de-Wit</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Nikitin_D/0/1/0/all/0/1">Denis Nikitin</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Tumash_L/0/1/0/all/0/1">Liudmila Tumash</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Bliman_P/0/1/0/all/0/1">Pierre-Alexandre Bliman</a></p>
2951
2952 <p>Testing for the infected cases is one of the most important mechanisms to
2953 control an epidemic. It enables to isolate the detected infected individuals,
2954 thereby limiting the disease transmission to the susceptible population.
2955 However, despite the significance of testing policies, the recent literature on
2956 the subject lacks a control-theoretic perspective. In this work, an epidemic
2957 model that incorporates the testing rate as a control input is presented. The
2958 proposed model differentiates the undetected infected from the detected
2959 infected cases, who are assumed to be removed from the disease spreading
2960 process in the population. First, the model is estimated and validated for
2961 COVID-19 data in France. Then, two testing policies are proposed, the so-called
2962 best-effort strategy for testing (BEST) and constant optimal strategy for
2963 testing (COST). The BEST policy is a suppression strategy that provides a lower
2964 bound on the testing rate such that the epidemic switches from a spreading to a
2965 non-spreading state. The COST policy is a mitigation strategy that provides an
2966 optimal value of testing rate that minimizes the peak value of the infected
2967 population when the total stockpile of tests is limited. Both testing policies
2968 are evaluated by predicting the number of active intensive care unit (ICU)
2969 cases and the cumulative number of deaths due to COVID-19.
2970 </p>
2971 </description>
2972 <guid isPermaLink="false">oai:arXiv.org:2010.15438</guid>
2973 </item>
2974 <item>
2975 <title>FlatNet: Towards Photorealistic Scene Reconstruction from Lensless Measurements. (arXiv:2010.15440v1 [eess.IV])</title>
2976 <link>http://fr.arxiv.org/abs/2010.15440</link>
2977 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Khan_S/0/1/0/all/0/1">Salman S. Khan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sundar_V/0/1/0/all/0/1">Varun Sundar</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Boominathan_V/0/1/0/all/0/1">Vivek Boominathan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Veeraraghavan_A/0/1/0/all/0/1">Ashok Veeraraghavan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Mitra_K/0/1/0/all/0/1">Kaushik Mitra</a></p>
2978
2979 <p>Lensless imaging has emerged as a potential solution towards realizing
2980 ultra-miniature cameras by eschewing the bulky lens in a traditional camera.
2981 Without a focusing lens, the lensless cameras rely on computational algorithms
2982 to recover the scenes from multiplexed measurements. However, the current
2983 iterative-optimization-based reconstruction algorithms produce noisier and
2984 perceptually poorer images. In this work, we propose a non-iterative deep
2985 learning based reconstruction approach that results in orders of magnitude
2986 improvement in image quality for lensless reconstructions. Our approach, called
2987 $\textit{FlatNet}$, lays down a framework for reconstructing high-quality
2988 photorealistic images from mask-based lensless cameras, where the camera's
2989 forward model formulation is known. FlatNet consists of two stages: (1) an
2990 inversion stage that maps the measurement into a space of intermediate
2991 reconstruction by learning parameters within the forward model formulation, and
2992 (2) a perceptual enhancement stage that improves the perceptual quality of this
2993 intermediate reconstruction. These stages are trained together in an end-to-end
2994 manner. We show high-quality reconstructions by performing extensive
2995 experiments on real and challenging scenes using two different types of
2996 lensless prototypes: one which uses a separable forward model and another,
2997 which uses a more general non-separable cropped-convolution model. Our
2998 end-to-end approach is fast, produces photorealistic reconstructions, and is
2999 easy to adopt for other mask-based lensless cameras.
3000 </p>
3001 </description>
3002 <guid isPermaLink="false">oai:arXiv.org:2010.15440</guid>
3003 </item>
3004 <item>
3005 <title>Self-awareness in intelligent vehicles: Feature based dynamic Bayesian models for abnormality detection. (arXiv:2010.15441v1 [cs.LG])</title>
3006 <link>http://fr.arxiv.org/abs/2010.15441</link>
3007 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kanapram_D/0/1/0/all/0/1">Divya Thekke Kanapram</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marin_Plaza_P/0/1/0/all/0/1">Pablo Marin-Plaza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marcenaro_L/0/1/0/all/0/1">Lucio Marcenaro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Martin_D/0/1/0/all/0/1">David Martin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Escalera_A/0/1/0/all/0/1">Arturo de la Escalera</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Regazzoni_C/0/1/0/all/0/1">Carlo Regazzoni</a></p>
3008
3009 <p>The evolution of Intelligent Transportation Systems in recent times
3010 necessitates the development of self-awareness in agents. Before the intensive
3011 use of Machine Learning, the detection of abnormalities was manually programmed
3012 by checking every variable and creating huge nested conditions that are very
3013 difficult to track. This paper aims to introduce a novel method to develop
3014 self-awareness in autonomous vehicles that mainly focuses on detecting abnormal
3015 situations around the considered agents. Multi-sensory time-series data from
3016 the vehicles are used to develop the data-driven Dynamic Bayesian Network (DBN)
3017 models used for future state prediction and the detection of dynamic
3018 abnormalities. Moreover, an initial level collective awareness model that can
3019 perform joint anomaly detection in co-operative tasks is proposed. The GNG
3020 algorithm learns the DBN models' discrete node variables; probabilistic
3021 transition links connect the node variables. A Markov Jump Particle Filter
3022 (MJPF) is applied to predict future states and detect when the vehicle is
3023 potentially misbehaving using learned DBNs as filter parameters. In this paper,
3024 datasets from real experiments of autonomous vehicles performing various tasks
3025 used to learn and test a set of switching DBN models.
3026 </p>
3027 </description>
3028 <guid isPermaLink="false">oai:arXiv.org:2010.15441</guid>
3029 </item>
3030 <item>
3031 <title>Advanced Python Performance Monitoring with Score-P. (arXiv:2010.15444v1 [cs.DC])</title>
3032 <link>http://fr.arxiv.org/abs/2010.15444</link>
3033 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gocht_A/0/1/0/all/0/1">Andreas Gocht</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schone_R/0/1/0/all/0/1">Robert Sch&#xf6;ne</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frenzel_J/0/1/0/all/0/1">Jan Frenzel</a></p>
3034
3035 <p>Within the last years, Python became more prominent in the scientific
3036 community and is now used for simulations, machine learning, and data analysis.
3037 All these tasks profit from additional compute power offered by parallelism and
3038 offloading. In the domain of High Performance Computing (HPC), we can look back
3039 to decades of experience exploiting different levels of parallelism on the
3040 core, node or inter-node level, as well as utilising accelerators. By using
3041 performance analysis tools to investigate all these levels of parallelism, we
3042 can tune applications for unprecedented performance. Unfortunately, standard
3043 Python performance analysis tools cannot cope with highly parallel programs.
3044 Since the development of such software is complex and error-prone, we
3045 demonstrate an easy-to-use solution based on an existing tool infrastructure
3046 for performance analysis. In this paper, we describe how to apply the
3047 established instrumentation framework \scorep to trace Python applications. We
3048 finish with a study of the overhead that users can expect for instrumenting
3049 their applications.
3050 </p>
3051 </description>
3052 <guid isPermaLink="false">oai:arXiv.org:2010.15444</guid>
3053 </item>
3054 <item>
3055 <title>Progressive Voice Trigger Detection: Accuracy vs Latency. (arXiv:2010.15446v1 [eess.AS])</title>
3056 <link>http://fr.arxiv.org/abs/2010.15446</link>
3057 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Sigtia_S/0/1/0/all/0/1">Siddharth Sigtia</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bridle_J/0/1/0/all/0/1">John Bridle</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Richards_H/0/1/0/all/0/1">Hywel Richards</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Clark_P/0/1/0/all/0/1">Pascal Clark</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Marchi_E/0/1/0/all/0/1">Erik Marchi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Garg_V/0/1/0/all/0/1">Vineet Garg</a></p>
3058
3059 <p>We present an architecture for voice trigger detection for virtual
3060 assistants. The main idea in this work is to exploit information in words that
3061 immediately follow the trigger phrase. We first demonstrate that by including
3062 more audio context after a detected trigger phrase, we can indeed get a more
3063 accurate decision. However, waiting to listen to more audio each time incurs a
3064 latency increase. Progressive Voice Trigger Detection allows us to trade-off
3065 latency and accuracy by accepting clear trigger candidates quickly, but waiting
3066 for more context to decide whether to accept more marginal examples. Using a
3067 two-stage architecture, we show that by delaying the decision for just 3% of
3068 detected true triggers in the test set, we are able to obtain a relative
3069 improvement of 66% in false rejection rate, while incurring only a negligible
3070 increase in latency.
3071 </p>
3072 </description>
3073 <guid isPermaLink="false">oai:arXiv.org:2010.15446</guid>
3074 </item>
3075 <item>
3076 <title>Capacity-achieving codes: a review on double transitivity. (arXiv:2010.15453v1 [cs.IT])</title>
3077 <link>http://fr.arxiv.org/abs/2010.15453</link>
3078 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ivanov_K/0/1/0/all/0/1">Kirill Ivanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Urbanke_R/0/1/0/all/0/1">R&#xfc;diger L. Urbanke</a></p>
3079
3080 <p>Recently it was proved that if a linear code is invariant under the action of
3081 a doubly transitive permutation group, it achieves the capacity of erasure
3082 channel. Therefore, it is of sufficient interest to classify all codes,
3083 invariant under such permutation groups. We take a step in this direction and
3084 give a review of all suitable groups and the known results on codes invariant
3085 under these groups. It turns out that there are capacity-achieving families of
3086 algebraic geometric codes.
3087 </p>
3088 </description>
3089 <guid isPermaLink="false">oai:arXiv.org:2010.15453</guid>
3090 </item>
3091 <item>
3092 <title>Scalable Federated Learning over Passive Optical Networks. (arXiv:2010.15454v1 [cs.NI])</title>
3093 <link>http://fr.arxiv.org/abs/2010.15454</link>
3094 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_J/0/1/0/all/0/1">Jun Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_L/0/1/0/all/0/1">Lei Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_J/0/1/0/all/0/1">Jiajia Chen</a></p>
3095
3096 <p>Two-step aggregation is introduced to facilitate scalable federated learning
3097 (SFL) over passive optical networks (PONs). Results reveal that the SFL keeps
3098 the required PON upstream bandwidth constant regardless of the number of
3099 involved clients, while bringing ~10% learning accuracy improvement.
3100 </p>
3101 </description>
3102 <guid isPermaLink="false">oai:arXiv.org:2010.15454</guid>
3103 </item>
3104 <item>
3105 <title>Optimal Sharing and and Fair Cost Allocation of Community Energy Storage. (arXiv:2010.15455v1 [cs.GT])</title>
3106 <link>http://fr.arxiv.org/abs/2010.15455</link>
3107 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_Y/0/1/0/all/0/1">Yu Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_G/0/1/0/all/0/1">Guoqiang Hu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Spanos_C/0/1/0/all/0/1">Costas J. Spanos</a></p>
3108
3109 <p>This paper studies an ES sharing model where multiple buildings cooperatively
3110 invest and share a community ES (CES) to harness economic benefits from on-site
3111 renewable integration and utility price arbitrage. Particularly, we formulate
3112 the problem that integrates the optimal ES sizing, operation and cost
3113 allocation as a coalition game, which are generally addressed separately in the
3114 literature. Particularly, we address the fair ex-post cost allocation which has
3115 not been well studied. To overcome the computational challenge of computing the
3116 entire information of explicit characteristic functions that takes exponential
3117 time, we propose a fair cost allocation based on nucleolus by employing a
3118 constraints generation technique. We study the fairness and computational
3119 efficiency of the method through a number of case studies. The numeric results
3120 imply that the proposed method outperforms the Shapley approach and
3121 proportional method either in computational efficiency or fairness. Notably,
3122 for the proposed method, only a small fraction of characteristic functions
3123 (2.54%) is computed to achieve the cost allocation versus the entire
3124 information required by Shapley approach. With the proposed cost allocation, we
3125 investigate the enhanced economic benefits of the CES model for individual
3126 buildings over individual ES (IES) installation. We see the CES model provides
3127 higher cost reduction to each committed buildings. Moreover, the value of
3128 storage is obviously improved (about 1.83 times) with the CES model over the
3129 IES model.
3130 </p>
3131 </description>
3132 <guid isPermaLink="false">oai:arXiv.org:2010.15455</guid>
3133 </item>
3134 <item>
3135 <title>Multilayer Clustered Graph Learning. (arXiv:2010.15456v1 [cs.LG])</title>
3136 <link>http://fr.arxiv.org/abs/2010.15456</link>
3137 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gheche_M/0/1/0/all/0/1">Mireille El Gheche</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frossard_P/0/1/0/all/0/1">Pascal Frossard</a></p>
3138
3139 <p>Multilayer graphs are appealing mathematical tools for modeling multiple
3140 types of relationship in the data. In this paper, we aim at analyzing
3141 multilayer graphs by properly combining the information provided by individual
3142 layers, while preserving the specific structure that allows us to eventually
3143 identify communities or clusters that are crucial in the analysis of graph
3144 data. To do so, we learn a clustered representative graph by solving an
3145 optimization problem that involves a data fidelity term to the observed layers,
3146 and a regularization pushing for a sparse and community-aware graph. We use the
3147 contrastive loss as a data fidelity term, in order to properly aggregate the
3148 observed layers into a representative graph. The regularization is based on a
3149 measure of graph sparsification called "effective resistance", coupled with a
3150 penalization of the first few eigenvalues of the representative graph Laplacian
3151 matrix to favor the formation of communities. The proposed optimization problem
3152 is nonconvex but fully differentiable, and thus can be solved via the projected
3153 gradient method. Experiments show that our method leads to a significant
3154 improvement w.r.t. state-of-the-art multilayer graph learning algorithms for
3155 solving clustering problems.
3156 </p>
3157 </description>
3158 <guid isPermaLink="false">oai:arXiv.org:2010.15456</guid>
3159 </item>
3160 <item>
3161 <title>FiGLearn: Filter and Graph Learning using Optimal Transport. (arXiv:2010.15457v1 [cs.LG])</title>
3162 <link>http://fr.arxiv.org/abs/2010.15457</link>
3163 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Minder_M/0/1/0/all/0/1">Matthias Minder</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farsijani_Z/0/1/0/all/0/1">Zahra Farsijani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shah_D/0/1/0/all/0/1">Dhruti Shah</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gheche_M/0/1/0/all/0/1">Mireille El Gheche</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frossard_P/0/1/0/all/0/1">Pascal Frossard</a></p>
3164
3165 <p>In many applications, a dataset can be considered as a set of observed
3166 signals that live on an unknown underlying graph structure. Some of these
3167 signals may be seen as white noise that has been filtered on the graph topology
3168 by a graph filter. Hence, the knowledge of the filter and the graph provides
3169 valuable information about the underlying data generation process and the
3170 complex interactions that arise in the dataset. We hence introduce a novel
3171 graph signal processing framework for jointly learning the graph and its
3172 generating filter from signal observations. We cast a new optimisation problem
3173 that minimises the Wasserstein distance between the distribution of the signal
3174 observations and the filtered signal distribution model. Our proposed method
3175 outperforms state-of-the-art graph learning frameworks on synthetic data. We
3176 then apply our method to a temperature anomaly dataset, and further show how
3177 this framework can be used to infer missing values if only very little
3178 information is available.
3179 </p>
3180 </description>
3181 <guid isPermaLink="false">oai:arXiv.org:2010.15457</guid>
3182 </item>
3183 <item>
3184 <title>Named Entity Recognition for Social Media Texts with Semantic Augmentation. (arXiv:2010.15458v1 [cs.CL])</title>
3185 <link>http://fr.arxiv.org/abs/2010.15458</link>
3186 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nie_Y/0/1/0/all/0/1">Yuyang Nie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_Y/0/1/0/all/0/1">Yuanhe Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wan_X/0/1/0/all/0/1">Xiang Wan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Song_Y/0/1/0/all/0/1">Yan Song</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dai_B/0/1/0/all/0/1">Bo Dai</a></p>
3187
3188 <p>Existing approaches for named entity recognition suffer from data sparsity
3189 problems when conducted on short and informal texts, especially user-generated
3190 social media content. Semantic augmentation is a potential way to alleviate
3191 this problem. Given that rich semantic information is implicitly preserved in
3192 pre-trained word embeddings, they are potential ideal resources for semantic
3193 augmentation. In this paper, we propose a neural-based approach to NER for
3194 social media texts where both local (from running text) and augmented semantics
3195 are taken into account. In particular, we obtain the augmented semantic
3196 information from a large-scale corpus, and propose an attentive semantic
3197 augmentation module and a gate module to encode and aggregate such information,
3198 respectively. Extensive experiments are performed on three benchmark datasets
3199 collected from English and Chinese social media platforms, where the results
3200 demonstrate the superiority of our approach to previous studies across all
3201 three datasets.
3202 </p>
3203 </description>
3204 <guid isPermaLink="false">oai:arXiv.org:2010.15458</guid>
3205 </item>
3206 <item>
3207 <title>Concatenated Codes for Recovery From Multiple Reads of DNA Sequences. (arXiv:2010.15461v1 [cs.IT])</title>
3208 <link>http://fr.arxiv.org/abs/2010.15461</link>
3209 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lenz_A/0/1/0/all/0/1">Andreas Lenz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maarouf_I/0/1/0/all/0/1">Issam Maarouf</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Welter_L/0/1/0/all/0/1">Lorenz Welter</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wachter_Zeh_A/0/1/0/all/0/1">Antonia Wachter-Zeh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rosnes_E/0/1/0/all/0/1">Eirik Rosnes</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Amat_A/0/1/0/all/0/1">Alexandre Graell i Amat</a></p>
3210
3211 <p>Decoding sequences that stem from multiple transmissions of a codeword over
3212 an insertion, deletion, and substitution channel is a critical component of
3213 efficient deoxyribonucleic acid (DNA) data storage systems. In this paper, we
3214 consider a concatenated coding scheme with an outer low-density parity-check
3215 code and either an inner convolutional code or a block code. We propose two new
3216 decoding algorithms for inference from multiple received sequences, both
3217 combining the inner code and channel to a joint hidden Markov model to infer
3218 symbolwise a posteriori probabilities (APPs). The first decoder computes the
3219 exact APPs by jointly decoding the received sequences, whereas the second
3220 decoder approximates the APPs by combining the results of separately decoded
3221 received sequences. Using the proposed algorithms, we evaluate the performance
3222 of decoding multiple received sequences by means of achievable information
3223 rates and Monte-Carlo simulations. We show significant performance gains
3224 compared to a single received sequence.
3225 </p>
3226 </description>
3227 <guid isPermaLink="false">oai:arXiv.org:2010.15461</guid>
3228 </item>
3229 <item>
3230 <title>Self-Supervised Video Representation Using Pretext-Contrastive Learning. (arXiv:2010.15464v1 [cs.CV])</title>
3231 <link>http://fr.arxiv.org/abs/2010.15464</link>
3232 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tao_L/0/1/0/all/0/1">Li Tao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_X/0/1/0/all/0/1">Xueting Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yamasaki_T/0/1/0/all/0/1">Toshihiko Yamasaki</a></p>
3233
3234 <p>Pretext tasks and contrastive learning have been successful in
3235 self-supervised learning for video retrieval and recognition. In this study, we
3236 analyze their optimization targets and utilize the hyper-sphere feature space
3237 to explore the connections between them, indicating the compatibility and
3238 consistency of these two different learning methods. Based on the analysis, we
3239 propose a self-supervised training method, referred as Pretext-Contrastive
3240 Learning (PCL), to learn video representations. Extensive experiments based on
3241 different combinations of pretext task baselines and contrastive losses confirm
3242 the strong agreement with their self-supervised learning targets, demonstrating
3243 the effectiveness and the generality of PCL. The combination of pretext tasks
3244 and contrastive losses showed significant improvements in both video retrieval
3245 and recognition over the corresponding baselines. And we can also outperform
3246 current state-of-the-art methods in the same manner. Further, our PCL is
3247 flexible and can be applied to almost all existing pretext task methods.
3248 </p>
3249 </description>
3250 <guid isPermaLink="false">oai:arXiv.org:2010.15464</guid>
3251 </item>
3252 <item>
3253 <title>Improving Named Entity Recognition with Attentive Ensemble of Syntactic Information. (arXiv:2010.15466v1 [cs.CL])</title>
3254 <link>http://fr.arxiv.org/abs/2010.15466</link>
3255 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nie_Y/0/1/0/all/0/1">Yuyang Nie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_Y/0/1/0/all/0/1">Yuanhe Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Song_Y/0/1/0/all/0/1">Yan Song</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ao_X/0/1/0/all/0/1">Xiang Ao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wan_X/0/1/0/all/0/1">Xiang Wan</a></p>
3256
3257 <p>Named entity recognition (NER) is highly sensitive to sentential syntactic
3258 and semantic properties where entities may be extracted according to how they
3259 are used and placed in the running text. To model such properties, one could
3260 rely on existing resources to providing helpful knowledge to the NER task; some
3261 existing studies proved the effectiveness of doing so, and yet are limited in
3262 appropriately leveraging the knowledge such as distinguishing the important
3263 ones for particular context. In this paper, we improve NER by leveraging
3264 different types of syntactic information through attentive ensemble, which
3265 functionalizes by the proposed key-value memory networks, syntax attention, and
3266 the gate mechanism for encoding, weighting and aggregating such syntactic
3267 information, respectively. Experimental results on six English and Chinese
3268 benchmark datasets suggest the effectiveness of the proposed model and show
3269 that it outperforms previous studies on all experiment datasets.
3270 </p>
3271 </description>
3272 <guid isPermaLink="false">oai:arXiv.org:2010.15466</guid>
3273 </item>
3274 <item>
3275 <title>Emergence of Spatial Coordinates via Exploration. (arXiv:2010.15469v1 [cs.LG])</title>
3276 <link>http://fr.arxiv.org/abs/2010.15469</link>
3277 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Laflaquiere_A/0/1/0/all/0/1">Alban Laflaqui&#xe8;re</a></p>
3278
3279 <p>Spatial knowledge is a fundamental building block for the development of
3280 advanced perceptive and cognitive abilities. Traditionally, in robotics, the
3281 Euclidean (x,y,z) coordinate system and the agent's forward model are defined a
3282 priori. We show that a naive agent can autonomously build an internal
3283 coordinate system, with the same dimension and metric regularity as the
3284 external space, simply by learning to predict the outcome of sensorimotor
3285 transitions in a self-supervised way.
3286 </p>
3287 </description>
3288 <guid isPermaLink="false">oai:arXiv.org:2010.15469</guid>
3289 </item>
3290 <item>
3291 <title>Hybrid mimetic finite-difference and virtual element formulation for coupled poromechanics. (arXiv:2010.15470v1 [math.NA])</title>
3292 <link>http://fr.arxiv.org/abs/2010.15470</link>
3293 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Borio_A/0/1/0/all/0/1">Andrea Borio</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Hamon_F/0/1/0/all/0/1">Fran&#xe7;ois Hamon</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Castelletto_N/0/1/0/all/0/1">Nicola Castelletto</a>, <a href="http://fr.arxiv.org/find/math/1/au:+White_J/0/1/0/all/0/1">Joshua A. White</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Settgast_R/0/1/0/all/0/1">Randolph R. Settgast</a></p>
3294
3295 <p>We present a hybrid mimetic finite-difference and virtual element formulation
3296 for coupled single-phase poromechanics on unstructured meshes. The key
3297 advantage of the scheme is that it is convergent on complex meshes containing
3298 highly distorted cells with arbitrary shapes. We use a local pressure-jump
3299 stabilization method based on unstructured macro-elements to prevent the
3300 development of spurious pressure modes in incompressible problems approaching
3301 undrained conditions. A scalable linear solution strategy is obtained using a
3302 block-triangular preconditioner designed specifically for the saddle-point
3303 systems arising from the proposed discretization. The accuracy and efficiency
3304 of our approach are demonstrated numerically on two-dimensional benchmark
3305 problems.
3306 </p>
3307 </description>
3308 <guid isPermaLink="false">oai:arXiv.org:2010.15470</guid>
3309 </item>
3310 <item>
3311 <title>Iteratively reweighted greedy set cover. (arXiv:2010.15476v1 [cs.DS])</title>
3312 <link>http://fr.arxiv.org/abs/2010.15476</link>
3313 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Alexa_M/0/1/0/all/0/1">Marc Alexa</a></p>
3314
3315 <p>We empirically analyze a simple heuristic for large sparse set cover
3316 problems. It uses the weighted greedy algorithm as a basic building block. By
3317 multiplicative updates of the weights attached to the elements, the greedy
3318 solution is iteratively improved. The implementation of this algorithm is
3319 trivial and the algorithm is essentially free of parameters that would require
3320 tuning. More iterations can only improve the solution. This set of features
3321 makes the approach attractive for practical problems.
3322 </p>
3323 </description>
3324 <guid isPermaLink="false">oai:arXiv.org:2010.15476</guid>
3325 </item>
3326 <item>
3327 <title>Learned infinite elements. (arXiv:2010.15479v1 [math.NA])</title>
3328 <link>http://fr.arxiv.org/abs/2010.15479</link>
3329 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Hohage_T/0/1/0/all/0/1">Thorsten Hohage</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lehrenfeld_C/0/1/0/all/0/1">Christoph Lehrenfeld</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Preuss_J/0/1/0/all/0/1">Janosch Preuss</a></p>
3330
3331 <p>We study the numerical solution of scalar time-harmonic wave equations on
3332 unbounded domains which can be split into a bounded interior domain of primary
3333 interest and an exterior domain with separable geometry. To compute the
3334 solution in the interior domain, approximations to the Dirichlet-to-Neumann
3335 (DtN) map of the exterior domain have to be imposed as transparent boundary
3336 conditions on the artificial coupling boundary. Although the DtN map can be
3337 computed by separation of variables, it is a nonlocal operator with dense
3338 matrix representations, and hence computationally inefficient. Therefore,
3339 approximations of DtN maps by sparse matrices, usually involving additional
3340 degrees of freedom, have been studied intensively in the literature using a
3341 variety of approaches including different types of infinite elements, local
3342 non-reflecting boundary conditions, and perfectly matched layers. The entries
3343 of these sparse matrices are derived analytically, e.g. from transformations or
3344 asymptotic expansions of solutions to the differential equation in the exterior
3345 domain. In contrast, in this paper we propose to `learn' the matrix entries
3346 from the DtN map in its separated form by solving an optimization problem as a
3347 preprocessing step. Theoretical considerations suggest that the approximation
3348 quality of learned infinite elements improves exponentially with increasing
3349 number of infinite element degrees of freedom, which is confirmed in numerical
3350 experiments. These numerical studies also show that learned infinite elements
3351 outperform state-of-the-art methods for the Helmholtz equation. At the same
3352 time, learned infinite elements are much more flexible than traditional methods
3353 as they, e.g., work similarly well for exterior domains involving strong
3354 reflections, for example, for the atmosphere of the Sun, which is strongly
3355 inhomogeneous and exhibits reflections at the corona.
3356 </p>
3357 </description>
3358 <guid isPermaLink="false">oai:arXiv.org:2010.15479</guid>
3359 </item>
3360 <item>
3361 <title>Convergence of Constrained Anderson Acceleration. (arXiv:2010.15482v1 [math.NA])</title>
3362 <link>http://fr.arxiv.org/abs/2010.15482</link>
3363 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Barre_M/0/1/0/all/0/1">Mathieu Barr&#xe9;</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Taylor_A/0/1/0/all/0/1">Adrien Taylor</a>, <a href="http://fr.arxiv.org/find/math/1/au:+dAspremont_A/0/1/0/all/0/1">Alexandre d&#x27;Aspremont</a></p>
3364
3365 <p>We prove non asymptotic linear convergence rates for the constrained Anderson
3366 acceleration extrapolation scheme. These guarantees come from new upper bounds
3367 on the constrained Chebyshev problem, which consists in minimizing the maximum
3368 absolute value of a polynomial on a bounded real interval with $l_1$
3369 constraints on its coefficients vector. Constrained Anderson Acceleration has a
3370 numerical cost comparable to that of the original scheme.
3371 </p>
3372 </description>
3373 <guid isPermaLink="false">oai:arXiv.org:2010.15482</guid>
3374 </item>
3375 <item>
3376 <title>Beyond cross-entropy: learning highly separable feature distributions for robust and accurate classification. (arXiv:2010.15487v1 [cs.CV])</title>
3377 <link>http://fr.arxiv.org/abs/2010.15487</link>
3378 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ali_A/0/1/0/all/0/1">Arslan Ali</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Migliorati_A/0/1/0/all/0/1">Andrea Migliorati</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bianchi_T/0/1/0/all/0/1">Tiziano Bianchi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Magli_E/0/1/0/all/0/1">Enrico Magli</a></p>
3379
3380 <p>Deep learning has shown outstanding performance in several applications
3381 including image classification. However, deep classifiers are known to be
3382 highly vulnerable to adversarial attacks, in that a minor perturbation of the
3383 input can easily lead to an error. Providing robustness to adversarial attacks
3384 is a very challenging task especially in problems involving a large number of
3385 classes, as it typically comes at the expense of an accuracy decrease. In this
3386 work, we propose the Gaussian class-conditional simplex (GCCS) loss: a novel
3387 approach for training deep robust multiclass classifiers that provides
3388 adversarial robustness while at the same time achieving or even surpassing the
3389 classification accuracy of state-of-the-art methods. Differently from other
3390 frameworks, the proposed method learns a mapping of the input classes onto
3391 target distributions in a latent space such that the classes are linearly
3392 separable. Instead of maximizing the likelihood of target labels for individual
3393 samples, our objective function pushes the network to produce feature
3394 distributions yielding high inter-class separation. The mean values of the
3395 distributions are centered on the vertices of a simplex such that each class is
3396 at the same distance from every other class. We show that the regularization of
3397 the latent space based on our approach yields excellent classification accuracy
3398 and inherently provides robustness to multiple adversarial attacks, both
3399 targeted and untargeted, outperforming state-of-the-art approaches over
3400 challenging datasets.
3401 </p>
3402 </description>
3403 <guid isPermaLink="false">oai:arXiv.org:2010.15487</guid>
3404 </item>
3405 <item>
3406 <title>Linearizing Combinators. (arXiv:2010.15490v1 [math.CT])</title>
3407 <link>http://fr.arxiv.org/abs/2010.15490</link>
3408 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Cockett_R/0/1/0/all/0/1">Robin Cockett</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lemay_J/0/1/0/all/0/1">Jean-Simon Pacaud Lemay</a></p>
3409
3410 <p>In 2017, Bauer, Johnson, Osborne, Riehl, and Tebbe (BJORT) showed that the
3411 Abelian functor calculus provides an example of a Cartesian differential
3412 category. The definition of a Cartesian differential category is based on a
3413 differential combinator which directly formalizes the total derivative from
3414 multivariable calculus. However, in the aforementioned work the authors used
3415 techniques from Goodwillie's functor calculus to establish a linearization
3416 process from which they then derived a differential combinator. This raised the
3417 question of what the precise relationship between linearization and having a
3418 differential combinator might be.
3419 </p>
3420 <p>In this paper, we introduce the notion of a linearizing combinator which
3421 abstracts linearization in the Abelian functor calculus. We then use it to
3422 provide an alternative axiomatization of a Cartesian differential category.
3423 Every Cartesian differential category comes equipped with a canonical
3424 linearizing combinator obtained by differentiation at zero. Conversely, a
3425 differential combinator can be constructed \`a la BJORT when one has a system
3426 of partial linearizing combinators in each context. Thus, while linearizing
3427 combinators do provide an alternative axiomatization of Cartesian differential
3428 categories, an explicit notion of partial linearization is required. This is in
3429 contrast to the situation for differential combinators where partial
3430 differentiation is automatic in the presence of total differentiation. The
3431 ability to form a system of partial linearizing combinators from a total
3432 linearizing combinator, while not being possible in general, is possible when
3433 the setting is Cartesian closed.
3434 </p>
3435 </description>
3436 <guid isPermaLink="false">oai:arXiv.org:2010.15490</guid>
3437 </item>
3438 <item>
3439 <title>A Novel Fast 3D Single Image Super-Resolution Algorithm. (arXiv:2010.15491v1 [eess.IV])</title>
3440 <link>http://fr.arxiv.org/abs/2010.15491</link>
3441 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Tuador_N/0/1/0/all/0/1">Nwigbo Kenule Tuador</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pham_D/0/1/0/all/0/1">Duong Hung Pham</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Michetti_J/0/1/0/all/0/1">J&#xe9;r&#xf4;me Michetti</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Basarab_A/0/1/0/all/0/1">Adrian Basarab</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kouame_D/0/1/0/all/0/1">Denis Kouam&#xe9;</a></p>
3442
3443 <p>This paper introduces a novel computationally efficient method of solving the
3444 3D single image super-resolution (SR) problem, i.e., reconstruction of a
3445 high-resolution volume from its low-resolution counterpart. The main
3446 contribution lies in the original way of handling simultaneously the associated
3447 decimation and blurring operators, based on their underlying properties in the
3448 frequency domain. In particular, the proposed decomposition technique of the 3D
3449 decimation operator allows a straightforward implementation for Tikhonov
3450 regularization, and can be further used to take into consideration other
3451 regularization functions such as the total variation, enabling the
3452 computational cost of state-of-the-art algorithms to be considerably decreased.
3453 Numerical experiments carried out showed that the proposed approach outperforms
3454 existing 3D SR methods.
3455 </p>
3456 </description>
3457 <guid isPermaLink="false">oai:arXiv.org:2010.15491</guid>
3458 </item>
3459 <item>
3460 <title>"What, not how" -- Solving an under-actuated insertion task from scratch. (arXiv:2010.15492v1 [cs.RO])</title>
3461 <link>http://fr.arxiv.org/abs/2010.15492</link>
3462 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Vezzani_G/0/1/0/all/0/1">Giulia Vezzani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Neunert_M/0/1/0/all/0/1">Michael Neunert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wulfmeier_M/0/1/0/all/0/1">Markus Wulfmeier</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jeong_R/0/1/0/all/0/1">Rae Jeong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lampe_T/0/1/0/all/0/1">Thomas Lampe</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Siegel_N/0/1/0/all/0/1">Noah Siegel</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hafner_R/0/1/0/all/0/1">Roland Hafner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Abdolmaleki_A/0/1/0/all/0/1">Abbas Abdolmaleki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Riedmiller_M/0/1/0/all/0/1">Martin Riedmiller</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nori_F/0/1/0/all/0/1">Francesco Nori</a></p>
3463
3464 <p>Robot manipulation requires a complex set of skills that need to be carefully
3465 combined and coordinated to solve a task. Yet, most ReinforcementLearning (RL)
3466 approaches in robotics study tasks which actually consist only of a single
3467 manipulation skill, such as grasping an object or inserting a pre-grasped
3468 object. As a result the skill ('how' to solve the task) but not the actual goal
3469 of a complete manipulation ('what' to solve) is specified. In contrast, we
3470 study a complex manipulation goal that requires an agent to learn and combine
3471 diverse manipulation skills. We propose a challenging, highly under-actuated
3472 peg-in-hole task with a free, rotational asymmetrical peg, requiring a broad
3473 range of manipulation skills. While correct peg (re-)orientation is a
3474 requirement for successful insertion, there is no reward associated with it.
3475 Hence an agent needs to understand this pre-condition and learn the skill to
3476 fulfil it. The final insertion reward is sparse, allowing freedom in the
3477 solution and leading to complex emerging behaviour not envisioned during the
3478 task design. We tackle the problem in a multi-task RL framework using Scheduled
3479 Auxiliary Control (SAC-X) combined with Regularized Hierarchical Policy
3480 Optimization (RHPO) which successfully solves the task in simulation and from
3481 scratch on a single robot where data is severely limited.
3482 </p>
3483 </description>
3484 <guid isPermaLink="false">oai:arXiv.org:2010.15492</guid>
3485 </item>
3486 <item>
3487 <title>Enhancing Vulnerable Road User Safety: A Survey of Existing Practices and Consideration for Using Mobile Devices for V2X Connections. (arXiv:2010.15502v1 [cs.NI])</title>
3488 <link>http://fr.arxiv.org/abs/2010.15502</link>
3489 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dasanayaka_N/0/1/0/all/0/1">Nishanthi Dasanayaka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hasan_K/0/1/0/all/0/1">Khondokar Fida Hasan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_C/0/1/0/all/0/1">Charles Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_Y/0/1/0/all/0/1">Yanming Feng</a></p>
3490
3491 <p>Vulnerable road users (VRUs) such as pedestrians, cyclists and motorcyclists
3492 are at the highest risk in the road traffic environment. Globally, over half of
3493 road traffic deaths are vulnerable road users. Although substantial efforts are
3494 being made to improve VRU safety from engineering solutions to law enforcement,
3495 the death toll of VRUs' continues to rise. The emerging technology, Cooperative
3496 Intelligent Transportation System (C-ITS), has the proven potential to enhance
3497 road safety by enabling wireless communication to exchange information among
3498 road users. Such exchanged information is utilized for creating situational
3499 awareness and detecting any potential collisions in advance to take necessary
3500 measures to avoid any possible road casualties. The current state-of-the-art
3501 solutions of C-ITS for VRU safety, however, are limited to unidirectional
3502 communication where VRUs are only responsible for alerting their presence to
3503 drivers with the intention of avoiding collisions. This one-way interaction is
3504 substantially limiting the enormous potential of C-ITS which otherwise can be
3505 employed to devise a more effective solution for the VRU safety where VRU can
3506 be equipped with bidirectional communication with full C-ITS functionalities.
3507 To address such problems and to explore better C-ITS solution suggestions for
3508 VRU, this paper reviewed and evaluated the current technologies and safety
3509 methods proposed for VRU safety over the period 2007-2020. Later, it presents
3510 the design considerations for a cellular-based Vehicle-to-VRU (V2VRU)
3511 communication system along with potential challenges of a cellular-based
3512 approach to provide necessary recommendations.
3513 </p>
3514 </description>
3515 <guid isPermaLink="false">oai:arXiv.org:2010.15502</guid>
3516 </item>
3517 <item>
3518 <title>A stochastic $\theta$-SEIHRD model: adding randomness to the COVID-19 spread. (arXiv:2010.15504v1 [math.NA])</title>
3519 <link>http://fr.arxiv.org/abs/2010.15504</link>
3520 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Leitao_A/0/1/0/all/0/1">&#xc1;lvaro Leitao</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Vazquez_C/0/1/0/all/0/1">Carlos V&#xe1;zquez</a></p>
3521
3522 <p>In this article we mainly extend the deterministic model developed in [10] to
3523 a stochastic setting. More precisely, we incorporated randomness in some
3524 coefficients by assuming that they follow a prescribed stochastic dynamics. In
3525 this way, the model variables are now represented by stochastic process, that
3526 can be simulated by appropriately solve the system of stochastic differential
3527 equations. Thus, the model becomes more complete and flexible than the
3528 deterministic analogous, as it incorporates additional uncertainties which are
3529 present in more realistic situations. In particular, confidence intervals for
3530 the main variables and worst case scenarios can be computed.
3531 </p>
3532 </description>
3533 <guid isPermaLink="false">oai:arXiv.org:2010.15504</guid>
3534 </item>
3535 <item>
3536 <title>Dynamic Formation Reshaping Based on Point Set Registration in a Swarm of Drones. (arXiv:2010.15506v1 [cs.RO])</title>
3537 <link>http://fr.arxiv.org/abs/2010.15506</link>
3538 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-Hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_M/0/1/0/all/0/1">Muhammad Mehboob Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3539
3540 <p>This work focuses on the formation reshaping in an optimized manner in
3541 autonomous swarm of drones. Here, the two main problems are: 1) how to break
3542 and reshape the initial formation in an optimal manner, and 2) how to do such
3543 reformation while minimizing the overall deviation of the drones and the
3544 overall time, i.e., without slowing down. To address the first problem, we
3545 introduce a set of routines for the drones/agents to follow while reshaping to
3546 a secondary formation shape. And the second problem is resolved by utilizing
3547 the temperature function reduction technique, originally used in the point set
3548 registration process. The goal is to be able to dynamically reform the shape of
3549 multi-agent based swarm in near-optimal manner while going through narrow
3550 openings between, for instance obstacles, and then bringing the agents back to
3551 their original shape after passing through the narrow passage using point set
3552 registration technique.
3553 </p>
3554 </description>
3555 <guid isPermaLink="false">oai:arXiv.org:2010.15506</guid>
3556 </item>
3557 <item>
3558 <title>Dynamic Resource-aware Corner Detection for Bio-inspired Vision Sensors. (arXiv:2010.15507v1 [cs.CV])</title>
3559 <link>http://fr.arxiv.org/abs/2010.15507</link>
3560 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miele_A/0/1/0/all/0/1">Antonio Miele</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3561
3562 <p>Event-based cameras are vision devices that transmit only brightness changes
3563 with low latency and ultra-low power consumption. Such characteristics make
3564 event-based cameras attractive in the field of localization and object tracking
3565 in resource-constrained systems. Since the number of generated events in such
3566 cameras is huge, the selection and filtering of the incoming events are
3567 beneficial from both increasing the accuracy of the features and reducing the
3568 computational load. In this paper, we present an algorithm to detect
3569 asynchronous corners from a stream of events in real-time on embedded systems.
3570 The algorithm is called the Three Layer Filtering-Harris or TLF-Harris
3571 algorithm. The algorithm is based on an events' filtering strategy whose
3572 purpose is 1) to increase the accuracy by deliberately eliminating some
3573 incoming events, i.e., noise, and 2) to improve the real-time performance of
3574 the system, i.e., preserving a constant throughput in terms of input events per
3575 second, by discarding unnecessary events with a limited accuracy loss. An
3576 approximation of the Harris algorithm, in turn, is used to exploit its
3577 high-quality detection capability with a low-complexity implementation to
3578 enable seamless real-time performance on embedded computing platforms. The
3579 proposed algorithm is capable of selecting the best corner candidate among
3580 neighbors and achieves an average execution time savings of 59 % compared with
3581 the conventional Harris score. Moreover, our approach outperforms the competing
3582 methods, such as eFAST, eHarris, and FA-Harris, in terms of real-time
3583 performance, and surpasses Arc* in terms of accuracy.
3584 </p>
3585 </description>
3586 <guid isPermaLink="false">oai:arXiv.org:2010.15507</guid>
3587 </item>
3588 <item>
3589 <title>FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement. (arXiv:2010.15508v1 [eess.AS])</title>
3590 <link>http://fr.arxiv.org/abs/2010.15508</link>
3591 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Horaud_R/0/1/0/all/0/1">Radu Horaud</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_X/0/1/0/all/0/1">Xiaofei Li</a></p>
3592
3593 <p>This paper proposes a full-band and sub-band fusion model, named as
3594 FullSubNet, for single-channel real-time speech enhancement. Full-band and
3595 sub-band refer to the models that input full-band and sub-band noisy spectral
3596 feature, output full-band and sub-band speech target, respectively. The
3597 sub-band model processes each frequency independently. Its input consists of
3598 one frequency and several context frequencies. The output is the prediction of
3599 the clean speech target for the corresponding frequency. These two types of
3600 models have distinct characteristics. The full-band model can capture the
3601 global spectral context and the long-distance cross-band dependencies. However,
3602 it lacks the ability to modeling signal stationarity and attending the local
3603 spectral pattern. The sub-band model is just the opposite. In our proposed
3604 FullSubNet, we connect a pure full-band model and a pure sub-band model
3605 sequentially and use practical joint training to integrate these two types of
3606 models' advantages. We conducted experiments on the DNS challenge (INTERSPEECH
3607 2020) dataset to evaluate the proposed method. Experimental results show that
3608 full-band and sub-band information are complementary, and the FullSubNet can
3609 effectively integrate them. Besides, the performance of the FullSubNet also
3610 exceeds that of the top-ranked methods in the DNS Challenge (INTERSPEECH 2020).
3611 </p>
3612 </description>
3613 <guid isPermaLink="false">oai:arXiv.org:2010.15508</guid>
3614 </item>
3615 <item>
3616 <title>Night vision obstacle detection and avoidance based on Bio-Inspired Vision Sensors. (arXiv:2010.15509v1 [cs.CV])</title>
3617 <link>http://fr.arxiv.org/abs/2010.15509</link>
3618 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_M/0/1/0/all/0/1">Muhammad Mehboob Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3619
3620 <p>Moving towards autonomy, unmanned vehicles rely heavily on state-of-the-art
3621 collision avoidance systems (CAS). However, the detection of obstacles
3622 especially during night-time is still a challenging task since the lighting
3623 conditions are not sufficient for traditional cameras to function properly.
3624 Therefore, we exploit the powerful attributes of event-based cameras to perform
3625 obstacle detection in low lighting conditions. Event cameras trigger events
3626 asynchronously at high output temporal rate with high dynamic range of up to
3627 120 $dB$. The algorithm filters background activity noise and extracts objects
3628 using robust Hough transform technique. The depth of each detected object is
3629 computed by triangulating 2D features extracted utilising LC-Harris. Finally,
3630 asynchronous adaptive collision avoidance (AACA) algorithm is applied for
3631 effective avoidance. Qualitative evaluation is compared using event-camera and
3632 traditional camera.
3633 </p>
3634 </description>
3635 <guid isPermaLink="false">oai:arXiv.org:2010.15509</guid>
3636 </item>
3637 <item>
3638 <title>Asynchronous Corner Tracking Algorithm based on Lifetime of Events for DAVIS Cameras. (arXiv:2010.15510v1 [cs.CV])</title>
3639 <link>http://fr.arxiv.org/abs/2010.15510</link>
3640 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mohamed_S/0/1/0/all/0/1">Sherif A.S. Mohamed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yasin_J/0/1/0/all/0/1">Jawad N. Yasin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haghbayan_M/0/1/0/all/0/1">Mohammad-Hashem Haghbayan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Miele_A/0/1/0/all/0/1">Antonio Miele</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heikkonen_J/0/1/0/all/0/1">Jukka Heikkonen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tenhunen_H/0/1/0/all/0/1">Hannu Tenhunen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Plosila_J/0/1/0/all/0/1">Juha Plosila</a></p>
3641
3642 <p>Event cameras, i.e., the Dynamic and Active-pixel Vision Sensor (DAVIS) ones,
3643 capture the intensity changes in the scene and generates a stream of events in
3644 an asynchronous fashion. The output rate of such cameras can reach up to 10
3645 million events per second in high dynamic environments. DAVIS cameras use novel
3646 vision sensors that mimic human eyes. Their attractive attributes, such as high
3647 output rate, High Dynamic Range (HDR), and high pixel bandwidth, make them an
3648 ideal solution for applications that require high-frequency tracking. Moreover,
3649 applications that operate in challenging lighting scenarios can exploit the
3650 high HDR of event cameras, i.e., 140 dB compared to 60 dB of traditional
3651 cameras. In this paper, a novel asynchronous corner tracking method is proposed
3652 that uses both events and intensity images captured by a DAVIS camera. The
3653 Harris algorithm is used to extract features, i.e., frame-corners from
3654 keyframes, i.e., intensity images. Afterward, a matching algorithm is used to
3655 extract event-corners from the stream of events. Events are solely used to
3656 perform asynchronous tracking until the next keyframe is captured. Neighboring
3657 events, within a window size of 5x5 pixels around the event-corner, are used to
3658 calculate the velocity and direction of extracted event-corners by fitting the
3659 2D planar using a randomized Hough transform algorithm. Experimental evaluation
3660 showed that our approach is able to update the location of the extracted
3661 corners up to 100 times during the blind time of traditional cameras, i.e.,
3662 between two consecutive intensity images.
3663 </p>
3664 </description>
3665 <guid isPermaLink="false">oai:arXiv.org:2010.15510</guid>
3666 </item>
3667 <item>
3668 <title>An Exact Solution Path Algorithm for SLOPE and Quasi-Spherical OSCAR. (arXiv:2010.15511v1 [stat.ME])</title>
3669 <link>http://fr.arxiv.org/abs/2010.15511</link>
3670 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Nomura_S/0/1/0/all/0/1">Shunichi Nomura</a></p>
3671
3672 <p>Sorted $L_1$ penalization estimator (SLOPE) is a regularization technique for
3673 sorted absolute coefficients in high-dimensional regression. By arbitrarily
3674 setting its regularization weights $\lambda$ under the monotonicity constraint,
3675 SLOPE can have various feature selection and clustering properties. On weight
3676 tuning, the selected features and their clusters are very sensitive to the
3677 tuning parameters. Moreover, the exhaustive tracking of their changes is
3678 difficult using grid search methods. This study presents a solution path
3679 algorithm that provides the complete and exact path of solutions for SLOPE in
3680 fine-tuning regularization weights. A simple optimality condition for SLOPE is
3681 derived and used to specify the next splitting point of the solution path. This
3682 study also proposes a new design of a regularization sequence $\lambda$ for
3683 feature clustering, which is called the quasi-spherical and octagonal shrinkage
3684 and clustering algorithm for regression (QS-OSCAR). QS-OSCAR is designed with a
3685 contour surface of the regularization terms most similar to a sphere. Among
3686 several regularization sequence designs, sparsity and clustering performance
3687 are compared through simulation studies. The numerical observations show that
3688 QS-OSCAR performs feature clustering more efficiently than other designs.
3689 </p>
3690 </description>
3691 <guid isPermaLink="false">oai:arXiv.org:2010.15511</guid>
3692 </item>
3693 <item>
3694 <title>UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition. (arXiv:2010.15521v1 [eess.AS])</title>
3695 <link>http://fr.arxiv.org/abs/2010.15521</link>
3696 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_Z/0/1/0/all/0/1">Zhiyu Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_H/0/1/0/all/0/1">Hui Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Batushiren/0/1/0/all/0/1">Batushiren</a></p>
3697
3698 <p>Speech enhancement at extremely low signal-to-noise ratio (SNR) condition is
3699 a very challenging problem and rarely investigated in previous works. This
3700 paper proposes a robust speech enhancement approach (UNetGAN) based on U-Net
3701 and generative adversarial learning to deal with this problem. This approach
3702 consists of a generator network and a discriminator network, which operate
3703 directly in the time domain. The generator network adopts a U-Net like
3704 structure and employs dilated convolution in the bottleneck of it. We evaluate
3705 the performance of the UNetGAN at low SNR conditions (up to -20dB) on the
3706 public benchmark. The result demonstrates that it significantly improves the
3707 speech quality and substantially outperforms the representative deep learning
3708 models, including SEGAN, cGAN fo SE, Bidirectional LSTM using phase-sensitive
3709 spectrum approximation cost function (PSA-BLSTM) and Wave-U-Net regarding
3710 Short-Time Objective Intelligibility (STOI) and Perceptual evaluation of speech
3711 quality (PESQ).
3712 </p>
3713 </description>
3714 <guid isPermaLink="false">oai:arXiv.org:2010.15521</guid>
3715 </item>
3716 <item>
3717 <title>A brief overview of swarm intelligence-based algorithms for numerical association rule mining. (arXiv:2010.15524v1 [cs.NE])</title>
3718 <link>http://fr.arxiv.org/abs/2010.15524</link>
3719 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fister_I/0/1/0/all/0/1">Iztok Fister Jr.</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fister_I/0/1/0/all/0/1">Iztok Fister</a></p>
3720
3721 <p>Numerical Association Rule Mining is a popular variant of Association Rule
3722 Mining, where numerical attributes are handled without discretization. This
3723 means that the algorithms for dealing with this problem can operate directly,
3724 not only with categorical, but also with numerical attributes. Until recently,
3725 a big portion of these algorithms were based on a stochastic nature-inspired
3726 population-based paradigm. As a result, evolutionary and swarm
3727 intelligence-based algorithms showed big efficiency for dealing with the
3728 problem. In line with this, the main mission of this chapter is to make a
3729 historical overview of swarm intelligence-based algorithms for Numerical
3730 Association Rule Mining, as well as to present the main features of these
3731 algorithms for the observed problem. A taxonomy of the algorithms was proposed
3732 on the basis of the applied features found in this overview. Challenges,
3733 waiting in the future, finish this paper.
3734 </p>
3735 </description>
3736 <guid isPermaLink="false">oai:arXiv.org:2010.15524</guid>
3737 </item>
3738 <item>
3739 <title>Self-Learning Threshold-Based Load Balancing. (arXiv:2010.15525v1 [cs.PF])</title>
3740 <link>http://fr.arxiv.org/abs/2010.15525</link>
3741 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goldsztajn_D/0/1/0/all/0/1">Diego Goldsztajn</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Borst_S/0/1/0/all/0/1">Sem C. Borst</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Leeuwaarden_J/0/1/0/all/0/1">Johan S. H. van Leeuwaarden</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Mukherjee_D/0/1/0/all/0/1">Debankur Mukherjee</a> (3), <a href="http://fr.arxiv.org/find/cs/1/au:+Whiting_P/0/1/0/all/0/1">Philip A. Whiting</a> (4) ((1) Eindhoven University of Technology, (2) Tilburg University, (3) Georgia Institute of Technology, (4) Macquarie University)</p>
3742
3743 <p>We consider a large-scale service system where incoming tasks have to be
3744 instantaneously dispatched to one out of many parallel server pools. The
3745 dispatcher uses a threshold for balancing the load and keeping the maximum
3746 number of concurrent tasks across server pools low. We demonstrate that such a
3747 policy is optimal on the fluid and diffusion scales for a suitable threshold
3748 value, while only involving a small communication overhead. In order to set the
3749 threshold optimally, it is important, however, to learn the load of the system,
3750 which may be uncertain or even time-varying. For that purpose, we design a
3751 control rule for tuning the threshold in an online manner. We provide
3752 conditions which guarantee that this adaptive threshold settles at the optimal
3753 value, along with estimates for the time until this happens.
3754 </p>
3755 </description>
3756 <guid isPermaLink="false">oai:arXiv.org:2010.15525</guid>
3757 </item>
3758 <item>
3759 <title>A comparison of automatic multi-tissue segmentation methods of the human fetal brain using the FeTA Dataset. (arXiv:2010.15526v1 [eess.IV])</title>
3760 <link>http://fr.arxiv.org/abs/2010.15526</link>
3761 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Payette_K/0/1/0/all/0/1">Kelly Payette</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Dumast_P/0/1/0/all/0/1">Priscille de Dumast</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kebiri_H/0/1/0/all/0/1">Hamza Kebiri</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ezhov_I/0/1/0/all/0/1">Ivan Ezhov</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Paetzold_J/0/1/0/all/0/1">Johannes C. Paetzold</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shit_S/0/1/0/all/0/1">Suprosanna Shit</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Iqbal_A/0/1/0/all/0/1">Asim Iqbal</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Khan_R/0/1/0/all/0/1">Romesa Khan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kottke_R/0/1/0/all/0/1">Raimund Kottke</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Grehten_P/0/1/0/all/0/1">Patrice Grehten</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ji_H/0/1/0/all/0/1">Hui Ji</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lanczi_L/0/1/0/all/0/1">Levente Lanczi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Nagy_M/0/1/0/all/0/1">Marianna Nagy</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Beresova_M/0/1/0/all/0/1">Monika Beresova</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Nguyen_T/0/1/0/all/0/1">Thi Dao Nguyen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Natalucci_G/0/1/0/all/0/1">Giancarlo Natalucci</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Karayannis_T/0/1/0/all/0/1">Theofanis Karayannis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Menze_B/0/1/0/all/0/1">Bjoern Menze</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Cuadra_M/0/1/0/all/0/1">Meritxell Bach Cuadra</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jakab_A/0/1/0/all/0/1">Andras Jakab</a></p>
3762
3763 <p>It is critical to quantitatively analyse the developing human fetal brain in
3764 order to fully understand neurodevelopment in both normal fetuses and those
3765 with congenital disorders. To facilitate this analysis, automatic multi-tissue
3766 fetal brain segmentation algorithms are needed, which in turn requires open
3767 databases of segmented fetal brains. Here we introduce a publicly available
3768 database of 50 manually segmented pathological and non-pathological fetal
3769 magnetic resonance brain volume reconstructions across a range of gestational
3770 ages (20 to 33 weeks) into 7 different tissue categories (external
3771 cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, deep
3772 grey matter, brainstem/spinal cord). In addition, we quantitatively evaluate
3773 the accuracy of several automatic multi-tissue segmentation algorithms of the
3774 developing human fetal brain. Four research groups participated, submitting a
3775 total of 10 algorithms, demonstrating the benefits the database for the
3776 development of automatic algorithms.
3777 </p>
3778 </description>
3779 <guid isPermaLink="false">oai:arXiv.org:2010.15526</guid>
3780 </item>
3781 <item>
3782 <title>On the robustness of kernel-based pairwise learning. (arXiv:2010.15527v1 [stat.ML])</title>
3783 <link>http://fr.arxiv.org/abs/2010.15527</link>
3784 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Gensler_P/0/1/0/all/0/1">Patrick Gensler</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Christmann_A/0/1/0/all/0/1">Andreas Christmann</a></p>
3785
3786 <p>It is shown that many results on the statistical robustness of kernel-based
3787 pairwise learning can be derived under basically no assumptions on the input
3788 and output spaces. In particular neither moment conditions on the conditional
3789 distribution of Y given X = x nor the boundedness of the output space is
3790 needed. We obtain results on the existence and boundedness of the influence
3791 function and show qualitative robustness of the kernel-based estimator. The
3792 present paper generalizes results by Christmann and Zhou (2016) by allowing the
3793 prediction function to take two arguments and can thus be applied in a variety
3794 of situations such as ranking.
3795 </p>
3796 </description>
3797 <guid isPermaLink="false">oai:arXiv.org:2010.15527</guid>
3798 </item>
3799 <item>
3800 <title>An End to End Network Architecture for Fundamental Matrix Estimation. (arXiv:2010.15528v1 [cs.CV])</title>
3801 <link>http://fr.arxiv.org/abs/2010.15528</link>
3802 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yesheng Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_X/0/1/0/all/0/1">Xu Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_D/0/1/0/all/0/1">Dahong Qian</a></p>
3803
3804 <p>In this paper, we present a novel end-to-end network architecture to estimate
3805 fundamental matrix directly from stereo images. To establish a complete working
3806 pipeline, different deep neural networks in charge of finding correspondences
3807 in images, performing outlier rejection and calculating fundamental matrix, are
3808 integrated into an end-to-end network architecture.
3809 </p>
3810 <p>To well train the network and preserve geometry properties of fundamental
3811 matrix, a new loss function is introduced. To evaluate the accuracy of
3812 estimated fundamental matrix more reasonably, we design a new evaluation metric
3813 which is highly consistent with visualization result. Experiments conducted on
3814 both outdoor and indoor data-sets show that this network outperforms
3815 traditional methods as well as previous deep learning based methods on various
3816 metrics and achieves significant performance improvements.
3817 </p>
3818 </description>
3819 <guid isPermaLink="false">oai:arXiv.org:2010.15528</guid>
3820 </item>
3821 <item>
3822 <title>Probabilistic interval predictor based on dissimilarity functions. (arXiv:2010.15530v1 [eess.SY])</title>
3823 <link>http://fr.arxiv.org/abs/2010.15530</link>
3824 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Carnerero_A/0/1/0/all/0/1">A. Daniel Carnerero</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ramirez_D/0/1/0/all/0/1">Daniel R. Ramirez</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Alamo_T/0/1/0/all/0/1">Teodoro Alamo</a></p>
3825
3826 <p>This work presents a new method to obtain probabilistic interval predictions
3827 of a dynamical system. The method uses stored past system measurements to
3828 estimate the future evolution of the system. The proposed method relies on the
3829 use of dissimilarity functions to estimate the conditional probability density
3830 function of the outputs. A family of empirical probability density functions,
3831 parameterized by means of two parameters, is introduced. It is shown that the
3832 the proposed family encompasses the multivariable normal probability density
3833 function as a particular case. We show that the proposed method constitutes a
3834 generalization of classical estimation methods. A cross-validation scheme is
3835 used to tune the two parameters on which the methodology relies. In order to
3836 prove the effectiveness of the methodology presented, some numerical examples
3837 and comparisons are provided.
3838 </p>
3839 </description>
3840 <guid isPermaLink="false">oai:arXiv.org:2010.15530</guid>
3841 </item>
3842 <item>
3843 <title>Coordinated Formation Control for Intelligent and Connected Vehicles in Multiple Traffic Scenarios. (arXiv:2010.15531v1 [eess.SY])</title>
3844 <link>http://fr.arxiv.org/abs/2010.15531</link>
3845 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_Q/0/1/0/all/0/1">Qing Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Cai_M/0/1/0/all/0/1">Mengchi Cai</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_K/0/1/0/all/0/1">Keqiang Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_B/0/1/0/all/0/1">Biao Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_J/0/1/0/all/0/1">Jianqiang Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wu_X/0/1/0/all/0/1">Xiangbin Wu</a></p>
3846
3847 <p>In this paper, a unified multi-vehicle formation control framework for
3848 Intelligent and Connected Vehicles (ICVs) that can apply to multiple traffic
3849 scenarios is proposed. In the one-dimensional scenario, different formation
3850 geometries are analyzed and the interlaced structure is mathematically
3851 modelized to improve driving safety while making full use of the lane capacity.
3852 The assignment problem for vehicles and target positions is solved using
3853 Hungarian Algorithm to improve the flexibility of the method in multiple
3854 scenarios. In the two-dimensional scenario, an improved virtual platoon method
3855 is proposed to transfer the complex two-dimensional passing problem to the
3856 one-dimensional formation control problem based on the idea of rotation
3857 projection. Besides, the vehicle regrouping method is proposed to connect the
3858 two scenarios. Simulation results prove that the proposed multi-vehicle
3859 formation control framework can apply to multiple typical scenarios and have
3860 better performance than existing methods.
3861 </p>
3862 </description>
3863 <guid isPermaLink="false">oai:arXiv.org:2010.15531</guid>
3864 </item>
3865 <item>
3866 <title>How do Offline Measures for Exploration in Reinforcement Learning behave?. (arXiv:2010.15533v1 [cs.LG])</title>
3867 <link>http://fr.arxiv.org/abs/2010.15533</link>
3868 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hollenstein_J/0/1/0/all/0/1">Jakob J. Hollenstein</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Auddy_S/0/1/0/all/0/1">Sayantan Auddy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Saveriano_M/0/1/0/all/0/1">Matteo Saveriano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Renaudo_E/0/1/0/all/0/1">Erwan Renaudo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Piater_J/0/1/0/all/0/1">Justus Piater</a></p>
3869
3870 <p>Sufficient exploration is paramount for the success of a reinforcement
3871 learning agent. Yet, exploration is rarely assessed in an algorithm-independent
3872 way. We compare the behavior of three data-based, offline exploration metrics
3873 described in the literature on intuitive simple distributions and highlight
3874 problems to be aware of when using them. We propose a fourth metric,uniform
3875 relative entropy, and implement it using either a k-nearest-neighbor or a
3876 nearest-neighbor-ratio estimator, highlighting that the implementation choices
3877 have a profound impact on these measures.
3878 </p>
3879 </description>
3880 <guid isPermaLink="false">oai:arXiv.org:2010.15533</guid>
3881 </item>
3882 <item>
3883 <title>Poster: Benchmarking Financial Data Feed Systems. (arXiv:2010.15534v1 [cs.PF])</title>
3884 <link>http://fr.arxiv.org/abs/2010.15534</link>
3885 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Coenen_M/0/1/0/all/0/1">Manuel Coenen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wagner_C/0/1/0/all/0/1">Christoph Wagner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Echler_A/0/1/0/all/0/1">Alexander Echler</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frischbier_S/0/1/0/all/0/1">Sebastian Frischbier</a></p>
3886
3887 <p>Data-driven solutions for the investment industry require event-based backend
3888 systems to process high-volume financial data feeds with low latency, high
3889 throughput, and guaranteed delivery modes.
3890 </p>
3891 <p>At vwd we process an average of 18 billion incoming event notifications from
3892 500+ data sources for 30 million symbols per day and peak rates of 1+ million
3893 notifications per second using custom-built platforms that keep audit logs of
3894 every event.
3895 </p>
3896 <p>We currently assess modern open source event-processing platforms such as
3897 Kafka, NATS, Redis, Flink or Storm for the use in our ticker plant to reduce
3898 the maintenance effort for cross-cutting concerns and leverage hybrid
3899 deployment models. For comparability and repeatability we benchmark candidates
3900 with a standardized workload we derived from our real data feeds.
3901 </p>
3902 <p>We have enhanced an existing light-weight open source benchmarking tool in
3903 its processing, logging, and reporting capabilities to cope with our workloads.
3904 The resulting tool wrench can simulate workloads or replay snapshots in volume
3905 and dynamics like those we process in our ticker plant. We provide the tool as
3906 open source.
3907 </p>
3908 <p>As part of ongoing work we contribute details on (a) our workload and
3909 requirements for benchmarking candidate platforms for financial feed
3910 processing; (b) the current state of the tool wrench.
3911 </p>
3912 </description>
3913 <guid isPermaLink="false">oai:arXiv.org:2010.15534</guid>
3914 </item>
3915 <item>
3916 <title>Unbabel's Participation in the WMT20 Metrics Shared Task. (arXiv:2010.15535v1 [cs.CL])</title>
3917 <link>http://fr.arxiv.org/abs/2010.15535</link>
3918 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rei_R/0/1/0/all/0/1">Ricardo Rei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stewart_C/0/1/0/all/0/1">Craig Stewart</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farinha_C/0/1/0/all/0/1">Catarina Farinha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lavie_A/0/1/0/all/0/1">Alon Lavie</a></p>
3919
3920 <p>We present the contribution of the Unbabel team to the WMT 2020 Shared Task
3921 on Metrics. We intend to participate on the segment-level, document-level and
3922 system-level tracks on all language pairs, as well as the 'QE as a Metric'
3923 track. Accordingly, we illustrate results of our models in these tracks with
3924 reference to test sets from the previous year. Our submissions build upon the
3925 recently proposed COMET framework: We train several estimator models to regress
3926 on different human-generated quality scores and a novel ranking model trained
3927 on relative ranks obtained from Direct Assessments. We also propose a simple
3928 technique for converting segment-level predictions into a document-level score.
3929 Overall, our systems achieve strong results for all language pairs on previous
3930 test sets and in many cases set a new state-of-the-art.
3931 </p>
3932 </description>
3933 <guid isPermaLink="false">oai:arXiv.org:2010.15535</guid>
3934 </item>
3935 <item>
3936 <title>Matern Gaussian Processes on Graphs. (arXiv:2010.15538v1 [stat.ML])</title>
3937 <link>http://fr.arxiv.org/abs/2010.15538</link>
3938 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Borovitskiy_V/0/1/0/all/0/1">Viacheslav Borovitskiy</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Azangulov_I/0/1/0/all/0/1">Iskander Azangulov</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Terenin_A/0/1/0/all/0/1">Alexander Terenin</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Mostowsky_P/0/1/0/all/0/1">Peter Mostowsky</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Deisenroth_M/0/1/0/all/0/1">Marc Peter Deisenroth</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Durrande_N/0/1/0/all/0/1">Nicolas Durrande</a></p>
3939
3940 <p>Gaussian processes are a versatile framework for learning unknown functions
3941 in a manner that permits one to utilize prior information about their
3942 properties. Although many different Gaussian process models are readily
3943 available when the input space is Euclidean, the choice is much more limited
3944 for Gaussian processes whose input space is an undirected graph. In this work,
3945 we leverage the stochastic partial differential equation characterization of
3946 Mat\'{e}rn Gaussian processes - a widely-used model class in the Euclidean
3947 setting - to study their analog for undirected graphs. We show that the
3948 resulting Gaussian processes inherit various attractive properties of their
3949 Euclidean and Riemannian analogs and provide techniques that allow them to be
3950 trained using standard methods, such as inducing points. This enables graph
3951 Mat\'{e}rn Gaussian processes to be employed in mini-batch and non-conjugate
3952 settings, thereby making them more accessible to practitioners and easier to
3953 deploy within larger learning frameworks.
3954 </p>
3955 </description>
3956 <guid isPermaLink="false">oai:arXiv.org:2010.15538</guid>
3957 </item>
3958 <item>
3959 <title>Micromagnetics of thin films in the presence of Dzyaloshinskii-Moriya interaction. (arXiv:2010.15541v1 [math.AP])</title>
3960 <link>http://fr.arxiv.org/abs/2010.15541</link>
3961 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Davoli_E/0/1/0/all/0/1">Elisa Davoli</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Fratta_G/0/1/0/all/0/1">Giovanni Di Fratta</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Praetorius_D/0/1/0/all/0/1">Dirk Praetorius</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ruggeri_M/0/1/0/all/0/1">Michele Ruggeri</a></p>
3962
3963 <p>In this paper, we study the thin-film limit of the micromagnetic energy
3964 functional in the presence of bulk Dzyaloshinskii-Moriya interaction (DMI). Our
3965 analysis includes both a stationary $\Gamma$-convergence result for the
3966 micromagnetic energy, as well as the identification of the asymptotic behavior
3967 of the associated Landau-Lifshitz-Gilbert equation. In particular, we prove
3968 that, in the limiting model, part of the DMI term behaves like the projection
3969 of the magnetic moment onto the normal to the film, contributing this way to an
3970 increase in the shape anisotropy arising from the magnetostatic self-energy.
3971 Finally, we discuss a convergent finite element approach for the approximation
3972 of the time-dependent case and use it to numerically compare the original
3973 three-dimensional model with the two-dimensional thin-film limit.
3974 </p>
3975 </description>
3976 <guid isPermaLink="false">oai:arXiv.org:2010.15541</guid>
3977 </item>
3978 <item>
3979 <title>Systematic literature review protocol Identification and classification of feature modeling errors. (arXiv:2010.15545v1 [cs.SE])</title>
3980 <link>http://fr.arxiv.org/abs/2010.15545</link>
3981 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sepulveda_S/0/1/0/all/0/1">Samuel Sep&#xfa;lveda</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Diaz_J/0/1/0/all/0/1">Jaime D&#xed;az</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Esperguel_M/0/1/0/all/0/1">Marcelo Esperguel</a></p>
3982
3983 <p>Context: The importance of feature modeling languages for software product
3984 lines and the planning stage for a systematic literature review. Objective: A
3985 protocol for carrying out a systematic literature review about the evidence for
3986 identifying and classifying the errors in feature modeling languages. Method:
3987 The definition of a protocol to conduct a systematic literature review
3988 according to the guidelines of B. Kitchenham. Results: A validated protocol to
3989 conduct a systematic literature review. Conclusions: A proposal for the
3990 protocol definition of a systematic literature review about the identification
3991 and classification of errors in feature modeling was built. Initial results
3992 show that the effects and results for solving these errors should be carried
3993 out.
3994 </p>
3995 </description>
3996 <guid isPermaLink="false">oai:arXiv.org:2010.15545</guid>
3997 </item>
3998 <item>
3999 <title>Multi-Constitutive Neural Network for Large Deformation Poromechanics Problem. (arXiv:2010.15549v1 [cs.LG])</title>
4000 <link>http://fr.arxiv.org/abs/2010.15549</link>
4001 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Q/0/1/0/all/0/1">Qi Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yilin Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_Z/0/1/0/all/0/1">Ziyi Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Darve_E/0/1/0/all/0/1">Eric Darve</a></p>
4002
4003 <p>In this paper, we study the problem of large-strain consolidation in
4004 poromechanics with deep neural networks. Given different material properties
4005 and different loading conditions, the goal is to predict pore pressure and
4006 settlement. We propose a novel method "multi-constitutive neural network"
4007 (MCNN) such that one model can solve several different constitutive laws. We
4008 introduce a one-hot encoding vector as an additional input vector, which is
4009 used to label the constitutive law we wish to solve. Then we build a DNN which
4010 takes as input (X, t) along with a constitutive model label and outputs the
4011 corresponding solution. It is the first time, to our knowledge, that we can
4012 evaluate multi-constitutive laws through only one training process while still
4013 obtaining good accuracies. We found that MCNN trained to solve multiple PDEs
4014 outperforms individual neural network solvers trained with PDE.
4015 </p>
4016 </description>
4017 <guid isPermaLink="false">oai:arXiv.org:2010.15549</guid>
4018 </item>
4019 <item>
4020 <title>ADABOOK & MULTIBOOK: Adaptive Boosting with Chance Correction. (arXiv:2010.15550v1 [cs.LG])</title>
4021 <link>http://fr.arxiv.org/abs/2010.15550</link>
4022 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Powers_D/0/1/0/all/0/1">David M. W. Powers</a></p>
4023
4024 <p>There has been considerable interest in boosting and bagging, including the
4025 combination of the adaptive techniques of AdaBoost with the random selection
4026 with replacement techniques of Bagging. At the same time there has been a
4027 revisiting of the way we evaluate, with chance-corrected measures like Kappa,
4028 Informedness, Correlation or ROC AUC being advocated. This leads to the
4029 question of whether learning algorithms can do better by optimizing an
4030 appropriate chance corrected measure. Indeed, it is possible for a weak learner
4031 to optimize Accuracy to the detriment of the more reaslistic chance-corrected
4032 measures, and when this happens the booster can give up too early. This
4033 phenomenon is known to occur with conventional Accuracy-based AdaBoost, and the
4034 MultiBoost algorithm has been developed to overcome such problems using restart
4035 techniques based on bagging. This paper thus complements the theoretical work
4036 showing the necessity of using chance-corrected measures for evaluation, with
4037 empirical work showing how use of a chance-corrected measure can improve
4038 boosting. We show that the early surrender problem occurs in MultiBoost too, in
4039 multiclass situations, so that chance-corrected AdaBook and Multibook can beat
4040 standard Multiboost or AdaBoost, and we further identify which chance-corrected
4041 measures to use when.
4042 </p>
4043 </description>
4044 <guid isPermaLink="false">oai:arXiv.org:2010.15550</guid>
4045 </item>
4046 <item>
4047 <title>Investigating the Robustness of Artificial Intelligent Algorithms with Mixture Experiments. (arXiv:2010.15551v1 [stat.ML])</title>
4048 <link>http://fr.arxiv.org/abs/2010.15551</link>
4049 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Lian_J/0/1/0/all/0/1">Jiayi Lian</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Freeman_L/0/1/0/all/0/1">Laura Freeman</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Hong_Y/0/1/0/all/0/1">Yili Hong</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Deng_X/0/1/0/all/0/1">Xinwei Deng</a></p>
4050
4051 <p>Artificial intelligent (AI) algorithms, such as deep learning and XGboost,
4052 are used in numerous applications including computer vision, autonomous
4053 driving, and medical diagnostics. The robustness of these AI algorithms is of
4054 great interest as inaccurate prediction could result in safety concerns and
4055 limit the adoption of AI systems. In this paper, we propose a framework based
4056 on design of experiments to systematically investigate the robustness of AI
4057 classification algorithms. A robust classification algorithm is expected to
4058 have high accuracy and low variability under different application scenarios.
4059 The robustness can be affected by a wide range of factors such as the imbalance
4060 of class labels in the training dataset, the chosen prediction algorithm, the
4061 chosen dataset of the application, and a change of distribution in the training
4062 and test datasets. To investigate the robustness of AI classification
4063 algorithms, we conduct a comprehensive set of mixture experiments to collect
4064 prediction performance results. Then statistical analyses are conducted to
4065 understand how various factors affect the robustness of AI classification
4066 algorithms. We summarize our findings and provide suggestions to practitioners
4067 in AI applications.
4068 </p>
4069 </description>
4070 <guid isPermaLink="false">oai:arXiv.org:2010.15551</guid>
4071 </item>
4072 <item>
4073 <title>Successive Halving Top-k Operator. (arXiv:2010.15552v1 [cs.LG])</title>
4074 <link>http://fr.arxiv.org/abs/2010.15552</link>
4075 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Pietruszka_M/0/1/0/all/0/1">Micha&#x142; Pietruszka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Borchmann_L/0/1/0/all/0/1">&#x141;ukasz Borchmann</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gralinski_F/0/1/0/all/0/1">Filip Grali&#x144;ski</a></p>
4076
4077 <p>We propose a differentiable successive halving method of relaxing the top-k
4078 operator, rendering gradient-based optimization possible. The need to perform
4079 softmax iteratively on the entire vector of scores is avoided by using a
4080 tournament-style selection. As a result, a much better approximation of top-k
4081 with lower computational cost is achieved compared to the previous approach.
4082 </p>
4083 </description>
4084 <guid isPermaLink="false">oai:arXiv.org:2010.15552</guid>
4085 </item>
4086 <item>
4087 <title>Modulation Pattern Detection Using Complex Convolutions in Deep Learning. (arXiv:2010.15556v1 [cs.LG])</title>
4088 <link>http://fr.arxiv.org/abs/2010.15556</link>
4089 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Krzyston_J/0/1/0/all/0/1">Jakob Krzyston</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bhattacharjea_R/0/1/0/all/0/1">Rajib Bhattacharjea</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stark_A/0/1/0/all/0/1">Andrew Stark</a></p>
4090
4091 <p>Transceivers used for telecommunications transmit and receive specific
4092 modulation patterns that are represented as sequences of complex numbers.
4093 Classifying modulation patterns is challenging because noise and channel
4094 impairments affect the signals in complicated ways such that the received
4095 signal bears little resemblance to the transmitted signal. Although deep
4096 learning approaches have shown great promise over statistical methods in this
4097 problem space, deep learning frameworks continue to lag in support for
4098 complex-valued data. To address this gap, we study the implementation and use
4099 of complex convolutions in a series of convolutional neural network
4100 architectures. Replacement of data structure and convolution operations by
4101 their complex generalization in an architecture improves performance, with
4102 statistical significance, at recognizing modulation patterns in complex-valued
4103 signals with high SNR after being trained on low SNR signals. This suggests
4104 complex-valued convolutions enables networks to learn more meaningful
4105 representations. We investigate this hypothesis by comparing the features
4106 learned in each experiment by visualizing the inputs that results in one-hot
4107 modulation pattern classification for each network.
4108 </p>
4109 </description>
4110 <guid isPermaLink="false">oai:arXiv.org:2010.15556</guid>
4111 </item>
4112 <item>
4113 <title>Quantum Computing: A Taxonomy, Systematic Review and Future Directions. (arXiv:2010.15559v1 [cs.ET])</title>
4114 <link>http://fr.arxiv.org/abs/2010.15559</link>
4115 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gill_S/0/1/0/all/0/1">Sukhpal Singh Gill</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kumar_A/0/1/0/all/0/1">Adarsh Kumar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_H/0/1/0/all/0/1">Harvinder Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_M/0/1/0/all/0/1">Manmeet Singh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kaur_K/0/1/0/all/0/1">Kamalpreet Kaur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Usman_M/0/1/0/all/0/1">Muhammad Usman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Buyya_R/0/1/0/all/0/1">Rajkumar Buyya</a></p>
4116
4117 <p>Quantum computing is an emerging paradigm with the potential to offer
4118 significant computational advantage over conventional classical computing by
4119 exploiting quantum-mechanical principles such as entanglement and
4120 superposition. It is anticipated that this computational advantage of quantum
4121 computing will help to solve many complex and computationally intractable
4122 problems in several areas of research such as drug design, data science, clean
4123 energy, finance, industrial chemical development, secure communications, and
4124 quantum chemistry, among others. In recent years, tremendous progress in both
4125 quantum hardware development and quantum software/algorithm have brought
4126 quantum computing much closer to reality. As the quantum devices are expected
4127 to steadily scale up in the next few years, quantum decoherence and qubit
4128 interconnectivity are two of the major challenges to achieve quantum advantage
4129 in the NISQ era. Quantum computing is a highly topical and fast-moving field of
4130 research with significant ongoing progress in all facets. A systematic review
4131 of the existing literature on quantum computing will be invaluable to
4132 understand the current status of this emerging field and identify open
4133 challenges for the quantum computing community in the coming years. This review
4134 article presents a comprehensive review of quantum computing literature, and
4135 taxonomy of quantum computing. Further, the proposed taxonomy is used to map
4136 various related studies to identify the research gaps. A detailed overview of
4137 quantum software tools and technologies, post-quantum cryptography and quantum
4138 computer hardware development to document the current state-of-the-art in the
4139 respective areas. We finish the article by highlighting various open challenges
4140 and promising future directions for research.
4141 </p>
4142 </description>
4143 <guid isPermaLink="false">oai:arXiv.org:2010.15559</guid>
4144 </item>
4145 <item>
4146 <title>Genetic U-Net: Automatically Designing Lightweight U-shaped CNN Architectures Using the Genetic Algorithm for Retinal Vessel Segmentation. (arXiv:2010.15560v1 [eess.IV])</title>
4147 <link>http://fr.arxiv.org/abs/2010.15560</link>
4148 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Wei_J/0/1/0/all/0/1">Jiahong Wei</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Fan_Z/0/1/0/all/0/1">Zhun Fan</a></p>
4149
4150 <p>Many previous works based on deep learning for retinal vessel segmentation
4151 have achieved promising performance by manually designing U-shaped
4152 convolutional neural networks (CNNs). However, the manual design of these CNNs
4153 is time-consuming and requires extensive empirical knowledge. To address this
4154 problem, we propose a novel method using genetic algorithms (GAs) to
4155 automatically design a lightweight U-shaped CNN for retinal vessel
4156 segmentation, called Genetic U-Net. Here we first design a special search space
4157 containing the structure of U-Net and its corresponding operations, and then
4158 use genetic algorithm to search for superior architectures in this search
4159 space. Experimental results show that the proposed method outperforms the
4160 existing methods on three public datasets, DRIVE, CHASE_DB1 and STARE. In
4161 addition, the architectures obtained by the proposed method are more
4162 lightweight but robust than the state-of-the-art models.
4163 </p>
4164 </description>
4165 <guid isPermaLink="false">oai:arXiv.org:2010.15560</guid>
4166 </item>
4167 <item>
4168 <title>Federated Transfer Learning: concept and applications. (arXiv:2010.15561v1 [cs.LG])</title>
4169 <link>http://fr.arxiv.org/abs/2010.15561</link>
4170 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Saha_S/0/1/0/all/0/1">Sudipan Saha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ahmad_T/0/1/0/all/0/1">Tahir Ahmad</a></p>
4171
4172 <p>Development of Artificial Intelligence (AI) is inherently tied to the
4173 development of data. However, in most industries data exists in form of
4174 isolated islands, with limited scope of sharing between different
4175 organizations. This is an hindrance to the further development of AI. Federated
4176 learning has emerged as a possible solution to this problem in the last few
4177 years without compromising user privacy. Among different variants of the
4178 federated learning, noteworthy is federated transfer learning (FTL) that allows
4179 knowledge to be transferred across domains that do not have many overlapping
4180 features and users. In this work we provide a comprehensive survey of the
4181 existing works on this topic. In more details, we study the background of FTL
4182 and its different existing applications. We further analyze FTL from privacy
4183 and machine learning perspective.
4184 </p>
4185 </description>
4186 <guid isPermaLink="false">oai:arXiv.org:2010.15561</guid>
4187 </item>
4188 <item>
4189 <title>Limitations of the recall capabilities in delay based reservoir computing systems. (arXiv:2010.15562v1 [cs.ET])</title>
4190 <link>http://fr.arxiv.org/abs/2010.15562</link>
4191 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Koster_F/0/1/0/all/0/1">Felix K&#xf6;ster</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ehlert_D/0/1/0/all/0/1">Dominik Ehlert</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ludge_K/0/1/0/all/0/1">Kathy L&#xfc;dge</a></p>
4192
4193 <p>We analyze the memory capacity of a delay based reservoir computer with a
4194 Hopf normal form as nonlinearity and numerically compute the linear as well as
4195 the higher order recall capabilities. A possible physical realisation could be
4196 a laser with external cavity, for which the information is fed via electrical
4197 injection. A task independent quantification of the computational capability of
4198 the reservoir system is done via a complete orthonormal set of basis functions.
4199 Our results suggest that even for constant readout dimension the total memory
4200 capacity is dependent on the ratio between the information input period, also
4201 called the clock cycle, and the time delay in the system. Optimal performance
4202 is found for a time delay about 1.6 times the clock cycle
4203 </p>
4204 </description>
4205 <guid isPermaLink="false">oai:arXiv.org:2010.15562</guid>
4206 </item>
4207 <item>
4208 <title>Overcoming The Limitations of Neural Networks in Composite-Pattern Learning with Architopes. (arXiv:2010.15571v1 [cs.NE])</title>
4209 <link>http://fr.arxiv.org/abs/2010.15571</link>
4210 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kratsios_A/0/1/0/all/0/1">Anastasis Kratsios</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zamanlooy_B/0/1/0/all/0/1">Behnoosh Zamanlooy</a></p>
4211
4212 <p>The effectiveness of neural networks in solving complex problems is well
4213 recognized; however, little is known about their limitations. We demonstrate
4214 that the feed-forward architecture, for most commonly used activation
4215 functions, is incapable of approximating functions comprised of multiple
4216 sub-patterns while simultaneously respecting their composite-pattern structure.
4217 We overcome this bottleneck with a simple architecture modification that
4218 reallocates the neurons of any single feed-forward network across several
4219 smaller sub-networks, each specialized on a distinct part of the input-space.
4220 The modified architecture, called an Architope, is more expressive on two
4221 fronts. First, it is dense in an associated space of piecewise continuous
4222 functions in which the feed-forward architecture is not dense. Second, it
4223 achieves the same approximation rate as the feed-forward networks while only
4224 requiring $\mathscr{O}(N^{-1})$ fewer parameters in its hidden layers.
4225 Moreover, the architecture achieves these approximation improvements while
4226 preserving the target's composite-pattern structure.
4227 </p>
4228 </description>
4229 <guid isPermaLink="false">oai:arXiv.org:2010.15571</guid>
4230 </item>
4231 <item>
4232 <title>Experimental Analysis of Communication Relaying Delay in Low-Energy Ad-hoc Networks. (arXiv:2010.15572v1 [cs.NI])</title>
4233 <link>http://fr.arxiv.org/abs/2010.15572</link>
4234 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Miya_T/0/1/0/all/0/1">Taichi Miya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ohshima_K/0/1/0/all/0/1">Kohta Ohshima</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kitaguchi_Y/0/1/0/all/0/1">Yoshiaki Kitaguchi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yamaoka_K/0/1/0/all/0/1">Katsunori Yamaoka</a></p>
4235
4236 <p>In recent years, more and more applications use ad-hoc networks for local M2M
4237 communications, but in some cases such as when using WSNs, the software
4238 processing delay induced by packets relaying may not be negligible. In this
4239 paper, we planned and carried out a delay measurement experiment using
4240 Raspberry Pi Zero W. The results demonstrated that, in low-energy ad-hoc
4241 networks, processing delay of the application is always too large to ignore; it
4242 is at least ten times greater than the kernel routing and corresponds to 30% of
4243 the transmission delay. Furthermore, if the task is CPU-intensive, such as
4244 packet encryption, the processing delay can be greater than the transmission
4245 delay and its behavior is represented by a simple linear model. Our findings
4246 indicate that the key factor for achieving QoS in ad-hoc networks is an
4247 appropriate node-to-node load balancing that takes into account the CPU
4248 performance and the amount of traffic passing through each node.
4249 </p>
4250 </description>
4251 <guid isPermaLink="false">oai:arXiv.org:2010.15572</guid>
4252 </item>
4253 <item>
4254 <title>Import test questions into Moodle LMS. (arXiv:2010.15577v1 [cs.CY])</title>
4255 <link>http://fr.arxiv.org/abs/2010.15577</link>
4256 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mintii_I/0/1/0/all/0/1">Iryna S. Mintii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shokaliuk_S/0/1/0/all/0/1">Svitlana V. Shokaliuk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vakaliuk_T/0/1/0/all/0/1">Tetiana A. Vakaliuk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mintii_M/0/1/0/all/0/1">Mykhailo M. Mintii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Soloviev_V/0/1/0/all/0/1">Vladimir N. Soloviev</a></p>
4257
4258 <p>The purpose of the study is to highlight the theoretical and methodological
4259 aspects of preparing the test questions of the most common types in the form of
4260 text files for further import into learning management system (LMS) Moodle. The
4261 subject of the research is the automated filling of the Moodle LMS test
4262 database. The objectives of the study: to analyze the import files of test
4263 questions, their advantages and disadvantages; to develop guidelines for the
4264 preparation of test questions of common types in the form of text files for
4265 further import into Moodle LMS. The action algorithms for importing questions
4266 and instructions for submitting question files in such formats as Aiken, GIFT,
4267 Moodle XML, "True/False" questions, "Multiple Choice" (one of many and many of
4268 many), "Matching", with an open answer - "Numerical" or "Short answer" and
4269 "Essay" are offered in this article. The formats for submitting questions,
4270 examples of its designing and developed questions were demonstrated in view
4271 mode in Moodle LMS.
4272 </p>
4273 </description>
4274 <guid isPermaLink="false">oai:arXiv.org:2010.15577</guid>
4275 </item>
4276 <item>
4277 <title>Exploring the Nuances of Designing (with/for) Artificial Intelligence. (arXiv:2010.15578v1 [cs.CY])</title>
4278 <link>http://fr.arxiv.org/abs/2010.15578</link>
4279 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Stoimenova_N/0/1/0/all/0/1">Niya Stoimenova</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Price_R/0/1/0/all/0/1">Rebecca Price</a></p>
4280
4281 <p>Solutions relying on artificial intelligence are devised to predict data
4282 patterns and answer questions that are clearly defined, involve an enumerable
4283 set of solutions, clear rules, and inherently binary decision mechanisms. Yet,
4284 as they become exponentially implemented in our daily activities, they begin to
4285 transcend these initial boundaries and to affect the larger sociotechnical
4286 system in which they are situated. In this arrangement, a solution is under
4287 pressure to surpass true or false criteria and move to an ethical evaluation of
4288 right and wrong. Neither algorithmic solutions, nor purely humanistic ones will
4289 be enough to fully mitigate undesirable outcomes in the narrow state of AI or
4290 its future incarnations. We must take a holistic view. In this paper we explore
4291 the construct of infrastructure as a means to simultaneously address
4292 algorithmic and societal issues when designing AI.
4293 </p>
4294 </description>
4295 <guid isPermaLink="false">oai:arXiv.org:2010.15578</guid>
4296 </item>
4297 <item>
4298 <title>Modeling biomedical breathing signals with convolutional deep probabilistic autoencoders. (arXiv:2010.15579v1 [cs.LG])</title>
4299 <link>http://fr.arxiv.org/abs/2010.15579</link>
4300 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Pastor_Serrano_O/0/1/0/all/0/1">Oscar Pastor-Serrano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lathouwers_D/0/1/0/all/0/1">Danny Lathouwers</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Perko_Z/0/1/0/all/0/1">Zolt&#xe1;n Perk&#xf3;</a></p>
4301
4302 <p>One of the main problems with biomedical signals is the limited amount of
4303 patient-specific data and the significant amount of time needed to record a
4304 sufficient number of samples for diagnostic and treatment purposes. We explore
4305 the use of Variational Autoencoder (VAE) and Adversarial Autoencoder (AAE)
4306 algorithms based on one-dimensional convolutional neural networks in order to
4307 build generative models able to capture and represent the variability of a set
4308 of unlabeled quasi-periodic signals using as few as 10 parameters. Furthermore,
4309 we introduce a modified AAE architecture that allows simultaneous
4310 semi-supervised classification and generation of different types of signals.
4311 Our study is based on physical breathing signals, i.e. time series describing
4312 the position of chest markers, generally used to describe respiratory motion.
4313 The time series are discretized into a vector of periods, with each period
4314 containing 6 time and position values. These vectors can be transformed back
4315 into time series through an additional reconstruction neural network and allow
4316 to generate extended signals while simplifying the modeling task. The obtained
4317 models can be used to generate realistic breathing realizations from patient or
4318 population data and to classify new recordings. We show that by incorporating
4319 the labels from around 10-15\% of the dataset during training, the model can be
4320 guided to group data according to the patient it belongs to, or based on the
4321 presence of different types of breathing irregularities such as baseline
4322 shifts. Our specific motivation is to model breathing motion during
4323 radiotherapy lung cancer treatments, for which the developed model serves as an
4324 efficient tool to robustify plans against breathing uncertainties. However, the
4325 same methodology can in principle be applied to any other kind of
4326 quasi-periodic biomedical signal, representing a generically applicable tool.
4327 </p>
4328 </description>
4329 <guid isPermaLink="false">oai:arXiv.org:2010.15579</guid>
4330 </item>
4331 <item>
4332 <title>The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research. (arXiv:2010.15581v1 [cs.CY])</title>
4333 <link>http://fr.arxiv.org/abs/2010.15581</link>
4334 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ahmed_N/0/1/0/all/0/1">Nur Ahmed</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wahed_M/0/1/0/all/0/1">Muntasir Wahed</a></p>
4335
4336 <p>Increasingly, modern Artificial Intelligence (AI) research has become more
4337 computationally intensive. However, a growing concern is that due to unequal
4338 access to computing power, only certain firms and elite universities have
4339 advantages in modern AI research. Using a novel dataset of 171394 papers from
4340 57 prestigious computer science conferences, we document that firms, in
4341 particular, large technology firms and elite universities have increased
4342 participation in major AI conferences since deep learning's unanticipated rise
4343 in 2012. The effect is concentrated among elite universities, which are ranked
4344 1-50 in the QS World University Rankings. Further, we find two strategies
4345 through which firms increased their presence in AI research: first, they have
4346 increased firm-only publications; and second, firms are collaborating primarily
4347 with elite universities. Consequently, this increased presence of firms and
4348 elite universities in AI research has crowded out mid-tier (QS ranked 201-300)
4349 and lower-tier (QS ranked 301-500) universities. To provide causal evidence
4350 that deep learning's unanticipated rise resulted in this divergence, we
4351 leverage the generalized synthetic control method, a data-driven counterfactual
4352 estimator. Using machine learning based text analysis methods, we provide
4353 additional evidence that the divergence between these two groups - large firms
4354 and non-elite universities - is driven by access to computing power or compute,
4355 which we term as the "compute divide". This compute divide between large firms
4356 and non-elite universities increases concerns around bias and fairness within
4357 AI technology, and presents an obstacle towards "democratizing" AI. These
4358 results suggest that a lack of access to specialized equipment such as compute
4359 can de-democratize knowledge production.
4360 </p>
4361 </description>
4362 <guid isPermaLink="false">oai:arXiv.org:2010.15581</guid>
4363 </item>
4364 <item>
4365 <title>Improving Accuracy of Federated Learning in Non-IID Settings. (arXiv:2010.15582v1 [cs.LG])</title>
4366 <link>http://fr.arxiv.org/abs/2010.15582</link>
4367 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ozdayi_M/0/1/0/all/0/1">Mustafa Safa Ozdayi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kantarcioglu_M/0/1/0/all/0/1">Murat Kantarcioglu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Iyer_R/0/1/0/all/0/1">Rishabh Iyer</a></p>
4368
4369 <p>Federated Learning (FL) is a decentralized machine learning protocol that
4370 allows a set of participating agents to collaboratively train a model without
4371 sharing their data. This makes FL particularly suitable for settings where data
4372 privacy is desired. However, it has been observed that the performance of FL is
4373 closely tied with the local data distributions of agents. Particularly, in
4374 settings where local data distributions vastly differ among agents, FL performs
4375 rather poorly with respect to the centralized training. To address this
4376 problem, we hypothesize the reasons behind the performance degradation, and
4377 develop some techniques to address these reasons accordingly. In this work, we
4378 identify four simple techniques that can improve the performance of trained
4379 models without incurring any additional communication overhead to FL, but
4380 rather, some light computation overhead either on the client, or the
4381 server-side. In our experimental analysis, combination of our techniques
4382 improved the validation accuracy of a model trained via FL by more than 12%
4383 with respect to our baseline. This is about 5% less than the accuracy of the
4384 model trained on centralized data.
4385 </p>
4386 </description>
4387 <guid isPermaLink="false">oai:arXiv.org:2010.15582</guid>
4388 </item>
4389 <item>
4390 <title>Probabilistic Transformers. (arXiv:2010.15583v1 [cs.LG])</title>
4391 <link>http://fr.arxiv.org/abs/2010.15583</link>
4392 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Movellan_J/0/1/0/all/0/1">Javier R. Movellan</a></p>
4393
4394 <p>We show that Transformers are Maximum Posterior Probability estimators for
4395 Mixtures of Gaussian Models. This brings a probabilistic point of view to
4396 Transformers and suggests extensions to other probabilistic cases.
4397 </p>
4398 </description>
4399 <guid isPermaLink="false">oai:arXiv.org:2010.15583</guid>
4400 </item>
4401 <item>
4402 <title>Future Directions of the Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Program. (arXiv:2010.15584v1 [cs.CY])</title>
4403 <link>http://fr.arxiv.org/abs/2010.15584</link>
4404 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Arora_R/0/1/0/all/0/1">Ritu Arora</a> (1), <a href="http://fr.arxiv.org/find/cs/1/au:+Li_X/0/1/0/all/0/1">Xiaosong Li</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Hurwitz_B/0/1/0/all/0/1">Bonnie Hurwitz</a> (3), <a href="http://fr.arxiv.org/find/cs/1/au:+Fay_D/0/1/0/all/0/1">Daniel Fay</a> (4), <a href="http://fr.arxiv.org/find/cs/1/au:+Panda_D/0/1/0/all/0/1">Dhabaleswar K. Panda</a> (5), <a href="http://fr.arxiv.org/find/cs/1/au:+Valeev_E/0/1/0/all/0/1">Edward Valeev</a> (6), <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shaowen Wang</a> (7), <a href="http://fr.arxiv.org/find/cs/1/au:+Moore_S/0/1/0/all/0/1">Shirley Moore</a> (8), <a href="http://fr.arxiv.org/find/cs/1/au:+Chandrasekaran_S/0/1/0/all/0/1">Sunita Chandrasekaran</a> (9), <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_T/0/1/0/all/0/1">Ting Cao</a> (2), <a href="http://fr.arxiv.org/find/cs/1/au:+Bik_H/0/1/0/all/0/1">Holly Bik</a> (10), <a href="http://fr.arxiv.org/find/cs/1/au:+Curry_M/0/1/0/all/0/1">Matthew Curry</a> (11), <a href="http://fr.arxiv.org/find/cs/1/au:+Islam_T/0/1/0/all/0/1">Tanzima Islam</a> (12) ((1) Texas Advanced Computing Center, (2) University of Washington, (3) University of Arizona, (4) Microsoft, (5) The Ohio State University, (6) Virginia Tech University, (7) University of Illinois, (8) Oak Ridge National Lab, (9) University of Delaware, (10) University of California, Riverside, (11) Sandia National Lab, (12) Texas State University)</p>
4405
4406 <p>The CSSI 2019 workshop was held on October 28-29, 2019, in Austin, Texas. The
4407 main objectives of this workshop were to (1) understand the impact of the CSSI
4408 program on the community over the last 9 years, (2) engage workshop
4409 participants in identifying gaps and opportunities in the current CSSI
4410 landscape, (3) gather ideas on the cyberinfrastructure needs and expectations
4411 of the community with respect to the CSSI program, and (4) prepare a report
4412 summarizing the feedback gathered from the community that can inform the future
4413 solicitations of the CSSI program. The workshop brought together different
4414 stakeholders interested in provisioning sustainable cyberinfrastructure that
4415 can power discoveries impacting the various fields of science and technology
4416 and maintaining the nation's competitiveness in the areas such as scientific
4417 software, HPC, networking, cybersecurity, and data/information science. The
4418 workshop served as a venue for gathering the community-feedback on the current
4419 state of the CSSI program and its future directions.
4420 </p>
4421 </description>
4422 <guid isPermaLink="false">oai:arXiv.org:2010.15584</guid>
4423 </item>
4424 <item>
4425 <title>Panel: Economic Policy and Governance during Pandemics using AI. (arXiv:2010.15585v1 [cs.CY])</title>
4426 <link>http://fr.arxiv.org/abs/2010.15585</link>
4427 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Batarseh_F/0/1/0/all/0/1">Feras A. Batarseh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gopinath_M/0/1/0/all/0/1">Munisamy Gopinath</a></p>
4428
4429 <p>The global food supply chain (starting at farms and ending with consumers)
4430 has been seriously disrupted by many outlier events such as trade wars, the
4431 China demand shock, natural disasters, and pandemics. Outlier events create
4432 uncertainty along the entire supply chain in addition to intervening policy
4433 responses to mitigate their adverse effects. Artificial Intelligence (AI)
4434 methods (i.e. machine/reinforcement/deep learning) provide an opportunity to
4435 better understand outcomes during outlier events by identifying regular,
4436 irregular and contextual components. Employing AI can provide guidance to
4437 decision making suppliers, farmers, processors, wholesalers, and retailers
4438 along the supply chain, and policy makers to facilitate welfare-improving
4439 outcomes. This panel discusses these issues.
4440 </p>
4441 </description>
4442 <guid isPermaLink="false">oai:arXiv.org:2010.15585</guid>
4443 </item>
4444 <item>
4445 <title>Event-Driven Learning of Systematic Behaviours in Stock Markets. (arXiv:2010.15586v1 [q-fin.ST])</title>
4446 <link>http://fr.arxiv.org/abs/2010.15586</link>
4447 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-fin/1/au:+Wu_X/0/1/0/all/0/1">Xianchao Wu</a></p>
4448
4449 <p>It is reported that financial news, especially financial events expressed in
4450 news, provide information to investors' long/short decisions and influence the
4451 movements of stock markets. Motivated by this, we leverage financial event
4452 streams to train a classification neural network that detects latent
4453 event-stock linkages and stock markets' systematic behaviours in the U.S. stock
4454 market. Our proposed pipeline includes (1) a combined event extraction method
4455 that utilizes Open Information Extraction and neural co-reference resolution,
4456 (2) a BERT/ALBERT enhanced representation of events, and (3) an extended
4457 hierarchical attention network that includes attentions on event, news and
4458 temporal levels. Our pipeline achieves significantly better accuracies and
4459 higher simulated annualized returns than state-of-the-art models when being
4460 applied to predicting Standard\&amp;Poor 500, Dow Jones, Nasdaq indices and 10
4461 individual stocks.
4462 </p>
4463 </description>
4464 <guid isPermaLink="false">oai:arXiv.org:2010.15586</guid>
4465 </item>
4466 <item>
4467 <title>Impact of (SARS-CoV-2) COVID 19 on the indigenous language-speaking population in Mexico. (arXiv:2010.15588v1 [cs.CY])</title>
4468 <link>http://fr.arxiv.org/abs/2010.15588</link>
4469 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Medel_Ramirez_C/0/1/0/all/0/1">Carlos Medel-Ramirez</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Medel_Lopez_H/0/1/0/all/0/1">Hilario Medel-Lopez</a></p>
4470
4471 <p>The importance of the working document is that it allows the analysis of the
4472 information and the status of cases associated with (SARS-CoV-2) COVID-19 as
4473 open data at the municipal, state and national level, with a daily record of
4474 patients, according to a age, sex, comorbidities, for the condition of
4475 (SARS-CoV-2) COVID-19 according to the following characteristics: a) Positive,
4476 b) Negative, c) Suspicious. Likewise, it presents information related to the
4477 identification of an outpatient and / or hospitalized patient, attending to
4478 their medical development, identifying: a) Recovered, b) Deaths and c) Active,
4479 in Phase 3 and Phase 4, in the five main population areas speaker of indigenous
4480 language in the State of Veracruz - Mexico. The data analysis is carried out
4481 through the application of a data mining algorithm, which provides the
4482 information, fast and timely, required for the estimation of Medical Care
4483 Scenarios of (SARS-CoV-2) COVID-19, as well as for know the impact on the
4484 indigenous language-speaking population in Mexico.
4485 </p>
4486 </description>
4487 <guid isPermaLink="false">oai:arXiv.org:2010.15588</guid>
4488 </item>
4489 <item>
4490 <title>Enjeux \'ethiques de l'IA en sant\'e : une humanisation du parcours de soin par l'intelligence artificielle ?. (arXiv:2010.15590v1 [cs.CY])</title>
4491 <link>http://fr.arxiv.org/abs/2010.15590</link>
4492 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Muhlenbach_F/0/1/0/all/0/1">Fabrice Muhlenbach</a></p>
4493
4494 <p>Considering the use of artificial intelligence for greater personalization of
4495 patient care and better management of human and material resources may seem
4496 like an opportunity not to be missed. In order to offer a better humanization
4497 of the care pathway, artificial intelligence is a tool that decision-makers in
4498 the hospital sector must appropriate by taking care of the new ethical issues
4499 and conflicts of values that this technology generates.
4500 </p>
4501 <p>Envisager le recours \`a l'intelligence artificielle pour une plus grande
4502 personnalisation de la prise en charge du patient et une meilleure gestion des
4503 ressources humaines et mat\'erielles peut sembler une opportunit\'e \`a ne pas
4504 manquer. Afin de proposer une meilleure humanisation du parcours de soin,
4505 l'intelligence artificielle est un outil que les d\'ecideurs du milieu
4506 hospitalier doivent s'approprier en veillant aux nouveaux enjeux \'ethiques et
4507 conflits de valeurs que cette technologie engendre.
4508 </p>
4509 </description>
4510 <guid isPermaLink="false">oai:arXiv.org:2010.15590</guid>
4511 </item>
4512 <item>
4513 <title>Shared Space Transfer Learning for analyzing multi-site fMRI data. (arXiv:2010.15594v1 [cs.LG])</title>
4514 <link>http://fr.arxiv.org/abs/2010.15594</link>
4515 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yousefnezhad_M/0/1/0/all/0/1">Muhammad Yousefnezhad</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Selvitella_A/0/1/0/all/0/1">Alessandro Selvitella</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_D/0/1/0/all/0/1">Daoqiang Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Greenshaw_A/0/1/0/all/0/1">Andrew J. Greenshaw</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Greiner_R/0/1/0/all/0/1">Russell Greiner</a></p>
4516
4517 <p>Multi-voxel pattern analysis (MVPA) learns predictive models from task-based
4518 functional magnetic resonance imaging (fMRI) data, for distinguishing when
4519 subjects are performing different cognitive tasks -- e.g., watching movies or
4520 making decisions. MVPA works best with a well-designed feature set and an
4521 adequate sample size. However, most fMRI datasets are noisy, high-dimensional,
4522 expensive to collect, and with small sample sizes. Further, training a robust,
4523 generalized predictive model that can analyze homogeneous cognitive tasks
4524 provided by multi-site fMRI datasets has additional challenges. This paper
4525 proposes the Shared Space Transfer Learning (SSTL) as a novel transfer learning
4526 (TL) approach that can functionally align homogeneous multi-site fMRI datasets,
4527 and so improve the prediction performance in every site. SSTL first extracts a
4528 set of common features for all subjects in each site. It then uses TL to map
4529 these site-specific features to a site-independent shared space in order to
4530 improve the performance of the MVPA. SSTL uses a scalable optimization
4531 procedure that works effectively for high-dimensional fMRI datasets. The
4532 optimization procedure extracts the common features for each site by using a
4533 single-iteration algorithm and maps these site-specific common features to the
4534 site-independent shared space. We evaluate the effectiveness of the proposed
4535 method for transferring between various cognitive tasks. Our comprehensive
4536 experiments validate that SSTL achieves superior performance to other
4537 state-of-the-art analysis techniques.
4538 </p>
4539 </description>
4540 <guid isPermaLink="false">oai:arXiv.org:2010.15594</guid>
4541 </item>
4542 <item>
4543 <title>Verification of Patterns. (arXiv:2010.15596v1 [cs.LO])</title>
4544 <link>http://fr.arxiv.org/abs/2010.15596</link>
4545 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yong Wang</a></p>
4546
4547 <p>The software patterns provide building blocks to the design and
4548 implementation of a software system, and try to make the software engineering
4549 to progress from experience to science. The software patterns were made famous
4550 because of the introduction as the design patterns. After that, patterns have
4551 been researched and developed widely and rapidly. The series of books of
4552 pattern-oriented software architecture should be marked in the development of
4553 software patterns. As mentioned in these books, formalization of patterns and
4554 an intermediate pattern language are needed and should be developed in the
4555 future of patterns. So, in this book, we formalize software patterns according
4556 to the categories of the series of books of pattern-oriented software
4557 architecture, and verify the correctness of patterns based on truly concurrent
4558 process algebra. In one aspect, patterns are formalized and verified; in the
4559 other aspect, truly concurrent process algebra can play a role of an
4560 intermediate pattern language for its rigorous theory.
4561 </p>
4562 </description>
4563 <guid isPermaLink="false">oai:arXiv.org:2010.15596</guid>
4564 </item>
4565 <item>
4566 <title>Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control. (arXiv:2010.15597v1 [cs.LG])</title>
4567 <link>http://fr.arxiv.org/abs/2010.15597</link>
4568 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rahmani_H/0/1/0/all/0/1">Hamid Radmard Rahmani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Koenke_C/0/1/0/all/0/1">Carsten Koenke</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wiering_M/0/1/0/all/0/1">Marco A. Wiering</a></p>
4569
4570 <p>In many reinforcement learning (RL) problems, it takes some time until a
4571 taken action by the agent reaches its maximum effect on the environment and
4572 consequently the agent receives the reward corresponding to that action by a
4573 delay called action-effect delay. Such delays reduce the performance of the
4574 learning algorithm and increase the computational costs, as the reinforcement
4575 learning agent values the immediate rewards more than the future reward that is
4576 more related to the taken action. This paper addresses this issue by
4577 introducing an applicable enhanced Q-learning method in which at the beginning
4578 of the learning phase, the agent takes a single action and builds a function
4579 that reflects the environments response to that action, called the reflexive
4580 $\gamma$ - function. During the training phase, the agent utilizes the created
4581 reflexive $\gamma$- function to update the Q-values. We have applied the
4582 developed method to a structural control problem in which the goal of the agent
4583 is to reduce the vibrations of a building subjected to earthquake excitations
4584 with a specified delay. Seismic control problems are considered as a complex
4585 task in structural engineering because of the stochastic and unpredictable
4586 nature of earthquakes and the complex behavior of the structure. Three
4587 scenarios are presented to study the effects of zero, medium, and long
4588 action-effect delays and the performance of the Enhanced method is compared to
4589 the standard Q-learning method. Both RL methods use neural network to learn to
4590 estimate the state-action value function that is used to control the structure.
4591 The results show that the enhanced method significantly outperforms the
4592 performance of the original method in all cases, and also improves the
4593 stability of the algorithm in dealing with action-effect delays.
4594 </p>
4595 </description>
4596 <guid isPermaLink="false">oai:arXiv.org:2010.15597</guid>
4597 </item>
4598 <item>
4599 <title>May I Ask Who's Calling? Named Entity Recognition on Call Center Transcripts for Privacy Law Compliance. (arXiv:2010.15598v1 [cs.CL])</title>
4600 <link>http://fr.arxiv.org/abs/2010.15598</link>
4601 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kaplan_M/0/1/0/all/0/1">Micaela Kaplan</a></p>
4602
4603 <p>We investigate using Named Entity Recognition on a new type of user-generated
4604 text: a call center conversation. These conversations combine problems from
4605 spontaneous speech with problems novel to conversational Automated Speech
4606 Recognition, including incorrect recognition, alongside other common problems
4607 from noisy user-generated text. Using our own corpus with new annotations,
4608 training custom contextual string embeddings, and applying a BiLSTM-CRF, we
4609 match state-of-the-art results on our novel task.
4610 </p>
4611 </description>
4612 <guid isPermaLink="false">oai:arXiv.org:2010.15598</guid>
4613 </item>
4614 <item>
4615 <title>Expert Selection in High-Dimensional Markov Decision Processes. (arXiv:2010.15599v1 [cs.LG])</title>
4616 <link>http://fr.arxiv.org/abs/2010.15599</link>
4617 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rubies_Royo_V/0/1/0/all/0/1">Vicenc Rubies-Royo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mazumdar_E/0/1/0/all/0/1">Eric Mazumdar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_R/0/1/0/all/0/1">Roy Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tomlin_C/0/1/0/all/0/1">Claire Tomlin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sastry_S/0/1/0/all/0/1">S. Shankar Sastry</a></p>
4618
4619 <p>In this work we present a multi-armed bandit framework for online expert
4620 selection in Markov decision processes and demonstrate its use in
4621 high-dimensional settings. Our method takes a set of candidate expert policies
4622 and switches between them to rapidly identify the best performing expert using
4623 a variant of the classical upper confidence bound algorithm, thus ensuring low
4624 regret in the overall performance of the system. This is useful in applications
4625 where several expert policies may be available, and one needs to be selected at
4626 run-time for the underlying environment.
4627 </p>
4628 </description>
4629 <guid isPermaLink="false">oai:arXiv.org:2010.15599</guid>
4630 </item>
4631 <item>
4632 <title>Three computational models and its equivalence. (arXiv:2010.15600v1 [cs.LO])</title>
4633 <link>http://fr.arxiv.org/abs/2010.15600</link>
4634 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lopez_C/0/1/0/all/0/1">Ciro Ivan Garcia Lopez</a></p>
4635
4636 <p>The study of computability has its origin in Hilbert's conference of 1900,
4637 where an adjacent question, to the ones he asked, is to give a precise
4638 description of the notion of algorithm. In the search for a good definition
4639 arose three independent theories: Turing and the Turing machines, G\"odel and
4640 the recursive functions, Church and the Lambda Calculus.
4641 </p>
4642 <p>Later there were established by Kleene that the classic models of computation
4643 are equivalent. This fact is widely accepted by many textbooks and the proof is
4644 omitted since the proof is tedious and unreadable. We intend to fill this gap
4645 presenting the proof in a modern way, without forgetting the mathematical
4646 details.
4647 </p>
4648 </description>
4649 <guid isPermaLink="false">oai:arXiv.org:2010.15600</guid>
4650 </item>
4651 <item>
4652 <title>Using a Binary Classification Model to Predict the Likelihood of Enrolment to the Undergraduate Program of a Philippine University. (arXiv:2010.15601v1 [cs.CY])</title>
4653 <link>http://fr.arxiv.org/abs/2010.15601</link>
4654 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Esquivel_D/0/1/0/all/0/1">Dr.Joseph A. Esquivel</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Esquivel_D/0/1/0/all/0/1">Dr. James A. Esquivel</a></p>
4655
4656 <p>With the recent implementation of the K to 12 Program, academic institutions,
4657 specifically, Colleges and Universities in the Philippines have been faced with
4658 difficulties in determining projected freshmen enrollees vis-a-vis
4659 decision-making factors for efficient resource management. Enrollment targets
4660 directly impacts success factors of Higher Education Institutions. This study
4661 covered an analysis of various characteristics of freshmen applicants affecting
4662 their admission status in a Philippine university. A predictive model was
4663 developed using Logistic Regression to evaluate the probability that an
4664 admitted student will pursue to enroll in the Institution or not. The dataset
4665 used was acquired from the University Admissions Office. The office designed an
4666 online application form to capture applicants' details. The online form was
4667 distributed to all student applicants, and most often, students, tend to
4668 provide incomplete information. Despite this fact, student characteristics, as
4669 well as geographic and demographic data based on the students' location are
4670 significant predictors of enrollment decision. The results of the study show
4671 that given limited information about prospective students, Higher Education
4672 Institutions can implement machine learning techniques to supplement management
4673 decisions and provide estimates of class sizes, in this way, it will allow the
4674 institution to optimize the allocation of resources and will have better
4675 control over net tuition revenue.
4676 </p>
4677 </description>
4678 <guid isPermaLink="false">oai:arXiv.org:2010.15601</guid>
4679 </item>
4680 <item>
4681 <title>Designing learning experiences for online teaching and learning. (arXiv:2010.15602v1 [cs.CY])</title>
4682 <link>http://fr.arxiv.org/abs/2010.15602</link>
4683 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sockalingam_N/0/1/0/all/0/1">Nachamma Sockalingam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_J/0/1/0/all/0/1">Junhua Liu</a></p>
4684
4685 <p>Teaching is about constantly innovating strategies, ways and means to engage
4686 diverse students in active and meaningful learning. In line with this, SUTD
4687 adopts various student-centric teaching and learning teaching methods and
4688 approaches. This means that our graduate/undergraduate instructors have to be
4689 ready to teach using these student student-centric teaching and learning
4690 pedagogies. In this article, I share my experiences of redesigning this
4691 teaching course that is typically conducted face-to-face to a synchronous
4692 online course and also invite one of the participant in this course to reflect
4693 on his experience as a student.
4694 </p>
4695 </description>
4696 <guid isPermaLink="false">oai:arXiv.org:2010.15602</guid>
4697 </item>
4698 <item>
4699 <title>Suppressing Mislabeled Data via Grouping and Self-Attention. (arXiv:2010.15603v1 [cs.CV])</title>
4700 <link>http://fr.arxiv.org/abs/2010.15603</link>
4701 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_X/0/1/0/all/0/1">Xiaojiang Peng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_K/0/1/0/all/0/1">Kai Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zeng_Z/0/1/0/all/0/1">Zhaoyang Zeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qing Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_J/0/1/0/all/0/1">Jianfei Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qiao_Y/0/1/0/all/0/1">Yu Qiao</a></p>
4702
4703 <p>Deep networks achieve excellent results on large-scale clean data but degrade
4704 significantly when learning from noisy labels. To suppressing the impact of
4705 mislabeled data, this paper proposes a conceptually simple yet efficient
4706 training block, termed as Attentive Feature Mixup (AFM), which allows paying
4707 more attention to clean samples and less to mislabeled ones via sample
4708 interactions in small groups. Specifically, this plug-and-play AFM first
4709 leverages a \textit{group-to-attend} module to construct groups and assign
4710 attention weights for group-wise samples, and then uses a \textit{mixup} module
4711 with the attention weights to interpolate massive noisy-suppressed samples. The
4712 AFM has several appealing benefits for noise-robust deep learning. (i) It does
4713 not rely on any assumptions and extra clean subset. (ii) With massive
4714 interpolations, the ratio of useless samples is reduced dramatically compared
4715 to the original noisy ratio. (iii) \pxj{It jointly optimizes the interpolation
4716 weights with classifiers, suppressing the influence of mislabeled data via low
4717 attention weights. (iv) It partially inherits the vicinal risk minimization of
4718 mixup to alleviate over-fitting while improves it by sampling fewer
4719 feature-target vectors around mislabeled data from the mixup vicinal
4720 distribution.} Extensive experiments demonstrate that AFM yields
4721 state-of-the-art results on two challenging real-world noisy datasets: Food101N
4722 and Clothing1M. The code will be available at
4723 https://github.com/kaiwang960112/AFM.
4724 </p>
4725 </description>
4726 <guid isPermaLink="false">oai:arXiv.org:2010.15603</guid>
4727 </item>
4728 <item>
4729 <title>Autoregressive Asymmetric Linear Gaussian Hidden Markov Models. (arXiv:2010.15604v1 [cs.LG])</title>
4730 <link>http://fr.arxiv.org/abs/2010.15604</link>
4731 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Puerto_Santana_C/0/1/0/all/0/1">Carlos Puerto-Santana</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Larranaga_P/0/1/0/all/0/1">Pedro Larra&#xf1;aga</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bielza_C/0/1/0/all/0/1">Concha Bielza</a></p>
4732
4733 <p>In a real life process evolving over time, the relationship between its
4734 relevant variables may change. Therefore, it is advantageous to have different
4735 inference models for each state of the process. Asymmetric hidden Markov models
4736 fulfil this dynamical requirement and provide a framework where the trend of
4737 the process can be expressed as a latent variable. In this paper, we modify
4738 these recent asymmetric hidden Markov models to have an asymmetric
4739 autoregressive component, allowing the model to choose the order of
4740 autoregression that maximizes its penalized likelihood for a given training
4741 set. Additionally, we show how inference, hidden states decoding and parameter
4742 learning must be adapted to fit the proposed model. Finally, we run experiments
4743 with synthetic and real data to show the capabilities of this new model.
4744 </p>
4745 </description>
4746 <guid isPermaLink="false">oai:arXiv.org:2010.15604</guid>
4747 </item>
4748 <item>
4749 <title>Manifold learning-based feature extraction for structural defect reconstruction. (arXiv:2010.15605v1 [cs.CE])</title>
4750 <link>http://fr.arxiv.org/abs/2010.15605</link>
4751 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qi Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_D/0/1/0/all/0/1">Dianzi Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_Z/0/1/0/all/0/1">Zhenghua Qian</a></p>
4752
4753 <p>Data-driven quantitative defect reconstructions using ultrasonic guided waves
4754 has recently demonstrated great potential in the area of non-destructive
4755 testing. In this paper, we develop an efficient deep learning-based defect
4756 reconstruction framework, called NetInv, which recasts the inverse guided wave
4757 scattering problem as a data-driven supervised learning progress that realizes
4758 a mapping between reflection coefficients in wavenumber domain and defect
4759 profiles in the spatial domain. The superiorities of the proposed NetInv over
4760 conventional reconstruction methods for defect reconstruction have been
4761 demonstrated by several examples. Results show that NetInv has the ability to
4762 achieve the higher quality of defect profiles with remarkable efficiency and
4763 provides valuable insight into the development of effective data driven
4764 structural health monitoring and defect reconstruction using machine learning.
4765 </p>
4766 </description>
4767 <guid isPermaLink="false">oai:arXiv.org:2010.15605</guid>
4768 </item>
4769 <item>
4770 <title>Design and Evaluation of Electric Bus Systems for Metropolitan Cities. (arXiv:2010.15606v1 [cs.CY])</title>
4771 <link>http://fr.arxiv.org/abs/2010.15606</link>
4772 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Menon_U/0/1/0/all/0/1">Unnikrishnan Menon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Panda_D/0/1/0/all/0/1">Divyani Panda</a></p>
4773
4774 <p>Over the past decade, most of the metropolitan cities across the world have
4775 been witnessing a degrading trend in air quality index. Exhaust emission data
4776 observations show that promotion of public transport could be a potential way
4777 out of this gridlock. Due to environmental concerns, numerous public transport
4778 authorities harbor a great interest in introducing zero emission electric
4779 buses. A shift from conventional diesel buses to electric buses comes with
4780 several benefits in terms of reduction in local pollution, noise, and fuel
4781 consumption. This paper proposes the relevant vehicle technologies, powertrain,
4782 and charging systems, which, in combination, provides a comprehensive
4783 methodology to design an Electric Bus that can be deployed in metropolitan
4784 cities to mitigate emission concerns.
4785 </p>
4786 </description>
4787 <guid isPermaLink="false">oai:arXiv.org:2010.15606</guid>
4788 </item>
4789 <item>
4790 <title>CRICTRS: Embeddings based Statistical and Semi Supervised Cricket Team Recommendation System. (arXiv:2010.15607v1 [cs.CY])</title>
4791 <link>http://fr.arxiv.org/abs/2010.15607</link>
4792 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chhabra_P/0/1/0/all/0/1">Prazwal Chhabra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ali_R/0/1/0/all/0/1">Rizwan Ali</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pudi_V/0/1/0/all/0/1">Vikram Pudi</a></p>
4793
4794 <p>Team Recommendation has always been a challenging aspect in team sports. Such
4795 systems aim to recommend a player combination best suited against the
4796 opposition players, resulting in an optimal outcome. In this paper, we propose
4797 a semi-supervised statistical approach to build a team recommendation system
4798 for cricket by modelling players into embeddings. To build these embeddings, we
4799 design a qualitative and quantitative rating system which considers the
4800 strength of opposition also for evaluating player performance. The embeddings
4801 obtained, describes the strengths and weaknesses of the players based on past
4802 performances of the player. We also embark on a critical aspect of team
4803 composition, which includes the number of batsmen and bowlers in the team. The
4804 team composition changes over time, depending on different factors which are
4805 tough to predict, so we take this input from the user and use the player
4806 embeddings to decide the best possible team combination with the given team
4807 composition.
4808 </p>
4809 </description>
4810 <guid isPermaLink="false">oai:arXiv.org:2010.15607</guid>
4811 </item>
4812 <item>
4813 <title>An Overview Of 3D Object Detection. (arXiv:2010.15614v1 [cs.CV])</title>
4814 <link>http://fr.arxiv.org/abs/2010.15614</link>
4815 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yilin Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ye_J/0/1/0/all/0/1">Jiayi Ye</a></p>
4816
4817 <p>Point cloud 3D object detection has recently received major attention and
4818 becomes an active research topic in 3D computer vision community. However,
4819 recognizing 3D objects in LiDAR (Light Detection and Ranging) is still a
4820 challenge due to the complexity of point clouds. Objects such as pedestrians,
4821 cyclists, or traffic cones are usually represented by quite sparse points,
4822 which makes the detection quite complex using only point cloud. In this
4823 project, we propose a framework that uses both RGB and point cloud data to
4824 perform multiclass object recognition. We use existing 2D detection models to
4825 localize the region of interest (ROI) on the RGB image, followed by a pixel
4826 mapping strategy in the point cloud, and finally, lift the initial 2D bounding
4827 box to 3D space. We use the recently released nuScenes dataset---a large-scale
4828 dataset contains many data formats---to training and evaluate our proposed
4829 architecture.
4830 </p>
4831 </description>
4832 <guid isPermaLink="false">oai:arXiv.org:2010.15614</guid>
4833 </item>
4834 <item>
4835 <title>Sampling and Reconstruction of Sparse Signals in Shift-Invariant Spaces: Generalized Shannon's Theorem Meets Compressive Sensing. (arXiv:2010.15618v1 [eess.SP])</title>
4836 <link>http://fr.arxiv.org/abs/2010.15618</link>
4837 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Vlasic_T/0/1/0/all/0/1">Tin Vla&#x161;i&#x107;</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sersic_D/0/1/0/all/0/1">Damir Ser&#x161;i&#x107;</a></p>
4838
4839 <p>This paper introduces a novel framework and corresponding methods for
4840 sampling and reconstruction of sparse signals in shift-invariant (SI) spaces.
4841 We reinterpret the random demodulator, a system that acquires sparse
4842 bandlimited signals, as a system for acquisition of linear combinations of the
4843 samples in the SI setting with the box function as the sampling kernel. The
4844 sparsity assumption is exploited by compressive sensing (CS) framework for
4845 recovery of the SI samples from a reduced set of measurements. The samples are
4846 subsequently filtered by a discrete-time correction filter in order to
4847 reconstruct expansion coefficients of an observed signal. Furthermore, we offer
4848 a generalization of the proposed framework to other sampling kernels that lie
4849 in arbitrary SI spaces. The generalized method embeds the correction filter in
4850 a CS optimization problem which directly reconstructs expansion coefficients of
4851 the signal. Both approaches recast an inherently infinite-dimensional inverse
4852 problem as a finite-dimensional CS problem in an exact way. Finally, we conduct
4853 numerical experiments on signals in B-spline spaces whose expansion
4854 coefficients are assumed to be sparse in a certain transform domain. The
4855 coefficients can be regarded as parametric models of an underlying continuous
4856 signal, obtained from a reduced set of measurements. Such continuous signal
4857 representations are particularly suitable for signal processing without
4858 converting them into samples.
4859 </p>
4860 </description>
4861 <guid isPermaLink="false">oai:arXiv.org:2010.15618</guid>
4862 </item>
4863 <item>
4864 <title>CAFE: Coarse-to-Fine Neural Symbolic Reasoning for Explainable Recommendation. (arXiv:2010.15620v1 [cs.IR])</title>
4865 <link>http://fr.arxiv.org/abs/2010.15620</link>
4866 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Xian_Y/0/1/0/all/0/1">Yikun Xian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_Z/0/1/0/all/0/1">Zuohui Fu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_H/0/1/0/all/0/1">Handong Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ge_Y/0/1/0/all/0/1">Yingqiang Ge</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_X/0/1/0/all/0/1">Xu Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_Q/0/1/0/all/0/1">Qiaoying Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Geng_S/0/1/0/all/0/1">Shijie Geng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_Z/0/1/0/all/0/1">Zhou Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Melo_G/0/1/0/all/0/1">Gerard de Melo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Muthukrishnan_S/0/1/0/all/0/1">S. Muthukrishnan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yongfeng Zhang</a></p>
4867
4868 <p>Recent research explores incorporating knowledge graphs (KG) into e-commerce
4869 recommender systems, not only to achieve better recommendation performance, but
4870 more importantly to generate explanations of why particular decisions are made.
4871 This can be achieved by explicit KG reasoning, where a model starts from a user
4872 node, sequentially determines the next step, and walks towards an item node of
4873 potential interest to the user. However, this is challenging due to the huge
4874 search space, unknown destination, and sparse signals over the KG, so
4875 informative and effective guidance is needed to achieve a satisfactory
4876 recommendation quality. To this end, we propose a CoArse-to-FinE neural
4877 symbolic reasoning approach (CAFE). It first generates user profiles as coarse
4878 sketches of user behaviors, which subsequently guide a path-finding process to
4879 derive reasoning paths for recommendations as fine-grained predictions. User
4880 profiles can capture prominent user behaviors from the history, and provide
4881 valuable signals about which kinds of path patterns are more likely to lead to
4882 potential items of interest for the user. To better exploit the user profiles,
4883 an improved path-finding algorithm called Profile-guided Path Reasoning (PPR)
4884 is also developed, which leverages an inventory of neural symbolic reasoning
4885 modules to effectively and efficiently find a batch of paths over a large-scale
4886 KG. We extensively experiment on four real-world benchmarks and observe
4887 substantial gains in the recommendation performance compared with
4888 state-of-the-art methods.
4889 </p>
4890 </description>
4891 <guid isPermaLink="false">oai:arXiv.org:2010.15620</guid>
4892 </item>
4893 <item>
4894 <title>Low-Variance Policy Gradient Estimation with World Models. (arXiv:2010.15622v1 [stat.ML])</title>
4895 <link>http://fr.arxiv.org/abs/2010.15622</link>
4896 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Nauman_M/0/1/0/all/0/1">Michal Nauman</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Hengst_F/0/1/0/all/0/1">Floris Den Hengst</a></p>
4897
4898 <p>In this paper, we propose World Model Policy Gradient (WMPG), an approach to
4899 reduce the variance of policy gradient estimates using learned world models
4900 (WM's). In WMPG, a WM is trained online and used to imagine trajectories. The
4901 imagined trajectories are used in two ways. Firstly, to calculate a
4902 without-replacement estimator of the policy gradient. Secondly, the return of
4903 the imagined trajectories is used as an informed baseline. We compare the
4904 proposed approach with AC and MAC on a set of environments of increasing
4905 complexity (CartPole, LunarLander and Pong) and find that WMPG has better
4906 sample efficiency. Based on these results, we conclude that WMPG can yield
4907 increased sample efficiency in cases where a robust latent representation of
4908 the environment can be learned.
4909 </p>
4910 </description>
4911 <guid isPermaLink="false">oai:arXiv.org:2010.15622</guid>
4912 </item>
4913 <item>
4914 <title>Fast Minimal Presentations of Bi-graded Persistence Modules. (arXiv:2010.15623v1 [math.AT])</title>
4915 <link>http://fr.arxiv.org/abs/2010.15623</link>
4916 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Kerber_M/0/1/0/all/0/1">Michael Kerber</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Rolle_A/0/1/0/all/0/1">Alexander Rolle</a></p>
4917
4918 <p>Multi-parameter persistent homology is a recent branch of topological data
4919 analysis. In this area, data sets are investigated through the lens of homology
4920 with respect to two or more scale parameters. The high computational cost of
4921 many algorithms calls for a preprocessing step to reduce the input size. In
4922 general, a minimal presentation is the smallest possible representation of a
4923 persistence module. Lesnick and Wright proposed recently an algorithm (the
4924 LW-algorithm) for computing minimal presentations based on matrix reduction. In
4925 this work, we propose, implement and benchmark several improvements over the
4926 LW-algorithm. Most notably, we propose the use of priority queues to avoid
4927 extensive scanning of the matrix columns, which constitutes the computational
4928 bottleneck in the LW-algorithm, and we combine their algorithm with ideas from
4929 the multi-parameter chunk algorithm by Fugacci and Kerber. Our extensive
4930 experiments show that our algorithm outperforms the LW-algorithm and computes
4931 the minimal presentation for data sets with millions of simplices within a few
4932 seconds. Our software is publicly available.
4933 </p>
4934 </description>
4935 <guid isPermaLink="false">oai:arXiv.org:2010.15623</guid>
4936 </item>
4937 <item>
4938 <title>Abstract Value Iteration for Hierarchical Reinforcement Learning. (arXiv:2010.15638v1 [cs.LG])</title>
4939 <link>http://fr.arxiv.org/abs/2010.15638</link>
4940 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jothimurugan_K/0/1/0/all/0/1">Kishor Jothimurugan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bastani_O/0/1/0/all/0/1">Osbert Bastani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Alur_R/0/1/0/all/0/1">Rajeev Alur</a></p>
4941
4942 <p>We propose a novel hierarchical reinforcement learning framework for control
4943 with continuous state and action spaces. In our framework, the user specifies
4944 subgoal regions which are subsets of states; then, we (i) learn options that
4945 serve as transitions between these subgoal regions, and (ii) construct a
4946 high-level plan in the resulting abstract decision process (ADP). A key
4947 challenge is that the ADP may not be Markov, which we address by proposing two
4948 algorithms for planning in the ADP. Our first algorithm is conservative,
4949 allowing us to prove theoretical guarantees on its performance, which help
4950 inform the design of subgoal regions. Our second algorithm is a practical one
4951 that interweaves planning at the abstract level and learning at the concrete
4952 level. In our experiments, we demonstrate that our approach outperforms
4953 state-of-the-art hierarchical reinforcement learning algorithms on several
4954 challenging benchmarks.
4955 </p>
4956 </description>
4957 <guid isPermaLink="false">oai:arXiv.org:2010.15638</guid>
4958 </item>
4959 <item>
4960 <title>Teaching a GAN What Not to Learn. (arXiv:2010.15639v1 [stat.ML])</title>
4961 <link>http://fr.arxiv.org/abs/2010.15639</link>
4962 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Asokan_S/0/1/0/all/0/1">Siddarth Asokan</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Seelamantula_C/0/1/0/all/0/1">Chandra Sekhar Seelamantula</a></p>
4963
4964 <p>Generative adversarial networks (GANs) were originally envisioned as
4965 unsupervised generative models that learn to follow a target distribution.
4966 Variants such as conditional GANs, auxiliary-classifier GANs (ACGANs) project
4967 GANs on to supervised and semi-supervised learning frameworks by providing
4968 labelled data and using multi-class discriminators. In this paper, we approach
4969 the supervised GAN problem from a different perspective, one that is motivated
4970 by the philosophy of the famous Persian poet Rumi who said, "The art of knowing
4971 is knowing what to ignore." In the GAN framework, we not only provide the GAN
4972 positive data that it must learn to model, but also present it with so-called
4973 negative samples that it must learn to avoid - we call this "The Rumi
4974 Framework." This formulation allows the discriminator to represent the
4975 underlying target distribution better by learning to penalize generated samples
4976 that are undesirable - we show that this capability accelerates the learning
4977 process of the generator. We present a reformulation of the standard GAN (SGAN)
4978 and least-squares GAN (LSGAN) within the Rumi setting. The advantage of the
4979 reformulation is demonstrated by means of experiments conducted on MNIST,
4980 Fashion MNIST, CelebA, and CIFAR-10 datasets. Finally, we consider an
4981 application of the proposed formulation to address the important problem of
4982 learning an under-represented class in an unbalanced dataset. The Rumi approach
4983 results in substantially lower FID scores than the standard GAN frameworks
4984 while possessing better generalization capability.
4985 </p>
4986 </description>
4987 <guid isPermaLink="false">oai:arXiv.org:2010.15639</guid>
4988 </item>
4989 <item>
4990 <title>Free-Form Image Inpainting via Contrastive Attention Network. (arXiv:2010.15643v1 [cs.CV])</title>
4991 <link>http://fr.arxiv.org/abs/2010.15643</link>
4992 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_X/0/1/0/all/0/1">Xin Ma</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_X/0/1/0/all/0/1">Xiaoqiang Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_H/0/1/0/all/0/1">Huaibo Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chai_Z/0/1/0/all/0/1">Zhenhua Chai</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_X/0/1/0/all/0/1">Xiaolin Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_R/0/1/0/all/0/1">Ran He</a></p>
4993
4994 <p>Most deep learning based image inpainting approaches adopt autoencoder or its
4995 variants to fill missing regions in images. Encoders are usually utilized to
4996 learn powerful representational spaces, which are important for dealing with
4997 sophisticated learning tasks. Specifically, in image inpainting tasks, masks
4998 with any shapes can appear anywhere in images (i.e., free-form masks) which
4999 form complex patterns. It is difficult for encoders to capture such powerful
5000 representations under this complex situation. To tackle this problem, we
5001 propose a self-supervised Siamese inference network to improve the robustness
5002 and generalization. It can encode contextual semantics from full resolution
5003 images and obtain more discriminative representations. we further propose a
5004 multi-scale decoder with a novel dual attention fusion module (DAF), which can
5005 combine both the restored and known regions in a smooth way. This multi-scale
5006 architecture is beneficial for decoding discriminative representations learned
5007 by encoders into images layer by layer. In this way, unknown regions will be
5008 filled naturally from outside to inside. Qualitative and quantitative
5009 experiments on multiple datasets, including facial and natural datasets (i.e.,
5010 Celeb-HQ, Pairs Street View, Places2 and ImageNet), demonstrate that our
5011 proposed method outperforms state-of-the-art methods in generating high-quality
5012 inpainting results.
5013 </p>
5014 </description>
5015 <guid isPermaLink="false">oai:arXiv.org:2010.15643</guid>
5016 </item>
5017 <item>
5018 <title>Brain Tumor Segmentation Network Using Attention-based Fusion and Spatial Relationship Constraint. (arXiv:2010.15647v1 [eess.IV])</title>
5019 <link>http://fr.arxiv.org/abs/2010.15647</link>
5020 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_C/0/1/0/all/0/1">Chenyu Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ding_W/0/1/0/all/0/1">Wangbin Ding</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_L/0/1/0/all/0/1">Lei Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_Z/0/1/0/all/0/1">Zhen Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pei_C/0/1/0/all/0/1">Chenhao Pei</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Huang_L/0/1/0/all/0/1">Liqin Huang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhuang_X/0/1/0/all/0/1">Xiahai Zhuang</a></p>
5021
5022 <p>Delineating the brain tumor from magnetic resonance (MR) images is critical
5023 for the treatment of gliomas. However, automatic delineation is challenging due
5024 to the complex appearance and ambiguous outlines of tumors. Considering that
5025 multi-modal MR images can reflect different tumor biological properties, we
5026 develop a novel multi-modal tumor segmentation network (MMTSN) to robustly
5027 segment brain tumors based on multi-modal MR images. The MMTSN is composed of
5028 three sub-branches and a main branch. Specifically, the sub-branches are used
5029 to capture different tumor features from multi-modal images, while in the main
5030 branch, we design a spatial-channel fusion block (SCFB) to effectively
5031 aggregate multi-modal features. Additionally, inspired by the fact that the
5032 spatial relationship between sub-regions of tumor is relatively fixed, e.g.,
5033 the enhancing tumor is always in the tumor core, we propose a spatial loss to
5034 constrain the relationship between different sub-regions of tumor. We evaluate
5035 our method on the test set of multi-modal brain tumor segmentation challenge
5036 2020 (BraTs2020). The method achieves 0.8764, 0.8243 and 0.773 dice score for
5037 whole tumor, tumor core and enhancing tumor, respectively.
5038 </p>
5039 </description>
5040 <guid isPermaLink="false">oai:arXiv.org:2010.15647</guid>
5041 </item>
5042 <item>
5043 <title>Reliable Graph Neural Networks via Robust Aggregation. (arXiv:2010.15651v1 [cs.LG])</title>
5044 <link>http://fr.arxiv.org/abs/2010.15651</link>
5045 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Geisler_S/0/1/0/all/0/1">Simon Geisler</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zugner_D/0/1/0/all/0/1">Daniel Z&#xfc;gner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gunnemann_S/0/1/0/all/0/1">Stephan G&#xfc;nnemann</a></p>
5046
5047 <p>Perturbations targeting the graph structure have proven to be extremely
5048 effective in reducing the performance of Graph Neural Networks (GNNs), and
5049 traditional defenses such as adversarial training do not seem to be able to
5050 improve robustness. This work is motivated by the observation that
5051 adversarially injected edges effectively can be viewed as additional samples to
5052 a node's neighborhood aggregation function, which results in distorted
5053 aggregations accumulating over the layers. Conventional GNN aggregation
5054 functions, such as a sum or mean, can be distorted arbitrarily by a single
5055 outlier. We propose a robust aggregation function motivated by the field of
5056 robust statistics. Our approach exhibits the largest possible breakdown point
5057 of 0.5, which means that the bias of the aggregation is bounded as long as the
5058 fraction of adversarial edges of a node is less than 50\%. Our novel
5059 aggregation function, Soft Medoid, is a fully differentiable generalization of
5060 the Medoid and therefore lends itself well for end-to-end deep learning.
5061 Equipping a GNN with our aggregation improves the robustness with respect to
5062 structure perturbations on Cora ML by a factor of 3 (and 5.5 on Citeseer) and
5063 by a factor of 8 for low-degree nodes.
5064 </p>
5065 </description>
5066 <guid isPermaLink="false">oai:arXiv.org:2010.15651</guid>
5067 </item>
5068 <item>
5069 <title>Semi-Supervised Speech Recognition via Graph-based Temporal Classification. (arXiv:2010.15653v1 [cs.LG])</title>
5070 <link>http://fr.arxiv.org/abs/2010.15653</link>
5071 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Moritz_N/0/1/0/all/0/1">Niko Moritz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hori_T/0/1/0/all/0/1">Takaaki Hori</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Roux_J/0/1/0/all/0/1">Jonathan Le Roux</a></p>
5072
5073 <p>Semi-supervised learning has demonstrated promising results in automatic
5074 speech recognition (ASR) by self-training using a seed ASR model with
5075 pseudo-labels generated for unlabeled data. The effectiveness of this approach
5076 largely relies on the pseudo-label accuracy, for which typically only the
5077 1-best ASR hypothesis is used. However, alternative ASR hypotheses of an N-best
5078 list can provide more accurate labels for an unlabeled speech utterance and
5079 also reflect uncertainties of the seed ASR model. In this paper, we propose a
5080 generalized form of the connectionist temporal classification (CTC) objective
5081 that accepts a graph representation of the training targets. The newly proposed
5082 graph-based temporal classification (GTC) objective is applied for
5083 self-training with WFST-based supervision, which is generated from an N-best
5084 list of pseudo-labels. In this setup, GTC is used to learn not only a temporal
5085 alignment, similarly to CTC, but also a label alignment to obtain the optimal
5086 pseudo-label sequence from the weighted graph. Results show that this approach
5087 can effectively exploit an N-best list of pseudo-labels with associated scores,
5088 outperforming standard pseudo-labeling by a large margin, with ASR results
5089 close to an oracle experiment in which the best hypotheses of the N-best lists
5090 are selected manually.
5091 </p>
5092 </description>
5093 <guid isPermaLink="false">oai:arXiv.org:2010.15653</guid>
5094 </item>
5095 <item>
5096 <title>Identification of complex mixtures for Raman spectroscopy using a novel scheme based on a new multi-label deep neural network. (arXiv:2010.15654v1 [eess.SP])</title>
5097 <link>http://fr.arxiv.org/abs/2010.15654</link>
5098 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Pan_L/0/1/0/all/0/1">Liangrui Pan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pipitsunthonsan_P/0/1/0/all/0/1">Pronthep Pipitsunthonsan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Daengngam_C/0/1/0/all/0/1">Chalongrat Daengngam</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chongcheawchamnan_M/0/1/0/all/0/1">Mitchai Chongcheawchamnan</a></p>
5099
5100 <p>With noisy environment caused by fluoresence and additive white noise as well
5101 as complicated spectrum fingerprints, the identification of complex mixture
5102 materials remains a major challenge in Raman spectroscopy application. In this
5103 paper, we propose a new scheme based on a constant wavelet transform (CWT) and
5104 a deep network for classifying complex mixture. The scheme first transforms the
5105 noisy Raman spectrum to a two-dimensional scale map using CWT. A multi-label
5106 deep neural network model (MDNN) is then applied for classifying material. The
5107 proposed model accelerates the feature extraction and expands the feature graph
5108 using the global averaging pooling layer. The Sigmoid function is implemented
5109 in the last layer of the model. The MDNN model was trained, validated and
5110 tested with data collected from the samples prepared from substances in palm
5111 oil. During training and validating process, data augmentation is applied to
5112 overcome the imbalance of data and enrich the diversity of Raman spectra. From
5113 the test results, it is found that the MDNN model outperforms previously
5114 proposed deep neural network models in terms of Hamming loss, one error,
5115 coverage, ranking loss, average precision, F1 macro averaging and F1 micro
5116 averaging, respectively. The average detection time obtained from our model is
5117 5.31 s, which is much faster than the detection time of the previously proposed
5118 models.
5119 </p>
5120 </description>
5121 <guid isPermaLink="false">oai:arXiv.org:2010.15654</guid>
5122 </item>
5123 <item>
5124 <title>Generalization bounds for deep thresholding networks. (arXiv:2010.15658v1 [math.ST])</title>
5125 <link>http://fr.arxiv.org/abs/2010.15658</link>
5126 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Behboodi_A/0/1/0/all/0/1">Arash Behboodi</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Rauhut_H/0/1/0/all/0/1">Holger Rauhut</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Schnoor_E/0/1/0/all/0/1">Ekkehard Schnoor</a></p>
5127
5128 <p>We consider compressive sensing in the scenario where the sparsity basis
5129 (dictionary) is not known in advance, but needs to be learned from examples.
5130 Motivated by the well-known iterative soft thresholding algorithm for the
5131 reconstruction, we define deep networks parametrized by the dictionary, which
5132 we call deep thresholding networks. Based on training samples, we aim at
5133 learning the optimal sparsifying dictionary and thereby the optimal network
5134 that reconstructs signals from their low-dimensional linear measurements. The
5135 dictionary learning is performed via minimizing the empirical risk. We derive
5136 generalization bounds by analyzing the Rademacher complexity of hypothesis
5137 classes consisting of such deep networks. We obtain estimates of the sample
5138 complexity that depend only linearly on the dimensions and on the depth.
5139 </p>
5140 </description>
5141 <guid isPermaLink="false">oai:arXiv.org:2010.15658</guid>
5142 </item>
5143 <item>
5144 <title>Independence Tests Without Ground Truth for Noisy Learners. (arXiv:2010.15662v1 [stat.ML])</title>
5145 <link>http://fr.arxiv.org/abs/2010.15662</link>
5146 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Corrada_Emmanuel_A/0/1/0/all/0/1">Andr&#xe9;s Corrada-Emmanuel</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Pantridge_E/0/1/0/all/0/1">Edward Pantridge</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Zahrebelski_E/0/1/0/all/0/1">Eddie Zahrebelski</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Chaganti_A/0/1/0/all/0/1">Aditya Chaganti</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Simeonov_S/0/1/0/all/0/1">Simeon Simeonov</a></p>
5147
5148 <p>Exact ground truth invariant polynomial systems can be written for
5149 arbitrarily correlated binary classifiers. Their solutions give estimates for
5150 sample statistics that require knowledge of the ground truth of the correct
5151 labels in the sample. Of these polynomial systems, only a few have been solved
5152 in closed form. Here we discuss the exact solution for independent binary
5153 classifiers - resolving an outstanding problem that has been presented at this
5154 conference and others. Its practical applicability is hampered by its sole
5155 remaining assumption - the classifiers need to be independent in their sample
5156 errors. We discuss how to use the closed form solution to create a
5157 self-consistent test that can validate the independence assumption itself
5158 absent the correct labels ground truth. It can be cast as an algebraic geometry
5159 conjecture for binary classifiers that remains unsolved. A similar conjecture
5160 for the ground truth invariant algebraic system for scalar regressors is
5161 solvable, and we present the solution here. We also discuss experiments on the
5162 Penn ML Benchmark classification tasks that provide further evidence that the
5163 conjecture may be true for the polynomial system of binary classifiers.
5164 </p>
5165 </description>
5166 <guid isPermaLink="false">oai:arXiv.org:2010.15662</guid>
5167 </item>
5168 <item>
5169 <title>Machine Ethics and Automated Vehicles. (arXiv:2010.15665v1 [cs.CY])</title>
5170 <link>http://fr.arxiv.org/abs/2010.15665</link>
5171 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goodall_N/0/1/0/all/0/1">Noah J. Goodall</a></p>
5172
5173 <p>Road vehicle travel at a reasonable speed involves some risk, even when using
5174 computer-controlled driving with failure-free hardware and perfect sensing. A
5175 fully-automated vehicle must continuously decide how to allocate this risk
5176 without a human driver's oversight. These are ethical decisions, particularly
5177 in instances where an automated vehicle cannot avoid crashing. In this chapter,
5178 I introduce the concept of moral behavior for an automated vehicle, argue the
5179 need for research in this area through responses to anticipated critiques, and
5180 discuss relevant applications from machine ethics and moral modeling research.
5181 </p>
5182 </description>
5183 <guid isPermaLink="false">oai:arXiv.org:2010.15665</guid>
5184 </item>
5185 <item>
5186 <title>PeopleXploit -- A hybrid tool to collect public data. (arXiv:2010.15668v1 [cs.CY])</title>
5187 <link>http://fr.arxiv.org/abs/2010.15668</link>
5188 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+V_A/0/1/0/all/0/1">Arjun Anand V</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+K_B/0/1/0/all/0/1">Buvanasri A K</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+R_M/0/1/0/all/0/1">Meenakshi R</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+S_D/0/1/0/all/0/1">Dr. Karthika S</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohan_A/0/1/0/all/0/1">Ashok Kumar Mohan</a></p>
5189
5190 <p>This paper introduces the concept of Open Source Intelligence (OSINT) as an
5191 important application in intelligent profiling of individuals. With a variety
5192 of tools available, significant data shall be obtained on an individual as a
5193 consequence of analyzing his/her internet presence but all of this comes at the
5194 cost of low relevance. To increase the relevance score in profiling,
5195 PeopleXploit is being introduced. PeopleXploit is a hybrid tool which helps in
5196 collecting the publicly available information that is reliable and relevant to
5197 the given input. This tool is used to track and trace the given target with
5198 their digital footprints like Name, Email, Phone Number, User IDs etc. and the
5199 tool will scan &amp; search other associated data from public available records
5200 from the internet and create a summary report against the target. PeopleXploit
5201 profiles a person using authorship analysis and finds the best matching guess.
5202 Also, the type of analysis performed (professional/matrimonial/criminal entity)
5203 varies with the requirement of the user.
5204 </p>
5205 </description>
5206 <guid isPermaLink="false">oai:arXiv.org:2010.15668</guid>
5207 </item>
5208 <item>
5209 <title>Using Twitter to Analyze Political Polarization During National Crises. (arXiv:2010.15669v1 [cs.CY])</title>
5210 <link>http://fr.arxiv.org/abs/2010.15669</link>
5211 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shisode_P/0/1/0/all/0/1">Parth Shisode</a></p>
5212
5213 <p>Democrats and Republicans have seemed to grow apart in the past three
5214 decades. Since the United States as we know it today is undeniably bipartisan,
5215 this phenomenon would not appear as a surprise to most. However, there are
5216 triggers which can cause spikes in disagreements between Democrats and
5217 Republicans at a higher rate than how the two parties have been growing apart
5218 gradually over time. This study has analyzed the idea that national events
5219 which generally are detrimental to all individuals can be one of those
5220 triggers. By testing polarization before and after three events (Hurricane
5221 Sandy [2012], N. Korea Missile Test Surge [2019], COVID-19 [2020]) using
5222 Twitter data, we show that a measurable spike in polarization occurs between
5223 the Democrat and Republican party. In order to measure polarization, sentiments
5224 of Twitter users aligned to the Democrat and Republican parties are compared on
5225 identical entities (events, people, locations, etc.). Using hundreds of
5226 thousands of data samples, a 2.8% increase in polarization was measured during
5227 times of crisis compared to times where no crises were occurring. Regardless of
5228 the reasoning that the gap between political parties can increase so much
5229 during times of suffering and stress, it is definitely alarming to see that
5230 among other aspects of life, the partisan gap worsens during detrimental
5231 national events.
5232 </p>
5233 </description>
5234 <guid isPermaLink="false">oai:arXiv.org:2010.15669</guid>
5235 </item>
5236 <item>
5237 <title>Detecting Individuals with Depressive Disorder fromPersonal Google Search and YouTube History Logs. (arXiv:2010.15670v1 [cs.CY])</title>
5238 <link>http://fr.arxiv.org/abs/2010.15670</link>
5239 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_B/0/1/0/all/0/1">Boyu Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zaman_A/0/1/0/all/0/1">Anis Zaman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Acharyya_R/0/1/0/all/0/1">Rupam Acharyya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hoque_E/0/1/0/all/0/1">Ehsan Hoque</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Silenzio_V/0/1/0/all/0/1">Vincent Silenzio</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kautz_H/0/1/0/all/0/1">Henry Kautz</a></p>
5240
5241 <p>Depressive disorder is one of the most prevalent mental illnesses among the
5242 global population. However, traditional screening methods require exacting
5243 in-person interviews and may fail to provide immediate interventions. In this
5244 work, we leverage ubiquitous personal longitudinal Google Search and YouTube
5245 engagement logs to detect individuals with depressive disorder. We collected
5246 Google Search and YouTube history data and clinical depression evaluation
5247 results from $212$ participants ($99$ of them suffered from moderate to severe
5248 depressions). We then propose a personalized framework for classifying
5249 individuals with and without depression symptoms based on mutual-exciting point
5250 process that captures both the temporal and semantic aspects of online
5251 activities. Our best model achieved an average F1 score of $0.77 \pm 0.04$ and
5252 an AUC ROC of $0.81 \pm 0.02$.
5253 </p>
5254 </description>
5255 <guid isPermaLink="false">oai:arXiv.org:2010.15670</guid>
5256 </item>
5257 <item>
5258 <title>Computing Crisp Bisimulations for Fuzzy Structures. (arXiv:2010.15671v1 [cs.DS])</title>
5259 <link>http://fr.arxiv.org/abs/2010.15671</link>
5260 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_L/0/1/0/all/0/1">Linh Anh Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_D/0/1/0/all/0/1">Dat Xuan Tran</a></p>
5261
5262 <p>Fuzzy structures such as fuzzy automata, fuzzy transition systems, weighted
5263 social networks and fuzzy interpretations in fuzzy description logics have been
5264 widely studied. For such structures, bisimulation is a natural notion for
5265 characterizing indiscernibility between states or individuals. There are two
5266 kinds of bisimulations for fuzzy structures: crisp bisimulations and fuzzy
5267 bisimulations. While the latter fits to the fuzzy paradigm, the former has also
5268 attracted attention due to the application of crisp equivalence relations, for
5269 example, in minimizing structures. Bisimulations can be formulated for fuzzy
5270 labeled graphs and then adapted to other fuzzy structures. In this article, we
5271 present an efficient algorithm for computing the partition corresponding to the
5272 largest crisp bisimulation of a given finite fuzzy labeled graph. Its
5273 complexity is of order $O((m\log{l} + n)\log{n})$, where $n$, $m$ and $l$ are
5274 the number of vertices, the number of nonzero edges and the number of different
5275 fuzzy degrees of edges of the input graph, respectively. We also study a
5276 similar problem for the setting with counting successors, which corresponds to
5277 the case with qualified number restrictions in description logics and graded
5278 modalities in modal logics. In particular, we provide an efficient algorithm
5279 with the complexity $O((m\log{m} + n)\log{n})$ for the considered problem in
5280 that setting.
5281 </p>
5282 </description>
5283 <guid isPermaLink="false">oai:arXiv.org:2010.15671</guid>
5284 </item>
5285 <item>
5286 <title>FD Cell-Free mMIMO: Analysis and Optimization. (arXiv:2010.15672v1 [eess.SP])</title>
5287 <link>http://fr.arxiv.org/abs/2010.15672</link>
5288 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Datta_S/0/1/0/all/0/1">Soumyadeep Datta</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sharma_E/0/1/0/all/0/1">Ekant Sharma</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Amudala_D/0/1/0/all/0/1">Dheeraj Naidu Amudala</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Budhiraja_R/0/1/0/all/0/1">Rohit Budhiraja</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Panwar_S/0/1/0/all/0/1">Shivendra S. Panwar</a></p>
5289
5290 <p>We consider a full-duplex cell-free massive multiple-input-multiple-output
5291 system with limited capacity fronthaul links. We derive its downlink/uplink
5292 closed-form spectral efficiency (SE) lower bounds with maximum-ratio
5293 transmission/maximum-ratio combining and optimal uniform quantization. To
5294 reduce carbon footprint, this paper maximizes the non-convex weighted sum
5295 energy efficiency (WSEE) via downlink and uplink power control, and successive
5296 convex approximation framework. We show that with low fronthaul capacity, the
5297 system requires a higher number of fronthaul quantization bits to achieve high
5298 SE and WSEE. For high fronthaul capacity, higher number of bits, however,
5299 achieves high SE but a reduced WSEE.
5300 </p>
5301 </description>
5302 <guid isPermaLink="false">oai:arXiv.org:2010.15672</guid>
5303 </item>
5304 <item>
5305 <title>Machine Learning Based Demand Modelling for On-Demand Transit Services: A Case Study of Belleville, Ontario. (arXiv:2010.15673v1 [cs.CY])</title>
5306 <link>http://fr.arxiv.org/abs/2010.15673</link>
5307 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Alsaleh_N/0/1/0/all/0/1">Nael Alsaleh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farooq_B/0/1/0/all/0/1">Bilal Farooq</a></p>
5308
5309 <p>The use of mobile applications apps and GPS service on smartphones for
5310 transportation management applications has enabled the new "on-demand mobility"
5311 service, where the transportation supply is following the users' schedule and
5312 routes. In September 2018, the City of Belleville in Canada and Pantonium
5313 operationalized the same idea, but for the public transit service in the city
5314 to develop an on-demand transit (ODT) service. An existing fixed route (RT 11)
5315 public transit service was converted into an on-demand service during the night
5316 as a pilot project to maintain a higher demand sensitivity and highest
5317 operation cost efficiency per trip. In this study, Random Forest (RF), Bagging,
5318 Artificial Neural Network (ANN), and Deep Neural Network (DNN) machine learning
5319 algorithms were adopted to develop a pickup demand model (trip generation) and
5320 a trip demand model (trip distribution model) for Belleville ODT service based
5321 on the dissemination areas' demographic characteristics and the existing trip
5322 characteristics. The developed models aim to explain the demand behavior,
5323 investigate the main factors affecting the trip pattern and their relative
5324 importance, and to predict the number of generated trips from any dissemination
5325 area as well as between any two dissemination areas. The results indicate that
5326 the developed models can predict 63% and 70% of the pickup and trip demand
5327 levels, respectively. Both models are most affected by the month of the year
5328 and the day of the week variables. In addition, the population density has a
5329 higher impact on the ODT service pickup demand levels than the other
5330 demographic characteristics followed by the working age percentages and median
5331 income characteristics. Whereas, the distribution of the trips depends on the
5332 demographic characteristics of the destination area more than the origin area.
5333 </p>
5334 </description>
5335 <guid isPermaLink="false">oai:arXiv.org:2010.15673</guid>
5336 </item>
5337 <item>
5338 <title>Analyzing Societal Impact of COVID-19: A Study During the Early Days of the Pandemic. (arXiv:2010.15674v1 [cs.SI])</title>
5339 <link>http://fr.arxiv.org/abs/2010.15674</link>
5340 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shanthakumar_S/0/1/0/all/0/1">Swaroop Gowdra Shanthakumar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Seetharam_A/0/1/0/all/0/1">Anand Seetharam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramesh_A/0/1/0/all/0/1">Arti Ramesh</a></p>
5341
5342 <p>In this paper, we collect and study Twitter communications to understand the
5343 societal impact of COVID-19 in the United States during the early days of the
5344 pandemic. With infections soaring rapidly, users took to Twitter asking people
5345 to self isolate and quarantine themselves. Users also demanded closure of
5346 schools, bars, and restaurants as well as lockdown of cities and states. We
5347 methodically collect tweets by identifying and tracking trending COVID-related
5348 hashtags. We first manually group the hashtags into six main categories,
5349 namely, 1) General COVID, 2) Quarantine, 3) Panic Buying, 4) School Closures,
5350 5) Lockdowns, and 6) Frustration and Hope}, and study the temporal evolution of
5351 tweets in these hashtags. We conduct a linguistic analysis of words common to
5352 all hashtag groups and specific to each hashtag group and identify the chief
5353 concerns of people as the pandemic gripped the nation (e.g., exploring bidets
5354 as an alternative to toilet paper). We conduct sentiment analysis and our
5355 investigation reveals that people reacted positively to school closures and
5356 negatively to the lack of availability of essential goods due to panic buying.
5357 We adopt a state-of-the-art semantic role labeling approach to identify the
5358 action words and then leverage a LSTM-based dependency parsing model to analyze
5359 the context of action words (e.g., verb deal is accompanied by nouns such as
5360 anxiety, stress, and crisis). Finally, we develop a scalable seeded topic
5361 modeling approach to automatically categorize and isolate tweets into hashtag
5362 groups and experimentally validate that our topic model provides a grouping
5363 similar to our manual grouping. Our study presents a systematic way to
5364 construct an aggregated picture of peoples' response to the pandemic and lays
5365 the groundwork for future fine-grained linguistic and behavioral analysis.
5366 </p>
5367 </description>
5368 <guid isPermaLink="false">oai:arXiv.org:2010.15674</guid>
5369 </item>
5370 <item>
5371 <title>Deep DA for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labeled Videos. (arXiv:2010.15675v1 [cs.CV])</title>
5372 <link>http://fr.arxiv.org/abs/2010.15675</link>
5373 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+R_G/0/1/0/all/0/1">Gnana Praveen R</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Granger_E/0/1/0/all/0/1">Eric Granger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cardinal_P/0/1/0/all/0/1">Patrick Cardinal</a></p>
5374
5375 <p>Automatic estimation of pain intensity from facial expressions in videos has
5376 an immense potential in health care applications. However, domain adaptation
5377 (DA) is needed to alleviate the problem of domain shifts that typically occurs
5378 between video data captured in source and target do-mains. Given the laborious
5379 task of collecting and annotating videos, and the subjective bias due to
5380 ambiguity among adjacent intensity levels, weakly-supervised learning (WSL)is
5381 gaining attention in such applications. Yet, most state-of-the-art WSL models
5382 are typically formulated as regression problems, and do not leverage the
5383 ordinal relation between intensity levels, nor the temporal coherence of
5384 multiple consecutive frames. This paper introduces a new deep learn-ing model
5385 for weakly-supervised DA with ordinal regression(WSDA-OR), where videos in
5386 target domain have coarse la-bels provided on a periodic basis. The WSDA-OR
5387 model enforces ordinal relationships among the intensity levels as-signed to
5388 the target sequences, and associates multiple relevant frames to sequence-level
5389 labels (instead of a single frame). In particular, it learns discriminant and
5390 domain-invariant feature representations by integrating multiple in-stance
5391 learning with deep adversarial DA, where soft Gaussian labels are used to
5392 efficiently represent the weak ordinal sequence-level labels from the target
5393 domain. The proposed approach was validated on the RECOLA video dataset as
5394 fully-labeled source domain, and UNBC-McMaster video data as weakly-labeled
5395 target domain. We have also validated WSDA-OR on BIOVID and Fatigue (private)
5396 datasets for sequence level estimation. Experimental results indicate that our
5397 approach can provide a significant improvement over the state-of-the-art
5398 models, allowing to achieve a greater localization accuracy.
5399 </p>
5400 </description>
5401 <guid isPermaLink="false">oai:arXiv.org:2010.15675</guid>
5402 </item>
5403 <item>
5404 <title>Optimization Fabrics for Behavioral Design. (arXiv:2010.15676v1 [cs.RO])</title>
5405 <link>http://fr.arxiv.org/abs/2010.15676</link>
5406 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ratliff_N/0/1/0/all/0/1">Nathan D. Ratliff</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wyk_K/0/1/0/all/0/1">Karl Van Wyk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_M/0/1/0/all/0/1">Mandy Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_A/0/1/0/all/0/1">Anqi Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rana_A/0/1/0/all/0/1">Asif Muhammad Rana</a></p>
5407
5408 <p>Second-order differential equations define smooth system behavior. In
5409 general, there is no guarantee that a system will behave well when forced by a
5410 potential function, but in some cases they do and may exhibit smooth
5411 optimization properties such as convergence to a local minimum of the
5412 potential. Such a property is desirable in system design since it is inherently
5413 linked to asymptotic stability. This paper presents a comprehensive theory of
5414 optimization fabrics which are second-order differential equations that encode
5415 nominal behaviors on a space and are guaranteed to optimize when forced away
5416 from those nominal trajectories by a potential function. Optimization fabrics,
5417 or fabrics for short, can encode commonalities among optimization problems that
5418 reflect the structure of the space itself, enabling smooth optimization
5419 processes to intelligently navigate each problem even when the potential
5420 function is simple and relatively naive. Importantly, optimization over a
5421 fabric is asymptotically stable, so optimization fabrics constitute a building
5422 block for provably stable system design.
5423 </p>
5424 </description>
5425 <guid isPermaLink="false">oai:arXiv.org:2010.15676</guid>
5426 </item>
5427 <item>
5428 <title>On the Failure of the Smart Approach of the GPT Cryptosystem. (arXiv:2010.15678v1 [cs.CR])</title>
5429 <link>http://fr.arxiv.org/abs/2010.15678</link>
5430 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kalachi_H/0/1/0/all/0/1">Herve Tale Kalachi</a></p>
5431
5432 <p>This paper describes a new algorithm for breaking the smart approach of the
5433 GPT cryptosystem. We show that by puncturing the public code several times on
5434 specific positions, we get a public code on which applying the Frobenius
5435 operator appropriately allows to build an alternative secret key.
5436 </p>
5437 </description>
5438 <guid isPermaLink="false">oai:arXiv.org:2010.15678</guid>
5439 </item>
5440 <item>
5441 <title>Lie-Trotter Splitting for the Nonlinear Stochastic Manakov System. (arXiv:2010.15679v1 [math.AP])</title>
5442 <link>http://fr.arxiv.org/abs/2010.15679</link>
5443 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Berg_A/0/1/0/all/0/1">Andr&#xe9; Berg</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Cohen_D/0/1/0/all/0/1">David Cohen</a> (Chalmers), <a href="http://fr.arxiv.org/find/math/1/au:+Dujardin_G/0/1/0/all/0/1">Guillaume Dujardin</a> (LPP)</p>
5444
5445 <p>This article analyses the convergence of the Lie-Trotter splitting scheme for
5446 the stochastic Manakov equation, a system arising in the study of pulse
5447 propagation in randomly birefringent optical fibers. First, we prove that the
5448 strong order of the numerical approximation is 1/2 if the nonlinear term in the
5449 system is globally Lipschitz. Then, we show that the splitting scheme has
5450 convergence order 1/2 in probability and almost sure order 1/2- in the case of
5451 a cubic nonlinearity. We provide several numerical experiments illustrating the
5452 aforementioned results and the efficiency of the Lie-Trotter splitting scheme.
5453 Finally, we numerically investigate the possible blowup of solutions for some
5454 power-law nonlinearities.
5455 </p>
5456 </description>
5457 <guid isPermaLink="false">oai:arXiv.org:2010.15679</guid>
5458 </item>
5459 <item>
5460 <title>LSTM for Model-Based Anomaly Detection in Cyber-Physical Systems. (arXiv:2010.15680v1 [cs.LG])</title>
5461 <link>http://fr.arxiv.org/abs/2010.15680</link>
5462 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Eiteneuer_B/0/1/0/all/0/1">Benedikt Eiteneuer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Niggemann_O/0/1/0/all/0/1">Oliver Niggemann</a></p>
5463
5464 <p>Anomaly detection is the task of detecting data which differs from the normal
5465 behaviour of a system in a given context. In order to approach this problem,
5466 data-driven models can be learned to predict current or future observations.
5467 Oftentimes, anomalous behaviour depends on the internal dynamics of the system
5468 and looks normal in a static context. To address this problem, the model should
5469 also operate depending on state. Long Short-Term Memory (LSTM) neural networks
5470 have been shown to be particularly useful to learn time sequences with varying
5471 length of temporal dependencies and are therefore an interesting general
5472 purpose approach to learn the behaviour of arbitrarily complex Cyber-Physical
5473 Systems. In order to perform anomaly detection, we slightly modify the standard
5474 norm 2 error to incorporate an estimate of model uncertainty. We analyse the
5475 approach on artificial and real data.
5476 </p>
5477 </description>
5478 <guid isPermaLink="false">oai:arXiv.org:2010.15680</guid>
5479 </item>
5480 <item>
5481 <title>Maximum a posteriori signal recovery for optical coherence tomography angiography image generation and denoising. (arXiv:2010.15682v1 [eess.IV])</title>
5482 <link>http://fr.arxiv.org/abs/2010.15682</link>
5483 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Husvogt_L/0/1/0/all/0/1">Lennart Husvogt</a> (1 and 2), <a href="http://fr.arxiv.org/find/eess/1/au:+Ploner_S/0/1/0/all/0/1">Stefan B. Ploner</a> (1), <a href="http://fr.arxiv.org/find/eess/1/au:+Chen_S/0/1/0/all/0/1">Siyu Chen</a> (2), <a href="http://fr.arxiv.org/find/eess/1/au:+Stromer_D/0/1/0/all/0/1">Daniel Stromer</a> (1, 2), <a href="http://fr.arxiv.org/find/eess/1/au:+Schottenhamml_J/0/1/0/all/0/1">Julia Schottenhamml</a> (1), <a href="http://fr.arxiv.org/find/eess/1/au:+Alibhai_A/0/1/0/all/0/1">A. Yasin Alibhai</a> (3), <a href="http://fr.arxiv.org/find/eess/1/au:+Moult_E/0/1/0/all/0/1">Eric Moult</a> (2), <a href="http://fr.arxiv.org/find/eess/1/au:+Waheed_N/0/1/0/all/0/1">Nadia K. Waheed</a> (3), <a href="http://fr.arxiv.org/find/eess/1/au:+Fujimoto_J/0/1/0/all/0/1">James G. Fujimoto</a> (2), <a href="http://fr.arxiv.org/find/eess/1/au:+Maier_A/0/1/0/all/0/1">Andreas Maier</a> (1) ((1) Friedrich-Alexander-Universit&#xe4;t Erlangen-N&#xfc;rnberg Germany, (2) Massachusetts Institute of Technology USA, (3) Tufts School of Medicine USA)</p>
5484
5485 <p>Optical coherence tomography angiography (OCTA) is a novel and clinically
5486 promising imaging modality to image retinal and sub-retinal vasculature. Based
5487 on repeated optical coherence tomography (OCT) scans, intensity changes are
5488 observed over time and used to compute OCTA image data. OCTA data are prone to
5489 noise and artifacts caused by variations in flow speed and patient movement. We
5490 propose a novel iterative maximum a posteriori signal recovery algorithm in
5491 order to generate OCTA volumes with reduced noise and increased image quality.
5492 This algorithm is based on previous work on probabilistic OCTA signal models
5493 and maximum likelihood estimates. Reconstruction results using total variation
5494 minimization and wavelet shrinkage for regularization were compared against an
5495 OCTA ground truth volume, merged from six co-registered single OCTA volumes.
5496 The results show a significant improvement in peak signal-to-noise ratio and
5497 structural similarity. The presented algorithm brings together OCTA image
5498 generation and Bayesian statistics and can be developed into new OCTA image
5499 generation and denoising algorithms.
5500 </p>
5501 </description>
5502 <guid isPermaLink="false">oai:arXiv.org:2010.15682</guid>
5503 </item>
5504 <item>
5505 <title>Resilient Energy Efficient Healthcare Monitoring Infrastructure with Server and Network Protection. (arXiv:2010.15683v1 [eess.SY])</title>
5506 <link>http://fr.arxiv.org/abs/2010.15683</link>
5507 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Isa_I/0/1/0/all/0/1">Ida Syafiza M. Isa</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+El_Gorashi_T/0/1/0/all/0/1">Taisir E.H. El-Gorashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Musa_M/0/1/0/all/0/1">Mohamed O.I. Musa</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Elmirghani_J/0/1/0/all/0/1">J.M.H. Elmirghani</a></p>
5508
5509 <p>In this paper, a 1+1 server protection scheme is considered where two
5510 servers, a primary and a secondary processing server are used to serve ECG
5511 monitoring applications concurrently. The infrastructure is designed to be
5512 resilient against server failure under two scenarios related to the geographic
5513 location of primary and secondary servers and resilient against both server and
5514 network failures. A Mixed Integer Linear Programming (MILP) model is used to
5515 optimise the number and locations of both primary and secondary processing
5516 servers so that the energy consumption of the networking equipment and
5517 processing are minimised. The results show that considering a scenario for
5518 server protection without geographical constraints compared to the
5519 non-resilient scenario has resulted in both network and processing energy
5520 penalty as the traffic is doubled. The results also reveal that increasing the
5521 level of resilience to consider geographical constraints compared to case
5522 without geographical constraints resulted in higher network energy penalty when
5523 the demand is low as more nodes are utilised to place the processing servers
5524 under the geographic constraints. Also, increasing the level of resilience to
5525 consider network protection with link and node disjoint selection has resulted
5526 in a low network energy penalty at high demands due to the activation of a
5527 large part of the network in any case due to the demands. However, the results
5528 show that the network energy penalty is reduced with the increasing number of
5529 processing servers at each candidate node. Meanwhile, the same energy for
5530 processing is consumed regardless of the increasing level of resilience as the
5531 same number of processing servers are utilised. A heuristic is developed for
5532 each resilience scenario for real-time implementation where the results show
5533 that the performance of the heuristic is approaching the results of the MILP
5534 model.
5535 </p>
5536 </description>
5537 <guid isPermaLink="false">oai:arXiv.org:2010.15683</guid>
5538 </item>
5539 <item>
5540 <title>Governance & Autonomy: Towards a Governance-based Analysis of Autonomy in Cyber-Physical Systems-of-Systems. (arXiv:2010.15684v1 [cs.SE])</title>
5541 <link>http://fr.arxiv.org/abs/2010.15684</link>
5542 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gharib_M/0/1/0/all/0/1">Mohamad Gharib</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lollini_P/0/1/0/all/0/1">Paolo Lollini</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ceccarelli_A/0/1/0/all/0/1">Andrea Ceccarelli</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bondavalli_A/0/1/0/all/0/1">Andrea Bondavalli</a></p>
5543
5544 <p>One of the main challenges in integrating Cyber-Physical System-of-Systems
5545 (CPSoS) to function as a single unified system is the autonomy of its
5546 Cyber-Physical Systems (CPSs), which may lead to a lack of coordination among
5547 CPSs and results in various kinds of conflicts. We advocate that to efficiently
5548 integrate CPSs within the CPSoS, we may need to adjust the autonomy of some
5549 CPSs in a way that allows them to coordinate their activities to avoid any
5550 potential conflict among one another. To achieve that, we need to incorporate
5551 the notion of governance within the design of CPSoS, which defines rules that
5552 can be used for clearly specifying who and how can adjust the autonomy of a
5553 CPS. In this paper, we try to tackle this problem by proposing a new conceptual
5554 model that can be used for performing a governance-based analysis of autonomy
5555 for CPSs within CPSoS. We illustrate the utility of the model with an example
5556 from the automotive domain.
5557 </p>
5558 </description>
5559 <guid isPermaLink="false">oai:arXiv.org:2010.15684</guid>
5560 </item>
5561 <item>
5562 <title>Deep Autofocus for Synthetic Aperture Sonar. (arXiv:2010.15687v1 [eess.IV])</title>
5563 <link>http://fr.arxiv.org/abs/2010.15687</link>
5564 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Gerg_I/0/1/0/all/0/1">Isaac Gerg</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Monga_V/0/1/0/all/0/1">Vishal Monga</a></p>
5565
5566 <p>Synthetic aperture sonar (SAS) requires precise positional and environmental
5567 information to produce well-focused output during the image reconstruction
5568 step. However, errors in these measurements are commonly present resulting in
5569 defocused imagery. To overcome these issues, an \emph{autofocus} algorithm is
5570 employed as a post-processing step after image reconstruction for the purpose
5571 of improving image quality using the image content itself. These algorithms are
5572 usually iterative and metric-based in that they seek to optimize an image
5573 sharpness metric. In this letter, we demonstrate the potential of machine
5574 learning, specifically deep learning, to address the autofocus problem. We
5575 formulate the problem as a self-supervised, phase error estimation task using a
5576 deep network we call Deep Autofocus. Our formulation has the advantages of
5577 being non-iterative (and thus fast) and not requiring ground truth
5578 focused-defocused images pairs as often required by other deblurring deep
5579 learning methods. We compare our technique against a set of common sharpness
5580 metrics optimized using gradient descent over a real-world dataset. Our results
5581 demonstrate Deep Autofocus can produce imagery that is perceptually as good as
5582 benchmark iterative techniques but at a substantially lower computational cost.
5583 We conclude that our proposed Deep Autofocus can provide a more favorable
5584 cost-quality trade-off than state-of-the-art alternatives with significant
5585 potential of future research.
5586 </p>
5587 </description>
5588 <guid isPermaLink="false">oai:arXiv.org:2010.15687</guid>
5589 </item>
5590 <item>
5591 <title>Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration. (arXiv:2010.15689v1 [cs.CV])</title>
5592 <link>http://fr.arxiv.org/abs/2010.15689</link>
5593 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_F/0/1/0/all/0/1">Feng Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cong_R/0/1/0/all/0/1">Runmin Cong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bai_H/0/1/0/all/0/1">Huihui Bai</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_Y/0/1/0/all/0/1">Yifan He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhao_Y/0/1/0/all/0/1">Yao Zhao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_C/0/1/0/all/0/1">Ce Zhu</a></p>
5594
5595 <p>Recently, convolutional neural network (CNN) has demonstrated significant
5596 success for image restoration (IR) tasks (e.g., image super-resolution, image
5597 deblurring, rain streak removal, and dehazing). However, existing CNN based
5598 models are commonly implemented as a single-path stream to enrich feature
5599 representations from low-quality (LQ) input space for final predictions, which
5600 fail to fully incorporate preceding low-level contexts into later high-level
5601 features within networks, thereby producing inferior results. In this paper, we
5602 present a deep interleaved network (DIN) that learns how information at
5603 different states should be combined for high-quality (HQ) images
5604 reconstruction. The proposed DIN follows a multi-path and multi-branch pattern
5605 allowing multiple interconnected branches to interleave and fuse at different
5606 states. In this way, the shallow information can guide deep representative
5607 features prediction to enhance the feature expression ability. Furthermore, we
5608 propose asymmetric co-attention (AsyCA) which is attached at each interleaved
5609 node to model the feature dependencies. Such AsyCA can not only adaptively
5610 emphasize the informative features from different states, but also improves the
5611 discriminative ability of networks. Our presented DIN can be trained end-to-end
5612 and applied to various IR tasks. Comprehensive evaluations on public benchmarks
5613 and real-world datasets demonstrate that the proposed DIN perform favorably
5614 against the state-of-the-art methods quantitatively and qualitatively.
5615 </p>
5616 </description>
5617 <guid isPermaLink="false">oai:arXiv.org:2010.15689</guid>
5618 </item>
5619 <item>
5620 <title>Analyzing the tree-layer structure of Deep Forests. (arXiv:2010.15690v1 [cs.LG])</title>
5621 <link>http://fr.arxiv.org/abs/2010.15690</link>
5622 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Arnould_L/0/1/0/all/0/1">Ludovic Arnould</a> (LPSM UMR 8001), <a href="http://fr.arxiv.org/find/cs/1/au:+Boyer_C/0/1/0/all/0/1">Claire Boyer</a> (LPSM UMR 8001), <a href="http://fr.arxiv.org/find/cs/1/au:+Scornet_E/0/1/0/all/0/1">Erwan Scornet</a> (CMAP)</p>
5623
5624 <p>Random forests on the one hand, and neural networks on the other hand, have
5625 met great success in the machine learning community for their predictive
5626 performance. Combinations of both have been proposed in the literature, notably
5627 leading to the so-called deep forests (DF) [25]. In this paper, we investigate
5628 the mechanisms at work in DF and outline that DF architecture can generally be
5629 simplified into more simple and computationally efficient shallow forests
5630 networks. Despite some instability, the latter may outperform standard
5631 predictive tree-based methods. In order to precisely quantify the improvement
5632 achieved by these light network configurations over standard tree learners, we
5633 theoretically study the performance of a shallow tree network made of two
5634 layers, each one composed of a single centered tree. We provide tight
5635 theoretical lower and upper bounds on its excess risk. These theoretical
5636 results show the interest of tree-network architectures for well-structured
5637 data provided that the first layer, acting as a data encoder, is rich enough.
5638 </p>
5639 </description>
5640 <guid isPermaLink="false">oai:arXiv.org:2010.15690</guid>
5641 </item>
5642 <item>
5643 <title>Unveiling process insights from refactoring practices. (arXiv:2010.15692v1 [cs.SE])</title>
5644 <link>http://fr.arxiv.org/abs/2010.15692</link>
5645 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Caldeira_J/0/1/0/all/0/1">Jo&#xe3;o Caldeira</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Abreu_F/0/1/0/all/0/1">Fernando Brito e Abreu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cardoso_J/0/1/0/all/0/1">Jorge Cardoso</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Reis_J/0/1/0/all/0/1">Jos&#xe9; Reis</a></p>
5646
5647 <p>Context : Software comprehension and maintenance activities, such as
5648 refactoring, are said to be negatively impacted by software complexity. The
5649 methods used to measure software product and processes complexity have been
5650 thoroughly debated in the literature. However, the discernment about the
5651 possible links between these two dimensions, particularly on the benefits of
5652 using the process perspective, has a long journey ahead. Objective: To improve
5653 the understanding of the liaison of developers' activities and software
5654 complexity within a refactoring task, namely by evaluating if process metrics
5655 gathered from the IDE, using process mining methods and tools, are suitable to
5656 accurately classify different refactoring practices and the resulting software
5657 complexity. Method: We mined source code metrics from a software product after
5658 a quality improvement task was given in parallel to (117) software developers,
5659 organized in (71) teams. Simultaneously, we collected events from their IDE
5660 work sessions (320) and used process mining to model their processes and
5661 extract the correspondent metrics. Results: Most teams using a plugin for
5662 refactoring (JDeodorant) reduced software complexity more effectively and with
5663 simpler processes than the ones that performed refactoring using only Eclipse
5664 native features. We were able to find moderate correlations (43%) between
5665 software cyclomatic complexity and process cyclomatic complexity. The best
5666 models found for the refactoring method and cyclomatic complexity level
5667 predictions, had an accuracy of 92.95% and 94.36%, respectively. Conclusions:
5668 Our approach agnostic to programming languages, geographic location, or
5669 development practices. Initial findings are encouraging, and lead us to suggest
5670 practitioners may use our method in other development tasks, such as, defect
5671 analysis and unit or integration tests.
5672 </p>
5673 </description>
5674 <guid isPermaLink="false">oai:arXiv.org:2010.15692</guid>
5675 </item>
5676 <item>
5677 <title>Learning interaction kernels in mean-field equations of 1st-order systems of interacting particles. (arXiv:2010.15694v1 [stat.ML])</title>
5678 <link>http://fr.arxiv.org/abs/2010.15694</link>
5679 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Lang_Q/0/1/0/all/0/1">Quanjun Lang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Lu_F/0/1/0/all/0/1">Fei Lu</a></p>
5680
5681 <p>We introduce a nonparametric algorithm to learn interaction kernels of
5682 mean-field equations for 1st-order systems of interacting particles. The data
5683 consist of discrete space-time observations of the solution. By least squares
5684 with regularization, the algorithm learns the kernel on data-adaptive
5685 hypothesis spaces efficiently. A key ingredient is a probabilistic error
5686 functional derived from the likelihood of the mean-field equation's diffusion
5687 process. The estimator converges, in a reproducing kernel Hilbert space and an
5688 L2 space under an identifiability condition, at a rate optimal in the sense
5689 that it equals the numerical integrator's order. We demonstrate our algorithm
5690 on three typical examples: the opinion dynamics with a piecewise linear kernel,
5691 the granular media model with a quadratic kernel, and the aggregation-diffusion
5692 with a repulsive-attractive kernel.
5693 </p>
5694 </description>
5695 <guid isPermaLink="false">oai:arXiv.org:2010.15694</guid>
5696 </item>
5697 <item>
5698 <title>Generalized Insider Attack Detection Implementation using NetFlow Data. (arXiv:2010.15697v1 [cs.CR])</title>
5699 <link>http://fr.arxiv.org/abs/2010.15697</link>
5700 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Samtani_Y/0/1/0/all/0/1">Yash Samtani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Elwell_J/0/1/0/all/0/1">Jesse Elwell</a></p>
5701
5702 <p>Insider Attack Detection in commercial networks is a critical problem that
5703 does not have any good solutions at this current time. The problem is
5704 challenging due to the lack of visibility into live networks and a lack of a
5705 standard feature set to distinguish between different attacks. In this paper,
5706 we study an approach centered on using network data to identify attacks. Our
5707 work builds on unsupervised machine learning techniques such as One-Class SVM
5708 and bi-clustering as weak indicators of insider network attacks. We combine
5709 these techniques to limit the number of false positives to an acceptable level
5710 required for real-world deployments by using One-Class SVM to check for
5711 anomalies detected by the proposed Bi-clustering algorithm. We present a
5712 prototype implementation in Python and associated results for two different
5713 real-world representative data sets. We show that our approach is a promising
5714 tool for insider attack detection in realistic settings.
5715 </p>
5716 </description>
5717 <guid isPermaLink="false">oai:arXiv.org:2010.15697</guid>
5718 </item>
5719 <item>
5720 <title>Constrained Online Learning to Mitigate Distortion Effects in Pulse-Agile Cognitive Radar. (arXiv:2010.15698v1 [cs.IT])</title>
5721 <link>http://fr.arxiv.org/abs/2010.15698</link>
5722 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Thornton_C/0/1/0/all/0/1">Charles E. Thornton</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Buehrer_R/0/1/0/all/0/1">R. Michael Buehrer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Martone_A/0/1/0/all/0/1">Anthony F. Martone</a></p>
5723
5724 <p>Pulse-agile radar systems have demonstrated favorable performance in dynamic
5725 electromagnetic scenarios. However, the use of non-identical waveforms within a
5726 radar's coherent processing interval may lead to harmful distortion effects
5727 when pulse-Doppler processing is used. This paper presents an online learning
5728 framework to optimize detection performance while mitigating harmful sidelobe
5729 levels. The radar waveform selection process is formulated as a linear
5730 contextual bandit problem, within which waveform adaptations which exceed a
5731 tolerable level of expected distortion are eliminated. The constrained online
5732 learning approach is effective and computationally feasible, evidenced by
5733 simulations in a radar-communication coexistence scenario and in the presence
5734 of intentional adaptive jamming. This approach is applied to both stochastic
5735 and adversarial contextual bandit learning models and the detection performance
5736 in dynamic scenarios is evaluated.
5737 </p>
5738 </description>
5739 <guid isPermaLink="false">oai:arXiv.org:2010.15698</guid>
5740 </item>
5741 <item>
5742 <title>Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks. (arXiv:2010.15703v1 [cs.CV])</title>
5743 <link>http://fr.arxiv.org/abs/2010.15703</link>
5744 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Martinez_J/0/1/0/all/0/1">Julieta Martinez</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shewakramani_J/0/1/0/all/0/1">Jashan Shewakramani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_T/0/1/0/all/0/1">Ting Wei Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Barsan_I/0/1/0/all/0/1">Ioan Andrei B&#xe2;rsan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zeng_W/0/1/0/all/0/1">Wenyuan Zeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Urtasun_R/0/1/0/all/0/1">Raquel Urtasun</a></p>
5745
5746 <p>Compressing large neural networks is an important step for their deployment
5747 in resource-constrained computational platforms. In this context, vector
5748 quantization is an appealing framework that expresses multiple parameters using
5749 a single code, and has recently achieved state-of-the-art network compression
5750 on a range of core vision and natural language processing tasks. Key to the
5751 success of vector quantization is deciding which parameter groups should be
5752 compressed together. Previous work has relied on heuristics that group the
5753 spatial dimension of individual convolutional filters, but a general solution
5754 remains unaddressed. This is desirable for pointwise convolutions (which
5755 dominate modern architectures), linear layers (which have no notion of spatial
5756 dimension), and convolutions (when more than one filter is compressed to the
5757 same codeword). In this paper we make the observation that the weights of two
5758 adjacent layers can be permuted while expressing the same function. We then
5759 establish a connection to rate-distortion theory and search for permutations
5760 that result in networks that are easier to compress. Finally, we rely on an
5761 annealed quantization algorithm to better compress the network and achieve
5762 higher final accuracy. We show results on image classification, object
5763 detection, and segmentation, reducing the gap with the uncompressed model by 40
5764 to 70% with respect to the current state of the art.
5765 </p>
5766 </description>
5767 <guid isPermaLink="false">oai:arXiv.org:2010.15703</guid>
5768 </item>
5769 <item>
5770 <title>5W1H-based Expression for the Effective Sharing of Information in Digital Forensic Investigations. (arXiv:2010.15711v1 [cs.CR])</title>
5771 <link>http://fr.arxiv.org/abs/2010.15711</link>
5772 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Han_J/0/1/0/all/0/1">Jaehyeok Han</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kim_J/0/1/0/all/0/1">Jieon Kim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_S/0/1/0/all/0/1">Sangjin Lee</a></p>
5773
5774 <p>Digital forensic investigation is used in various areas related to digital
5775 devices including the cyber crime. This is an investigative process using many
5776 techniques, which have implemented as tools. The types of files covered by the
5777 digital forensic investigation are wide and varied, however, there is no way to
5778 express the results into a standardized format. The standardization are
5779 different by types of device, file system, or application. Different outputs
5780 make it time-consuming and difficult to share information and to implement
5781 integration. In addition, it could weaken cyber security. Thus, it is important
5782 to define normalization and to present data in the same format. In this paper,
5783 a 5W1H-based expression for information sharing for effective digital forensic
5784 investigation is proposed to analyze digital forensic information using six
5785 questions--what, who, where, when, why and how. Based on the 5W1H-based
5786 expression, digital information from different types of files is converted and
5787 represented in the same format of outputs. As the 5W1H is the basic writing
5788 principle, application of the 5W1H-based expression on the case studies shows
5789 that this expression enhances clarity and correctness for information sharing.
5790 Furthermore, in the case of security incidents, this expression has an
5791 advantage in being compatible with STIX.
5792 </p>
5793 </description>
5794 <guid isPermaLink="false">oai:arXiv.org:2010.15711</guid>
5795 </item>
5796 <item>
5797 <title>Playing a Part: Speaker Verification at the Movies. (arXiv:2010.15716v1 [cs.SD])</title>
5798 <link>http://fr.arxiv.org/abs/2010.15716</link>
5799 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Brown_A/0/1/0/all/0/1">Andrew Brown</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huh_J/0/1/0/all/0/1">Jaesung Huh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nagrani_A/0/1/0/all/0/1">Arsha Nagrani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chung_J/0/1/0/all/0/1">Joon Son Chung</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zisserman_A/0/1/0/all/0/1">Andrew Zisserman</a></p>
5800
5801 <p>The goal of this work is to investigate the performance of popular speaker
5802 recognition models on speech segments from movies, where often actors
5803 intentionally disguise their voice to play a character. We make the following
5804 three contributions: (i) We collect a novel, challenging speaker recognition
5805 dataset called VoxMovies, with speech for 856 identities from almost 4000 movie
5806 clips. VoxMovies contains utterances with varying emotion, accents and
5807 background noise, and therefore comprises an entirely different domain to the
5808 interview-style, emotionally calm utterances in current speaker recognition
5809 datasets such as VoxCeleb; (ii) We provide a number of domain adaptation
5810 evaluation sets, and benchmark the performance of state-of-the-art speaker
5811 recognition models on these evaluation pairs. We demonstrate that both speaker
5812 verification and identification performance drops steeply on this new data,
5813 showing the challenge in transferring models across domains; and finally (iii)
5814 We show that simple domain adaptation paradigms improve performance, but there
5815 is still large room for improvement.
5816 </p>
5817 </description>
5818 <guid isPermaLink="false">oai:arXiv.org:2010.15716</guid>
5819 </item>
5820 <item>
5821 <title>What can we learn from gradients?. (arXiv:2010.15718v1 [cs.CR])</title>
5822 <link>http://fr.arxiv.org/abs/2010.15718</link>
5823 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_J/0/1/0/all/0/1">Jia Qian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hansen_L/0/1/0/all/0/1">Lars Kai Hansen</a></p>
5824
5825 <p>Recent work (\cite{zhu2019deep}) has shown that it is possible to reconstruct
5826 the input (image) from the gradient of a neural network. In this paper, our aim
5827 is to better understand the limits to reconstruction and to speed up image
5828 reconstruction by imposing prior image information and improved initialization.
5829 Firstly, we show that for the \textbf{non-linear} neural network,
5830 gradient-based reconstruction approximates to solving a high-dimension
5831 \textbf{linear} equations for both fully-connected neural network and
5832 convolutional neural network. Exploring the theoretical limits of input
5833 reconstruction, we show that a fully-connected neural network with a
5834 \textbf{one} hidden node is enough to reconstruct a \textbf{single} input
5835 image, regardless of the number of nodes in the output layer. Then we
5836 generalize this result to a gradient averaged over mini-batches of size B. In
5837 this case, the full mini-batch can be reconstructed in a fully-connected
5838 network if the number of hidden units exceeds B. For a convolutional neural
5839 network, the required number of filters in the first convolutional layer again
5840 is decided by the batch size B, however, in this case, input width d and the
5841 width after filter $d^{'}$ also play the role $h=(\frac{d}{d^{'}})^2BC$, where
5842 C is channel number of input. Finally, we validate and underpin our theoretical
5843 analysis on bio-medical data (fMRI, ECG signals, and cell images) and on
5844 benchmark data (MNIST, CIFAR100, and face images).
5845 </p>
5846 </description>
5847 <guid isPermaLink="false">oai:arXiv.org:2010.15718</guid>
5848 </item>
5849 <item>
5850 <title>Attentive Clustering Processes. (arXiv:2010.15727v1 [stat.ML])</title>
5851 <link>http://fr.arxiv.org/abs/2010.15727</link>
5852 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Pakman_A/0/1/0/all/0/1">Ari Pakman</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wang_Y/0/1/0/all/0/1">Yueqi Wang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Lee_Y/0/1/0/all/0/1">Yoonho Lee</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Basu_P/0/1/0/all/0/1">Pallab Basu</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Lee_J/0/1/0/all/0/1">Juho Lee</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Teh_Y/0/1/0/all/0/1">Yee Whye Teh</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Paninski_L/0/1/0/all/0/1">Liam Paninski</a></p>
5853
5854 <p>Amortized approaches to clustering have recently received renewed attention
5855 thanks to novel objective functions that exploit the expressiveness of deep
5856 learning models. In this work we revisit a recent proposal for fast amortized
5857 probabilistic clustering, the Clusterwise Clustering Process (CCP), which
5858 yields samples from the posterior distribution of cluster labels for sets of
5859 arbitrary size using only O(K) forward network evaluations, where K is an
5860 arbitrary number of clusters. While adequate in simple datasets, we show that
5861 the model can severely underfit complex datasets, and hypothesize that this
5862 limitation can be traced back to the implicit assumption that the probability
5863 of a point joining a cluster is equally sensitive to all the points available
5864 to join the same cluster. We propose an improved model, the Attentive
5865 Clustering Process (ACP), that selectively pays more attention to relevant
5866 points while preserving the invariance properties of the generative model. We
5867 illustrate the advantages of the new model in applications to spike-sorting in
5868 multi-electrode arrays and community discovery in networks. The latter case
5869 combines the ACP model with graph convolutional networks, and to our knowledge
5870 is the first deep learning model that handles an arbitrary number of
5871 communities.
5872 </p>
5873 </description>
5874 <guid isPermaLink="false">oai:arXiv.org:2010.15727</guid>
5875 </item>
5876 <item>
5877 <title>Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation. (arXiv:2010.15728v1 [cs.CL])</title>
5878 <link>http://fr.arxiv.org/abs/2010.15728</link>
5879 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_H/0/1/0/all/0/1">Hang Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Suarez_Paniagua_V/0/1/0/all/0/1">V&#xed;ctor Su&#xe1;rez-Paniagua</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Whiteley_W/0/1/0/all/0/1">William Whiteley</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_H/0/1/0/all/0/1">Honghan Wu</a></p>
5880
5881 <p>Diagnostic or procedural coding of clinical notes aims to derive a coded
5882 summary of disease-related information about patients. Such coding is usually
5883 done manually in hospitals but could potentially be automated to improve the
5884 efficiency and accuracy of medical coding. Recent studies on deep learning for
5885 automated medical coding achieved promising performances. However, the
5886 explainability of these models is usually poor, preventing them to be used
5887 confidently in supporting clinical practice. Another limitation is that these
5888 models mostly assume independence among labels, ignoring the complex
5889 correlation among medical codes which can potentially be exploited to improve
5890 the performance. We propose a Hierarchical Label-wise Attention Network (HLAN),
5891 which aimed to interpret the model by quantifying importance (as attention
5892 weights) of words and sentences related to each of the labels. Secondly, we
5893 propose to enhance the major deep learning models with a label embedding (LE)
5894 initialisation approach, which learns a dense, continuous vector representation
5895 and then injects the representation into the final layers and the label-wise
5896 attention layers in the models. We evaluated the methods using three settings
5897 on the MIMIC-III discharge summaries: full codes, top-50 codes, and the UK NHS
5898 COVID-19 shielding codes. Experiments were conducted to compare HLAN and LE
5899 initialisation to the state-of-the-art neural network based methods. HLAN
5900 achieved the best Micro-level AUC and $F_1$ on the top-50 code prediction and
5901 comparable results on the NHS COVID-19 shielding code prediction to other
5902 models. By highlighting the most salient words and sentences for each label,
5903 HLAN showed more meaningful and comprehensive model interpretation compared to
5904 its downgraded baselines and the CNN-based models. LE initialisation
5905 consistently boosted most deep learning models for automated medical coding.
5906 </p>
5907 </description>
5908 <guid isPermaLink="false">oai:arXiv.org:2010.15728</guid>
5909 </item>
5910 <item>
5911 <title>Fundamental limitations to key distillation from Gaussian states with Gaussian operations. (arXiv:2010.15729v1 [quant-ph])</title>
5912 <link>http://fr.arxiv.org/abs/2010.15729</link>
5913 <description><p>Authors: <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Lami_L/0/1/0/all/0/1">Ludovico Lami</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Mista_L/0/1/0/all/0/1">Ladislav Mi&#x161;ta, Jr.</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Adesso_G/0/1/0/all/0/1">Gerardo Adesso</a></p>
5914
5915 <p>We establish fundamental upper bounds on the amount of secret key that can be
5916 extracted from continuous variable quantum Gaussian states by using only local
5917 Gaussian operations, local classical processing, and public communication. For
5918 one-way communication, we prove that the key is bounded by the R\'enyi-$2$
5919 Gaussian entanglement of formation $E_{F,2}^{\mathrm{\scriptscriptstyle G}}$,
5920 with the inequality being saturated for pure Gaussian states. The same is true
5921 if two-way public communication is allowed but Alice and Bob employ protocols
5922 that start with destructive local Gaussian measurements. In the most general
5923 setting of two-way communication and arbitrary interactive protocols, we argue
5924 that $2 E_{F,2}^{\mathrm{\scriptscriptstyle G}}$ is still a bound on the
5925 extractable key, although we conjecture that the factor of $2$ is superfluous.
5926 Finally, for a wide class of Gaussian states that includes all two-mode states,
5927 we prove a recently proposed conjecture on the equality between
5928 $E_{F,2}^{\mathrm{\scriptscriptstyle G}}$ and the Gaussian intrinsic
5929 entanglement, thus endowing both measures with a more solid operational
5930 meaning.
5931 </p>
5932 </description>
5933 <guid isPermaLink="false">oai:arXiv.org:2010.15729</guid>
5934 </item>
5935 <item>
5936 <title>The Agile Coach Role: Coaching for Agile Performance Impact. (arXiv:2010.15738v1 [cs.SE])</title>
5937 <link>http://fr.arxiv.org/abs/2010.15738</link>
5938 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Stray_V/0/1/0/all/0/1">Viktoria Stray</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tkalich_A/0/1/0/all/0/1">Anastasiia Tkalich</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Moe_N/0/1/0/all/0/1">Nils Brede Moe</a></p>
5939
5940 <p>It is increasingly common to introduce agile coaches to help gain speed and
5941 advantage in agile companies. Following the success of Spotify, the role of the
5942 agile coach has branched out in terms of tasks and responsibilities, but little
5943 research has been conducted to examine how this role is practiced. This paper
5944 examines the role of the agile coach through 19 semistructured interviews with
5945 agile coaches from ten different companies. We describe the role in terms of
5946 the tasks the coach has in agile projects, valuable traits, skills, tools, and
5947 the enablers of agile coaching. Our findings indicate that agile coaches
5948 perform at the team and organizational levels. They affect effort, strategies,
5949 knowledge, and skills of the agile teams. The most essential traits of an agile
5950 coach are being emphatic, people-oriented, able to listen, diplomatic, and
5951 persistent. We suggest empirically based advice for agile coaching, for example
5952 companies giving their agile coaches the authority to implement the required
5953 organizational changes within and outside the teams.
5954 </p>
5955 </description>
5956 <guid isPermaLink="false">oai:arXiv.org:2010.15738</guid>
5957 </item>
5958 <item>
5959 <title>Recurrent Neural Networks for video object detection. (arXiv:2010.15740v1 [cs.CV])</title>
5960 <link>http://fr.arxiv.org/abs/2010.15740</link>
5961 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qasim_A/0/1/0/all/0/1">Ahmad B Qasim</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pettirsch_A/0/1/0/all/0/1">Arnd Pettirsch</a></p>
5962
5963 <p>There is lots of scientific work about object detection in images. For many
5964 applications like for example autonomous driving the actual data on which
5965 classification has to be done are videos. This work compares different methods,
5966 especially those which use Recurrent Neural Networks to detect objects in
5967 videos. We differ between feature-based methods, which feed feature maps of
5968 different frames into the recurrent units, box-level methods, which feed
5969 bounding boxes with class probabilities into the recurrent units and methods
5970 which use flow networks. This study indicates common outcomes of the compared
5971 methods like the benefit of including the temporal context into object
5972 detection and states conclusions and guidelines for video object detection
5973 networks.
5974 </p>
5975 </description>
5976 <guid isPermaLink="false">oai:arXiv.org:2010.15740</guid>
5977 </item>
5978 <item>
5979 <title>Causal variables from reinforcement learning using generalized Bellman equations. (arXiv:2010.15745v1 [cs.LG])</title>
5980 <link>http://fr.arxiv.org/abs/2010.15745</link>
5981 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Herlau_T/0/1/0/all/0/1">Tue Herlau</a></p>
5982
5983 <p>Many open problems in machine learning are intrinsically related to
5984 causality, however, the use of causal analysis in machine learning is still in
5985 its early stage. Within a general reinforcement learning setting, we consider
5986 the problem of building a general reinforcement learning agent which uses
5987 experience to construct a causal graph of the environment, and use this graph
5988 to inform its policy. Our approach has three characteristics: First, we learn a
5989 simple, coarse-grained causal graph, in which the variables reflect states at
5990 many time instances, and the interventions happen at the level of policies,
5991 rather than individual actions. Secondly, we use mediation analysis to obtain
5992 an optimization target. By minimizing this target, we define the causal
5993 variables. Thirdly, our approach relies on estimating conditional expectations
5994 rather the familiar expected return from reinforcement learning, and we
5995 therefore apply a generalization of Bellman's equations. We show the method can
5996 learn a plausible causal graph in a grid-world environment, and the agent
5997 obtains an improvement in performance when using the causally informed policy.
5998 To our knowledge, this is the first attempt to apply causal analysis in a
5999 reinforcement learning setting without strict restrictions on the number of
6000 states. We have observed that mediation analysis provides a promising avenue
6001 for transforming the problem of causal acquisition into one of cost-function
6002 minimization, but importantly one which involves estimating conditional
6003 expectations. This is a new challenge, and we think that causal reinforcement
6004 learning will involve development methods suited for online estimation of such
6005 conditional expectations. Finally, a benefit of our approach is the use of very
6006 simple causal models, which are arguably a more natural model of human causal
6007 understanding.
6008 </p>
6009 </description>
6010 <guid isPermaLink="false">oai:arXiv.org:2010.15745</guid>
6011 </item>
6012 <item>
6013 <title>Gaussian Process Bandit Optimization of theThermodynamic Variational Objective. (arXiv:2010.15750v1 [cs.LG])</title>
6014 <link>http://fr.arxiv.org/abs/2010.15750</link>
6015 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_V/0/1/0/all/0/1">Vu Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Masrani_V/0/1/0/all/0/1">Vaden Masrani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Brekelmans_R/0/1/0/all/0/1">Rob Brekelmans</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Osborne_M/0/1/0/all/0/1">Michael A. Osborne</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wood_F/0/1/0/all/0/1">Frank Wood</a></p>
6016
6017 <p>Achieving the full promise of the Thermodynamic Variational Objective (TVO),a
6018 recently proposed variational lower bound on the log evidence involving a
6019 one-dimensional Riemann integral approximation, requires choosing a "schedule"
6020 ofsorted discretization points. This paper introduces a bespoke Gaussian
6021 processbandit optimization method for automatically choosing these points. Our
6022 approach not only automates their one-time selection, but also dynamically
6023 adaptstheir positions over the course of optimization, leading to improved
6024 model learning and inference. We provide theoretical guarantees that our bandit
6025 optimizationconverges to the regret-minimizing choice of integration points.
6026 Empirical validation of our algorithm is provided in terms of improved learning
6027 and inference inVariational Autoencoders and Sigmoid Belief Networks.
6028 </p>
6029 </description>
6030 <guid isPermaLink="false">oai:arXiv.org:2010.15750</guid>
6031 </item>
6032 <item>
6033 <title>A more Pragmatic Implementation of the Lock-free, Ordered, Linked List. (arXiv:2010.15755v1 [cs.DS])</title>
6034 <link>http://fr.arxiv.org/abs/2010.15755</link>
6035 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Traff_J/0/1/0/all/0/1">Jesper Larsson Tr&#xe4;ff</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Poter_M/0/1/0/all/0/1">Manuel P&#xf6;ter</a></p>
6036
6037 <p>The lock-free, ordered, linked list is an important, standard example of a
6038 concurrent data structure. An obvious, practical drawback of textbook
6039 implementations is that failed compare-and-swap (CAS) operations lead to
6040 retraversal of the entire list (retries), which is particularly harmful for
6041 such a linear-time data structure. We alleviate this drawback by first
6042 observing that failed CAS operations under some conditions do not require a
6043 full retry, and second by maintaining approximate backwards pointers that are
6044 used to find a closer starting position in the list for operation retry.
6045 Experiments with both a worst-case deterministic benchmark, and a standard,
6046 randomized, mixed-operation throughput benchmark on three shared-memory systems
6047 (Intel Xeon, AMD EPYC, SPARC-T5) show practical improvements ranging from
6048 significant, to dramatic, several orders of magnitude.
6049 </p>
6050 </description>
6051 <guid isPermaLink="false">oai:arXiv.org:2010.15755</guid>
6052 </item>
6053 <item>
6054 <title>Identifying Transition States of Chemical Kinetic Systems using Network Embedding Techniques. (arXiv:2010.15760v1 [math.NA])</title>
6055 <link>http://fr.arxiv.org/abs/2010.15760</link>
6056 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Mercurio_P/0/1/0/all/0/1">Paula Mercurio</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Liu_D/0/1/0/all/0/1">Di Liu</a></p>
6057
6058 <p>Using random walk sampling methods for feature learning on networks, we
6059 develop a method for generating low-dimensional node embeddings for directed
6060 graphs and identifying transition states of stochastic chemical reacting
6061 systems. We modified objective functions adopted in existing random walk based
6062 network embedding methods to handle directed graphs and neighbors of different
6063 degrees. Through optimization via gradient ascent, we embed the weighted graph
6064 vertices into a low-dimensional vector space Rd while preserving the
6065 neighborhood of each node. We then demonstrate the effectiveness of the method
6066 on dimension reduction through several examples regarding identification of
6067 transition states of chemical reactions, especially for entropic systems.
6068 </p>
6069 </description>
6070 <guid isPermaLink="false">oai:arXiv.org:2010.15760</guid>
6071 </item>
6072 <item>
6073 <title>A Helmholtz equation solver using unsupervised learning: Application to transcranial ultrasound. (arXiv:2010.15761v1 [physics.comp-ph])</title>
6074 <link>http://fr.arxiv.org/abs/2010.15761</link>
6075 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Stanziola_A/0/1/0/all/0/1">Antonio Stanziola</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Arridge_S/0/1/0/all/0/1">Simon R. Arridge</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Cox_B/0/1/0/all/0/1">Ben T. Cox</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Treeby_B/0/1/0/all/0/1">Bradley E. Treeby</a></p>
6076
6077 <p>Transcranial ultrasound therapy is increasingly used for the non-invasive
6078 treatment of brain disorders. However, conventional numerical wave solvers are
6079 currently too computationally expensive to be used online during treatments to
6080 predict the acoustic field passing through the skull (e.g., to account for
6081 subject-specific dose and targeting variations). As a step towards real-time
6082 predictions, in the current work, a fast iterative solver for the heterogeneous
6083 Helmholtz equation in 2D is developed using a fully-learned optimizer. The
6084 lightweight network architecture is based on a modified UNet that includes a
6085 learned hidden state. The network is trained using a physics-based loss
6086 function and a set of idealized sound speed distributions with fully
6087 unsupervised training (no knowledge of the true solution is required). The
6088 learned optimizer shows excellent performance on the test set, and is capable
6089 of generalization well outside the training examples, including to much larger
6090 computational domains, and more complex source and sound speed distributions,
6091 for example, those derived from x-ray computed tomography images of the skull.
6092 </p>
6093 </description>
6094 <guid isPermaLink="false">oai:arXiv.org:2010.15761</guid>
6095 </item>
6096 <item>
6097 <title>Domain adaptation under structural causal models. (arXiv:2010.15764v1 [stat.ML])</title>
6098 <link>http://fr.arxiv.org/abs/2010.15764</link>
6099 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Chen_Y/0/1/0/all/0/1">Yuansi Chen</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Buhlmann_P/0/1/0/all/0/1">Peter B&#xfc;hlmann</a></p>
6100
6101 <p>Domain adaptation (DA) arises as an important problem in statistical machine
6102 learning when the source data used to train a model is different from the
6103 target data used to test the model. Recent advances in DA have mainly been
6104 application-driven and have largely relied on the idea of a common subspace for
6105 source and target data. To understand the empirical successes and failures of
6106 DA methods, we propose a theoretical framework via structural causal models
6107 that enables analysis and comparison of the prediction performance of DA
6108 methods. This framework also allows us to itemize the assumptions needed for
6109 the DA methods to have a low target error. Additionally, with insights from our
6110 theory, we propose a new DA method called CIRM that outperforms existing DA
6111 methods when both the covariates and label distributions are perturbed in the
6112 target data. We complement the theoretical analysis with extensive simulations
6113 to show the necessity of the devised assumptions. Reproducible synthetic and
6114 real data experiments are also provided to illustrate the strengths and
6115 weaknesses of DA methods when parts of the assumptions of our theory are
6116 violated.
6117 </p>
6118 </description>
6119 <guid isPermaLink="false">oai:arXiv.org:2010.15764</guid>
6120 </item>
6121 <item>
6122 <title>A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems. (arXiv:2010.15768v1 [math.OC])</title>
6123 <link>http://fr.arxiv.org/abs/2010.15768</link>
6124 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Zhang_J/0/1/0/all/0/1">Jiawei Zhang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Xiao_P/0/1/0/all/0/1">Peijun Xiao</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Sun_R/0/1/0/all/0/1">Ruoyu Sun</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Luo_Z/0/1/0/all/0/1">Zhi-Quan Luo</a></p>
6125
6126 <p>Nonconvex-concave min-max problem arises in many machine learning
6127 applications including minimizing a pointwise maximum of a set of nonconvex
6128 functions and robust adversarial training of neural networks. A popular
6129 approach to solve this problem is the gradient descent-ascent (GDA) algorithm
6130 which unfortunately can exhibit oscillation in case of nonconvexity. In this
6131 paper, we introduce a "smoothing" scheme which can be combined with GDA to
6132 stabilize the oscillation and ensure convergence to a stationary solution. We
6133 prove that the stabilized GDA algorithm can achieve an $O(1/\epsilon^2)$
6134 iteration complexity for minimizing the pointwise maximum of a finite
6135 collection of nonconvex functions. Moreover, the smoothed GDA algorithm
6136 achieves an $O(1/\epsilon^4)$ iteration complexity for general
6137 nonconvex-concave problems. Extensions of this stabilized GDA algorithm to
6138 multi-block cases are presented. To the best of our knowledge, this is the
6139 first algorithm to achieve $O(1/\epsilon^2)$ for a class of nonconvex-concave
6140 problem. We illustrate the practical efficiency of the stabilized GDA algorithm
6141 on robust training.
6142 </p>
6143 </description>
6144 <guid isPermaLink="false">oai:arXiv.org:2010.15768</guid>
6145 </item>
6146 <item>
6147 <title>Recursive Random Contraction Revisited. (arXiv:2010.15770v1 [cs.DS])</title>
6148 <link>http://fr.arxiv.org/abs/2010.15770</link>
6149 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Karger_D/0/1/0/all/0/1">David R. Karger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Williamson_D/0/1/0/all/0/1">David P. Williamson</a></p>
6150
6151 <p>In this note, we revisit the recursive random contraction algorithm of Karger
6152 and Stein for finding a minimum cut in a graph. Our revisit is occasioned by a
6153 paper of Fox, Panigrahi, and Zhang which gives an extension of the Karger-Stein
6154 algorithm to minimum cuts and minimum $k$-cuts in hypergraphs. When specialized
6155 to the case of graphs, the algorithm is somewhat different than the original
6156 Karger-Stein algorithm. We show that the analysis becomes particularly clean in
6157 this case: we can prove that the probability that a fixed minimum cut in an $n$
6158 node graph is returned by the algorithm is bounded below by $1/(2H_n-2)$, where
6159 $H_n$ is the $n$th harmonic number. We also consider other similar variants of
6160 the algorithm, and show that no such algorithm can achieve an asymptotically
6161 better probability of finding a fixed minimum cut.
6162 </p>
6163 </description>
6164 <guid isPermaLink="false">oai:arXiv.org:2010.15770</guid>
6165 </item>
6166 <item>
6167 <title>GANs & Reels: Creating Irish Music using a Generative Adversarial Network. (arXiv:2010.15772v1 [cs.SD])</title>
6168 <link>http://fr.arxiv.org/abs/2010.15772</link>
6169 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kolokolova_A/0/1/0/all/0/1">Antonina Kolokolova</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Billard_M/0/1/0/all/0/1">Mitchell Billard</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bishop_R/0/1/0/all/0/1">Robert Bishop</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Elsisy_M/0/1/0/all/0/1">Moustafa Elsisy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Northcott_Z/0/1/0/all/0/1">Zachary Northcott</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Graves_L/0/1/0/all/0/1">Laura Graves</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nagisetty_V/0/1/0/all/0/1">Vineel Nagisetty</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Patey_H/0/1/0/all/0/1">Heather Patey</a></p>
6170
6171 <p>In this paper we present a method for algorithmic melody generation using a
6172 generative adversarial network without recurrent components. Music generation
6173 has been successfully done using recurrent neural networks, where the model
6174 learns sequence information that can help create authentic sounding melodies.
6175 Here, we use DC-GAN architecture with dilated convolutions and towers to
6176 capture sequential information as spatial image information, and learn
6177 long-range dependencies in fixed-length melody forms such as Irish traditional
6178 reel.
6179 </p>
6180 </description>
6181 <guid isPermaLink="false">oai:arXiv.org:2010.15772</guid>
6182 </item>
6183 <item>
6184 <title>WaveTransform: Crafting Adversarial Examples via Input Decomposition. (arXiv:2010.15773v1 [cs.CV])</title>
6185 <link>http://fr.arxiv.org/abs/2010.15773</link>
6186 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Anshumaan_D/0/1/0/all/0/1">Divyam Anshumaan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Agarwal_A/0/1/0/all/0/1">Akshay Agarwal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vatsa_M/0/1/0/all/0/1">Mayank Vatsa</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Singh_R/0/1/0/all/0/1">Richa Singh</a></p>
6187
6188 <p>Frequency spectrum has played a significant role in learning unique and
6189 discriminating features for object recognition. Both low and high frequency
6190 information present in images have been extracted and learnt by a host of
6191 representation learning techniques, including deep learning. Inspired by this
6192 observation, we introduce a novel class of adversarial attacks, namely
6193 `WaveTransform', that creates adversarial noise corresponding to low-frequency
6194 and high-frequency subbands, separately (or in combination). The frequency
6195 subbands are analyzed using wavelet decomposition; the subbands are corrupted
6196 and then used to construct an adversarial example. Experiments are performed
6197 using multiple databases and CNN models to establish the effectiveness of the
6198 proposed WaveTransform attack and analyze the importance of a particular
6199 frequency component. The robustness of the proposed attack is also evaluated
6200 through its transferability and resiliency against a recent adversarial defense
6201 algorithm. Experiments show that the proposed attack is effective against the
6202 defense algorithm and is also transferable across CNNs.
6203 </p>
6204 </description>
6205 <guid isPermaLink="false">oai:arXiv.org:2010.15773</guid>
6206 </item>
6207 <item>
6208 <title>Understanding the Failure Modes of Out-of-Distribution Generalization. (arXiv:2010.15775v1 [cs.LG])</title>
6209 <link>http://fr.arxiv.org/abs/2010.15775</link>
6210 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Nagarajan_V/0/1/0/all/0/1">Vaishnavh Nagarajan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Andreassen_A/0/1/0/all/0/1">Anders Andreassen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Neyshabur_B/0/1/0/all/0/1">Behnam Neyshabur</a></p>
6211
6212 <p>Empirical studies suggest that machine learning models often rely on
6213 features, such as the background, that may be spuriously correlated with the
6214 label only during training time, resulting in poor accuracy during test-time.
6215 In this work, we identify the fundamental factors that give rise to this
6216 behavior, by explaining why models fail this way {\em even} in easy-to-learn
6217 tasks where one would expect these models to succeed. In particular, through a
6218 theoretical study of gradient-descent-trained linear classifiers on some
6219 easy-to-learn tasks, we uncover two complementary failure modes. These modes
6220 arise from how spurious correlations induce two kinds of skews in the data: one
6221 geometric in nature, and another, statistical in nature. Finally, we construct
6222 natural modifications of image classification datasets to understand when these
6223 failure modes can arise in practice. We also design experiments to isolate the
6224 two failure modes when training modern neural networks on these datasets.
6225 </p>
6226 </description>
6227 <guid isPermaLink="false">oai:arXiv.org:2010.15775</guid>
6228 </item>
6229 <item>
6230 <title>Quantum advantage for differential equation analysis. (arXiv:2010.15776v1 [quant-ph])</title>
6231 <link>http://fr.arxiv.org/abs/2010.15776</link>
6232 <description><p>Authors: <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Kiani_B/0/1/0/all/0/1">Bobak T. Kiani</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Palma_G/0/1/0/all/0/1">Giacomo De Palma</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Englund_D/0/1/0/all/0/1">Dirk Englund</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Kaminsky_W/0/1/0/all/0/1">William Kaminsky</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Marvian_M/0/1/0/all/0/1">Milad Marvian</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Lloyd_S/0/1/0/all/0/1">Seth Lloyd</a></p>
6233
6234 <p>Quantum algorithms for both differential equation solving and for machine
6235 learning potentially offer an exponential speedup over all known classical
6236 algorithms. However, there also exist obstacles to obtaining this potential
6237 speedup in useful problem instances. The essential obstacle for quantum
6238 differential equation solving is that outputting useful information may require
6239 difficult post-processing, and the essential obstacle for quantum machine
6240 learning is that inputting the training set is a difficult task just by itself.
6241 In this paper, we demonstrate, when combined, these difficulties solve one
6242 another. We show how the output of quantum differential equation solving can
6243 serve as the input for quantum machine learning, allowing dynamical analysis in
6244 terms of principal components, power spectra, and wavelet decompositions. To
6245 illustrate this, we consider continuous time Markov processes on
6246 epidemiological and social networks. These quantum algorithms provide an
6247 exponential advantage over existing classical Monte Carlo methods.
6248 </p>
6249 </description>
6250 <guid isPermaLink="false">oai:arXiv.org:2010.15776</guid>
6251 </item>
6252 <item>
6253 <title>Contextual BERT: Conditioning the Language Model Using a Global State. (arXiv:2010.15778v1 [cs.CL])</title>
6254 <link>http://fr.arxiv.org/abs/2010.15778</link>
6255 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Denk_T/0/1/0/all/0/1">Timo I. Denk</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramallo_A/0/1/0/all/0/1">Ana Peleteiro Ramallo</a></p>
6256
6257 <p>BERT is a popular language model whose main pre-training task is to fill in
6258 the blank, i.e., predicting a word that was masked out of a sentence, based on
6259 the remaining words. In some applications, however, having an additional
6260 context can help the model make the right prediction, e.g., by taking the
6261 domain or the time of writing into account. This motivates us to advance the
6262 BERT architecture by adding a global state for conditioning on a fixed-sized
6263 context. We present our two novel approaches and apply them to an industry
6264 use-case, where we complete fashion outfits with missing articles, conditioned
6265 on a specific customer. An experimental comparison to other methods from the
6266 literature shows that our methods improve personalization significantly.
6267 </p>
6268 </description>
6269 <guid isPermaLink="false">oai:arXiv.org:2010.15778</guid>
6270 </item>
6271 <item>
6272 <title>Stable and efficient Petrov-Galerkin methods for a kinetic Fokker-Planck equation. (arXiv:2010.15784v1 [math.NA])</title>
6273 <link>http://fr.arxiv.org/abs/2010.15784</link>
6274 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Brunken_J/0/1/0/all/0/1">Julia Brunken</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Smetana_K/0/1/0/all/0/1">Kathrin Smetana</a></p>
6275
6276 <p>We propose a stable Petrov-Galerkin discretization of a kinetic Fokker-Planck
6277 equation constructed in such a way that uniform inf-sup stability can be
6278 inferred directly from the variational formulation. Inspired by well-posedness
6279 results for parabolic equations, we derive a lower bound for the dual inf-sup
6280 constant of the Fokker-Planck bilinear form by means of stable pairs of trial
6281 and test functions. The trial function of such a pair is constructed by
6282 applying the kinetic transport operator and the inverse velocity
6283 Laplace-Beltrami operator to a given test function. For the Petrov-Galerkin
6284 projection we choose an arbitrary discrete test space and then define the
6285 discrete trial space using the same application of transport and inverse
6286 Laplace-Beltrami operator. As a result, the spaces replicate the stable pairs
6287 of the continuous level and we obtain a well-posed numerical method with a
6288 discrete inf-sup constant identical to the inf-sup constant of the continuous
6289 problem independently of the mesh size. We show how the specific basis
6290 functions can be efficiently computed by low-dimensional elliptic problems, and
6291 confirm the practicability and performance of the method for a numerical
6292 example.
6293 </p>
6294 </description>
6295 <guid isPermaLink="false">oai:arXiv.org:2010.15784</guid>
6296 </item>
6297 <item>
6298 <title>Quickest detection of false data injection attack in remote state estimation. (arXiv:2010.15785v1 [eess.SY])</title>
6299 <link>http://fr.arxiv.org/abs/2010.15785</link>
6300 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Gupta_A/0/1/0/all/0/1">Akanshu Gupta</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sikdar_A/0/1/0/all/0/1">Abhinava Sikdar</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chattopadhyay_A/0/1/0/all/0/1">Arpan Chattopadhyay</a></p>
6301
6302 <p>In this paper, quickest detection of false data injection attack on remote
6303 state estimation is considered. A set of $N$ sensors make noisy linear
6304 observations of a discrete-time linear process with Gaussian noise, and report
6305 the observations to a remote estimator. The challenge is the presence of a few
6306 potentially malicious sensors which can start strategically manipulating their
6307 observations at a random time in order to skew the estimates. The quickest
6308 attack detection problem for a known linear attack scheme is posed as a
6309 constrained Markov decision process in order to minimise the expected detection
6310 delay subject to a false alarm constraint, with the state involving the
6311 probability belief at the estimator that the system is under attack. State
6312 transition probabilities are derived in terms of system parameters, and the
6313 structure of the optimal policy is derived analytically. It turns out that the
6314 optimal policy amounts to checking whether the probability belief exceeds a
6315 threshold. Numerical results demonstrate significant performance gain under the
6316 proposed algorithm against competing algorithms.
6317 </p>
6318 </description>
6319 <guid isPermaLink="false">oai:arXiv.org:2010.15785</guid>
6320 </item>
6321 <item>
6322 <title>Light-Weight DDoS Mitigation at Network Edge with Limited Resources. (arXiv:2010.15786v1 [cs.NI])</title>
6323 <link>http://fr.arxiv.org/abs/2010.15786</link>
6324 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yaegashi_R/0/1/0/all/0/1">Ryo Yaegashi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hisano_D/0/1/0/all/0/1">Daisuke Hisano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nakayama_Y/0/1/0/all/0/1">Yu Nakayama</a></p>
6325
6326 <p>The Internet of Things (IoT) has been growing rapidly in recent years. With
6327 the appearance of 5G, it is expected to become even more indispensable to
6328 people's lives. In accordance with the increase of Distributed
6329 Denial-of-Service (DDoS) attacks from IoT devices, DDoS defense has become a
6330 hot research topic. DDoS detection mechanisms executed on routers and SDN
6331 environments have been intensely studied. However, these methods have the
6332 disadvantage of requiring the cost and performance of the devices. In addition,
6333 there is no existing DDoS mitigation algorithm on the network edge that can be
6334 performed with the low-cost and low performance equipments. Therefore, this
6335 paper proposes a light-weight DDoS mitigation scheme at the network edge using
6336 limited resources of inexpensive devices such as home gateways. The goal of the
6337 proposed scheme is to simply detect and mitigate flooding attacks. It utilizes
6338 unused queue resources to detect malicious flows by random shuffling of queue
6339 allocation and discard the packets of the detected flows. The performance of
6340 the proposed scheme was confirmed via theoretical analysis and computer
6341 simulation. The simulation results match the theoretical results and the
6342 proposed algorithm can efficiently detect malicious flows using limited
6343 resources.
6344 </p>
6345 </description>
6346 <guid isPermaLink="false">oai:arXiv.org:2010.15786</guid>
6347 </item>
6348 <item>
6349 <title>A Framework for Learning Predator-prey Agents from Simulation to Real World. (arXiv:2010.15792v1 [cs.RO])</title>
6350 <link>http://fr.arxiv.org/abs/2010.15792</link>
6351 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_J/0/1/0/all/0/1">Jiunhan Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gao_Z/0/1/0/all/0/1">Zhenyu Gao</a></p>
6352
6353 <p>In this paper, we propose an evolutionary predatorprey robot system which can
6354 be generally implemented from simulation to the real world. We design the
6355 closed-loop robot system with camera and infrared sensors as inputs of
6356 controller. Both the predators and prey are co-evolved by NeuroEvolution of
6357 Augmenting Topologies (NEAT) to learn the expected behaviours. We design a
6358 framework that integrate Gym of OpenAI, Robot Operating System (ROS), Gazebo.
6359 In such a framework, users only need to focus on algorithms without being
6360 worried about the detail of manipulating robots in both simulation and the real
6361 world. Combining simulations, real-world evolution, and robustness analysis, it
6362 can be applied to develop the solutions for the predator-prey tasks. For the
6363 convenience of users, the source code and videos of the simulated and real
6364 world are published on Github.
6365 </p>
6366 </description>
6367 <guid isPermaLink="false">oai:arXiv.org:2010.15792</guid>
6368 </item>
6369 <item>
6370 <title>A computational periporomechanics model for localized failure in unsaturated porous media. (arXiv:2010.15793v1 [math.NA])</title>
6371 <link>http://fr.arxiv.org/abs/2010.15793</link>
6372 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Menon_S/0/1/0/all/0/1">Shashank Menon</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Song_X/0/1/0/all/0/1">Xiaoyu Song</a></p>
6373
6374 <p>We implement a computational periporomechanics model for simulating localized
6375 failure in unsaturated porous media. The coupled periporomechanics model is
6376 based on the peridynamic state concept and the effective force state concept.
6377 The coupled governing equations are integral-differential equations without
6378 assuming the continuity of solid displacement and fluid pressures. The fluid
6379 flow and effective force states are determined by nonlocal fluid pressure and
6380 deformation gradients through the recently formulated multiphase constitutive
6381 correspondence principle. The coupled peri-poromechanics is implemented
6382 numerically for high-performance computing by an implicit multiphase meshfree
6383 method utilizing the message passing interface. The numerical implementation is
6384 validated by simulating classical poromechanics problems and comparing the
6385 numerical results with analytical solutions and experimental data. Numerical
6386 examples are presented to demonstrate the robustness of the fully coupled
6387 peri-poromechanics in modeling localized failures in unsaturated porous media.
6388 </p>
6389 </description>
6390 <guid isPermaLink="false">oai:arXiv.org:2010.15793</guid>
6391 </item>
6392 <item>
6393 <title>Eccentricity queries and beyond using Hub Labels. (arXiv:2010.15794v1 [cs.DS])</title>
6394 <link>http://fr.arxiv.org/abs/2010.15794</link>
6395 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ducoffe_G/0/1/0/all/0/1">Guillaume Ducoffe</a></p>
6396
6397 <p>Hub labeling schemes are popular methods for computing distances on road
6398 networks and other large complex networks, often answering to a query within a
6399 few microseconds for graphs with millions of edges. In this work, we study
6400 their algorithmic applications beyond distance queries. We focus on
6401 eccentricity queries and distance-sum queries, for several versions of these
6402 problems on directed weighted graphs, that is in part motivated by their
6403 importance in facility location problems. On the negative side, we show
6404 conditional lower bounds for these above problems on unweighted undirected
6405 sparse graphs, via standard constructions from "Fine-grained" complexity.
6406 However, things take a different turn when the hub labels have a sublogarithmic
6407 size. Indeed, given a hub labeling of maximum label size $\leq k$, after
6408 pre-processing the labels in total $2^{{O}(k)} \cdot |V|^{1+o(1)}$ time, we can
6409 compute both the eccentricity and the distance-sum of any vertex in $2^{{O}(k)}
6410 \cdot |V|^{o(1)}$ time. It can also be applied to the fast global computation
6411 of some topological indices. Finally, as a by-product of our approach, on any
6412 fixed class of unweighted graphs with bounded expansion, we can decide whether
6413 the diameter of an $n$-vertex graph in the class is at most $k$ in $f(k) \cdot
6414 n^{1+o(1)}$ time, for some "explicit" function $f$.
6415 </p>
6416 </description>
6417 <guid isPermaLink="false">oai:arXiv.org:2010.15794</guid>
6418 </item>
6419 <item>
6420 <title>Ray-marching Thurston geometries. (arXiv:2010.15801v1 [math.GT])</title>
6421 <link>http://fr.arxiv.org/abs/2010.15801</link>
6422 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Coulon_R/0/1/0/all/0/1">R&#xe9;mi Coulon</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Matsumoto_E/0/1/0/all/0/1">Elisabetta A. Matsumoto</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Segerman_H/0/1/0/all/0/1">Henry Segerman</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Trettel_S/0/1/0/all/0/1">Steve J. Trettel</a></p>
6423
6424 <p>We describe algorithms that produce accurate real-time interactive in-space
6425 views of the eight Thurston geometries using ray-marching. We give a
6426 theoretical framework for our algorithms, independent of the geometry involved.
6427 In addition to scenes within a geometry $X$, we also consider scenes within
6428 quotient manifolds and orbifolds $X / \Gamma$. We adapt the Phong lighting
6429 model to non-euclidean geometries. The most difficult part of this is the
6430 calculation of light intensity, which relates to the area density of geodesic
6431 spheres. We also give extensive practical details for each geometry.
6432 </p>
6433 </description>
6434 <guid isPermaLink="false">oai:arXiv.org:2010.15801</guid>
6435 </item>
6436 <item>
6437 <title>Isometric embeddings in trees and their use in the diameter problem. (arXiv:2010.15803v1 [cs.DS])</title>
6438 <link>http://fr.arxiv.org/abs/2010.15803</link>
6439 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ducoffe_G/0/1/0/all/0/1">Guillaume Ducoffe</a></p>
6440
6441 <p>We prove that given a discrete space with $n$ points which is either embedded
6442 in a system of $k$ trees, or the Cartesian product of $k$ trees, we can compute
6443 all eccentricities in ${\cal O}(2^{{\cal O}(k\log{k})}(N+n)^{1+o(1)})$ time,
6444 where $N$ is the cumulative total order over all these $k$ trees. This is near
6445 optimal under the Strong Exponential-Time Hypothesis, even in the very special
6446 case of an $n$-vertex graph embedded in a system of $\omega(\log{n})$ spanning
6447 trees. However, given such an embedding in the strong product of $k$ trees,
6448 there is a much faster ${\cal O}(N + kn)$-time algorithm for this problem. All
6449 our positive results can be turned into approximation algorithms for the graphs
6450 and finite spaces with a quasi isometric embedding in trees, if such embedding
6451 is given as input, where the approximation factor (resp., the approximation
6452 constant) depends on the distortion of the embedding (resp., of its stretch).
6453 The existence of embeddings in the Cartesian product of finitely many trees has
6454 been thoroughly investigated for cube-free median graphs. We give the
6455 first-known quasi linear-time algorithm for computing the diameter within this
6456 graph class. It does not require an embedding in a product of trees to be given
6457 as part of the input. On our way, being given an $n$-node tree $T$, we propose
6458 a data structure with ${\cal O}(n\log{n})$ pre-processing time in order to
6459 compute in ${\cal O}(k\log^2{n})$ time the eccentricity of any subset of $k$
6460 nodes. We combine the latter technical contribution, of independent interest,
6461 with a recent distance-labeling scheme that was designed for cube-free median
6462 graphs.
6463 </p>
6464 </description>
6465 <guid isPermaLink="false">oai:arXiv.org:2010.15803</guid>
6466 </item>
6467 <item>
6468 <title>A Local Search Framework for Experimental Design. (arXiv:2010.15805v1 [cs.DS])</title>
6469 <link>http://fr.arxiv.org/abs/2010.15805</link>
6470 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lau_L/0/1/0/all/0/1">Lap Chi Lau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_H/0/1/0/all/0/1">Hong Zhou</a></p>
6471
6472 <p>We present a local search framework to design and analyze both combinatorial
6473 algorithms and rounding algorithms for experimental design problems. This
6474 framework provides a unifying approach to match and improve all known results
6475 in D/A/E-design and to obtain new results in previously unknown settings.
6476 </p>
6477 <p>For combinatorial algorithms, we provide a new analysis of the classical
6478 Fedorov's exchange method. We prove that this simple local search algorithm
6479 works well as long as there exists an almost optimal solution with good
6480 condition number. Moreover, we design a new combinatorial local search
6481 algorithm for E-design using the regret minimization framework.
6482 </p>
6483 <p>For rounding algorithms, we provide a unified randomized exchange algorithm
6484 to match and improve previous results for D/A/E-design. Furthermore, the
6485 algorithm works in the more general setting to approximately satisfy multiple
6486 knapsack constraints, which can be used for weighted experimental design and
6487 for incorporating fairness constraints into experimental design.
6488 </p>
6489 </description>
6490 <guid isPermaLink="false">oai:arXiv.org:2010.15805</guid>
6491 </item>
6492 <item>
6493 <title>The ins and outs of speaker recognition: lessons from VoxSRC 2020. (arXiv:2010.15809v1 [cs.SD])</title>
6494 <link>http://fr.arxiv.org/abs/2010.15809</link>
6495 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kwon_Y/0/1/0/all/0/1">Yoohwan Kwon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heo_H/0/1/0/all/0/1">Hee-Soo Heo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_B/0/1/0/all/0/1">Bong-Jin Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chung_J/0/1/0/all/0/1">Joon Son Chung</a></p>
6496
6497 <p>The VoxCeleb Speaker Recognition Challenge (VoxSRC) at Interspeech 2020
6498 offers a challenging evaluation for speaker recognition systems, which includes
6499 celebrities playing different parts in movies. The goal of this work is robust
6500 speaker recognition of utterances recorded in these challenging environments.
6501 We utilise variants of the popular ResNet architecture for speaker recognition
6502 and perform extensive experiments using a range of loss functions and training
6503 parameters. To this end, we optimise an efficient training framework that
6504 allows powerful models to be trained with limited time and resources. Our
6505 trained models demonstrate improvements over most existing works with lighter
6506 models and a simple pipeline. The paper shares the lessons learned from our
6507 participation in the challenge.
6508 </p>
6509 </description>
6510 <guid isPermaLink="false">oai:arXiv.org:2010.15809</guid>
6511 </item>
6512 <item>
6513 <title>Algorithmic pure states for the negative spherical perceptron. (arXiv:2010.15811v1 [math.PR])</title>
6514 <link>http://fr.arxiv.org/abs/2010.15811</link>
6515 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Alaoui_A/0/1/0/all/0/1">Ahmed El Alaoui</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Sellke_M/0/1/0/all/0/1">Mark Sellke</a></p>
6516
6517 <p>We consider the spherical perceptron with Gaussian disorder. This is the set
6518 $S$ of points $\sigma \in \mathbb{R}^N$ on the sphere of radius $\sqrt{N}$
6519 satisfying $\langle g_a , \sigma \rangle \ge \kappa\sqrt{N}\,$ for all $1 \le a
6520 \le M$, where $(g_a)_{a=1}^M$ are independent standard gaussian vectors and
6521 $\kappa \in \mathbb{R}$ is fixed. Various characteristics of $S$ such as its
6522 surface measure and the largest $M$ for which it is non-empty, were computed
6523 heuristically in statistical physics in the asymptotic regime $N \to \infty$,
6524 $M/N \to \alpha$. The case $\kappa&lt;0$ is of special interest as $S$ is
6525 conjectured to exhibit a hierarchical tree-like geometry known as "full
6526 replica-symmetry breaking" (FRSB) close to the satisfiability threshold
6527 $\alpha_{\text{SAT}}(\kappa)$, and whose characteristics are captured by a
6528 Parisi variational principle akin to the one appearing in the
6529 Sherrington-Kirkpatrick model. In this paper we design an efficient algorithm
6530 which, given oracle access to the solution of the Parisi variational principle,
6531 exploits this conjectured FRSB structure for $\kappa&lt;0$ and outputs a vector
6532 $\hat{\sigma}$ satisfying $\langle g_a , \hat{\sigma}\rangle \ge \kappa
6533 \sqrt{N}$ for all $1\le a \le M$ and lying on a sphere of non-trivial radius
6534 $\sqrt{\bar{q} N}$, where $\bar{q} \in (0,1)$ is the right-end of the support
6535 of the associated Parisi measure. We expect $\hat{\sigma}$ to be approximately
6536 the barycenter of a pure state of the spherical perceptron. Moreover we expect
6537 that $\bar{q} \to 1$ as $\alpha \to \alpha_{\text{SAT}}(\kappa)$, so that
6538 $\big\langle g_a,\hat{\sigma}/|\hat{\sigma}|\big\rangle \geq
6539 (\kappa-o(1))\sqrt{N}$ near criticality.
6540 </p>
6541 </description>
6542 <guid isPermaLink="false">oai:arXiv.org:2010.15811</guid>
6543 </item>
6544 <item>
6545 <title>Around the diameter of AT-free graphs. (arXiv:2010.15814v1 [cs.DS])</title>
6546 <link>http://fr.arxiv.org/abs/2010.15814</link>
6547 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ducoffe_G/0/1/0/all/0/1">Guillaume Ducoffe</a></p>
6548
6549 <p>A graph algorithm is truly subquadratic if it runs in ${\cal O}(m^b)$ time on
6550 connected $m$-edge graphs, for some positive $b &lt; 2$. Roditty and Vassilevska
6551 Williams (STOC'13) proved that under plausible complexity assumptions, there is
6552 no truly subquadratic algorithm for computing the diameter of general graphs.
6553 In this work, we present positive and negative results on the existence of such
6554 algorithms for computing the diameter on some special graph classes.
6555 Specifically, three vertices in a graph form an asteroidal triple (AT) if
6556 between any two of them there exists a path that avoids the closed
6557 neighbourhood of the third one. We call a graph AT-free if it does not contain
6558 an AT. We first prove that for all $m$-edge AT-free graphs, one can compute all
6559 the eccentricities in truly subquadratic ${\cal O}(m^{3/2})$ time. Then, we
6560 extend our study to several subclasses of chordal graphs -- all of them
6561 generalizing interval graphs in various ways --, as an attempt to understand
6562 which of the properties of AT-free graphs, or natural generalizations of the
6563 latter, can help in the design of fast algorithms for the diameter problem on
6564 broader graph classes. For instance, for all chordal graphs with a dominating
6565 shortest path, there is a linear-time algorithm for computing a diametral pair
6566 if the diameter is at least four. However, already for split graphs with a
6567 dominating edge, under plausible complexity assumptions, there is no truly
6568 subquadratic algorithm for deciding whether the diameter is either $2$ or $3$.
6569 </p>
6570 </description>
6571 <guid isPermaLink="false">oai:arXiv.org:2010.15814</guid>
6572 </item>
6573 <item>
6574 <title>Tensor Completion via Tensor Networks with a Tucker Wrapper. (arXiv:2010.15819v1 [stat.ML])</title>
6575 <link>http://fr.arxiv.org/abs/2010.15819</link>
6576 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Cai_Y/0/1/0/all/0/1">Yunfeng Cai</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Li_P/0/1/0/all/0/1">Ping Li</a></p>
6577
6578 <p>In recent years, low-rank tensor completion (LRTC) has received considerable
6579 attention due to its applications in image/video inpainting, hyperspectral data
6580 recovery, etc. With different notions of tensor rank (e.g., CP, Tucker, tensor
6581 train/ring, etc.), various optimization based numerical methods are proposed to
6582 LRTC. However, tensor network based methods have not been proposed yet. In this
6583 paper, we propose to solve LRTC via tensor networks with a Tucker wrapper. Here
6584 by "Tucker wrapper" we mean that the outermost factor matrices of the tensor
6585 network are all orthonormal. We formulate LRTC as a problem of solving a system
6586 of nonlinear equations, rather than a constrained optimization problem. A
6587 two-level alternative least square method is then employed to update the
6588 unknown factors. The computation of the method is dominated by tensor matrix
6589 multiplications and can be efficiently performed. Also, under proper
6590 assumptions, it is shown that with high probability, the method converges to
6591 the exact solution at a linear rate. Numerical simulations show that the
6592 proposed algorithm is comparable with state-of-the-art methods.
6593 </p>
6594 </description>
6595 <guid isPermaLink="false">oai:arXiv.org:2010.15819</guid>
6596 </item>
6597 <item>
6598 <title>Down the bot hole: actionable insights from a 1-year analysis of bots activity on Twitter. (arXiv:2010.15820v1 [cs.SI])</title>
6599 <link>http://fr.arxiv.org/abs/2010.15820</link>
6600 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Luceri_L/0/1/0/all/0/1">Luca Luceri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cardoso_F/0/1/0/all/0/1">Felipe Cardoso</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Giordano_S/0/1/0/all/0/1">Silvia Giordano</a></p>
6601
6602 <p>Nowadays, social media represent persuasive tools that have been
6603 progressively weaponized to affect people's beliefs, spread manipulative
6604 narratives, and sow conflicts along divergent factions. Software-controlled
6605 accounts (i.e., bots) are one of the main actors associated with manipulation
6606 campaigns, especially in the political context. Uncovering the strategies
6607 behind bots' activities is of paramount importance to detect and curb such
6608 campaigns. In this paper, we present a long term (one year) analysis of bots
6609 activity on Twitter in the run-up to the 2018 U.S. Midterm Elections. We
6610 identify different classes of accounts based on their nature (bot vs. human)
6611 and engagement within the online discussion and we observe that hyperactive
6612 bots played a pivotal role in the dissemination of conspiratorial narratives,
6613 while dominating the political debate since the year before the election. Our
6614 analysis, on the horizon of the upcoming U.S. 2020 Presidential Election,
6615 reveals both alarming findings of humans' susceptibility to bots and actionable
6616 insights that can contribute to curbing coordinated campaigns.
6617 </p>
6618 </description>
6619 <guid isPermaLink="false">oai:arXiv.org:2010.15820</guid>
6620 </item>
6621 <item>
6622 <title>Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search. (arXiv:2010.15821v1 [cs.CV])</title>
6623 <link>http://fr.arxiv.org/abs/2010.15821</link>
6624 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_H/0/1/0/all/0/1">Houwen Peng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Du_H/0/1/0/all/0/1">Hao Du</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_H/0/1/0/all/0/1">Hongyuan Yu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qi Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liao_J/0/1/0/all/0/1">Jing Liao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fu_J/0/1/0/all/0/1">Jianlong Fu</a></p>
6625
6626 <p>One-shot weight sharing methods have recently drawn great attention in neural
6627 architecture search due to high efficiency and competitive performance.
6628 However, weight sharing across models has an inherent deficiency, i.e.,
6629 insufficient training of subnetworks in the hypernetwork. To alleviate this
6630 problem, we present a simple yet effective architecture distillation method.
6631 The central idea is that subnetworks can learn collaboratively and teach each
6632 other throughout the training process, aiming to boost the convergence of
6633 individual models. We introduce the concept of prioritized path, which refers
6634 to the architecture candidates exhibiting superior performance during training.
6635 Distilling knowledge from the prioritized paths is able to boost the training
6636 of subnetworks. Since the prioritized paths are changed on the fly depending on
6637 their performance and complexity, the final obtained paths are the cream of the
6638 crop. We directly select the most promising one from the prioritized paths as
6639 the final architecture, without using other complex search methods, such as
6640 reinforcement learning or evolution algorithms. The experiments on ImageNet
6641 verify such path distillation method can improve the convergence ratio and
6642 performance of the hypernetwork, as well as boosting the training of
6643 subnetworks. The discovered architectures achieve superior performance compared
6644 to the recent MobileNetV3 and EfficientNet families under aligned settings.
6645 Moreover, the experiments on object detection and more challenging search space
6646 show the generality and robustness of the proposed method. Code and models are
6647 available at https://github.com/microsoft/cream.git.
6648 </p>
6649 </description>
6650 <guid isPermaLink="false">oai:arXiv.org:2010.15821</guid>
6651 </item>
6652 <item>
6653 <title>Black-Box Optimization of Object Detector Scales. (arXiv:2010.15823v1 [cs.CV])</title>
6654 <link>http://fr.arxiv.org/abs/2010.15823</link>
6655 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Muthuraja_M/0/1/0/all/0/1">Mohandass Muthuraja</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Arriaga_O/0/1/0/all/0/1">Octavio Arriaga</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ploger_P/0/1/0/all/0/1">Paul Pl&#xf6;ger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kirchner_F/0/1/0/all/0/1">Frank Kirchner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Valdenegro_Toro_M/0/1/0/all/0/1">Matias Valdenegro-Toro</a></p>
6656
6657 <p>Object detectors have improved considerably in the last years by using
6658 advanced CNN architectures. However, many detector hyper-parameters are
6659 generally manually tuned, or they are used with values set by the detector
6660 authors. Automatic Hyper-parameter optimization has not been explored in
6661 improving CNN-based object detectors hyper-parameters. In this work, we propose
6662 the use of Black-box optimization methods to tune the prior/default box scales
6663 in Faster R-CNN and SSD, using Bayesian Optimization, SMAC, and CMA-ES. We show
6664 that by tuning the input image size and prior box anchor scale on Faster R-CNN
6665 mAP increases by 2% on PASCAL VOC 2007, and by 3% with SSD. On the COCO dataset
6666 with SSD there are mAP improvement in the medium and large objects, but mAP
6667 decreases by 1% in small objects. We also perform a regression analysis to find
6668 the significant hyper-parameters to tune.
6669 </p>
6670 </description>
6671 <guid isPermaLink="false">oai:arXiv.org:2010.15823</guid>
6672 </item>
6673 <item>
6674 <title>Passport-aware Normalization for Deep Model Protection. (arXiv:2010.15824v1 [cs.CV])</title>
6675 <link>http://fr.arxiv.org/abs/2010.15824</link>
6676 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jie Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_D/0/1/0/all/0/1">Dongdong Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liao_J/0/1/0/all/0/1">Jing Liao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_W/0/1/0/all/0/1">Weiming Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hua_G/0/1/0/all/0/1">Gang Hua</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yu_N/0/1/0/all/0/1">Nenghai Yu</a></p>
6677
6678 <p>Despite tremendous success in many application scenarios, deep learning faces
6679 serious intellectual property (IP) infringement threats. Considering the cost
6680 of designing and training a good model, infringements will significantly
6681 infringe the interests of the original model owner. Recently, many impressive
6682 works have emerged for deep model IP protection. However, they either are
6683 vulnerable to ambiguity attacks, or require changes in the target network
6684 structure by replacing its original normalization layers and hence cause
6685 significant performance drops. To this end, we propose a new passport-aware
6686 normalization formulation, which is generally applicable to most existing
6687 normalization layers and only needs to add another passport-aware branch for IP
6688 protection. This new branch is jointly trained with the target model but
6689 discarded in the inference stage. Therefore it causes no structure change in
6690 the target model. Only when the model IP is suspected to be stolen by someone,
6691 the private passport-aware branch is added back for ownership verification.
6692 Through extensive experiments, we verify its effectiveness in both image and 3D
6693 point recognition models. It is demonstrated to be robust not only to common
6694 attack techniques like fine-tuning and model compression, but also to ambiguity
6695 attacks. By further combining it with trigger-set based methods, both black-box
6696 and white-box verification can be achieved for enhanced security of deep
6697 learning models deployed in real systems. Code can be found at
6698 https://github.com/ZJZAC/Passport-aware-Normalization.
6699 </p>
6700 </description>
6701 <guid isPermaLink="false">oai:arXiv.org:2010.15824</guid>
6702 </item>
6703 <item>
6704 <title>RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder. (arXiv:2010.15831v1 [cs.CV])</title>
6705 <link>http://fr.arxiv.org/abs/2010.15831</link>
6706 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chi_C/0/1/0/all/0/1">Cheng Chi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_F/0/1/0/all/0/1">Fangyun Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_H/0/1/0/all/0/1">Han Hu</a></p>
6707
6708 <p>Existing object detection frameworks are usually built on a single format of
6709 object/part representation, i.e., anchor/proposal rectangle boxes in RetinaNet
6710 and Faster R-CNN, center points in FCOS and RepPoints, and corner points in
6711 CornerNet. While these different representations usually drive the frameworks
6712 to perform well in different aspects, e.g., better classification or finer
6713 localization, it is in general difficult to combine these representations in a
6714 single framework to make good use of each strength, due to the heterogeneous or
6715 non-grid feature extraction by different representations. This paper presents
6716 an attention-based decoder module similar as that in
6717 Transformer~\cite{vaswani2017attention} to bridge other representations into a
6718 typical object detector built on a single representation format, in an
6719 end-to-end fashion. The other representations act as a set of \emph{key}
6720 instances to strengthen the main \emph{query} representation features in the
6721 vanilla detectors. Novel techniques are proposed towards efficient computation
6722 of the decoder module, including a \emph{key sampling} approach and a
6723 \emph{shared location embedding} approach. The proposed module is named
6724 \emph{bridging visual representations} (BVR). It can perform in-place and we
6725 demonstrate its broad effectiveness in bridging other representations into
6726 prevalent object detection frameworks, including RetinaNet, Faster R-CNN, FCOS
6727 and ATSS, where about $1.5\sim3.0$ AP improvements are achieved. In particular,
6728 we improve a state-of-the-art framework with a strong backbone by about $2.0$
6729 AP, reaching $52.7$ AP on COCO test-dev. The resulting network is named
6730 RelationNet++. The code will be available at
6731 https://github.com/microsoft/RelationNet2.
6732 </p>
6733 </description>
6734 <guid isPermaLink="false">oai:arXiv.org:2010.15831</guid>
6735 </item>
6736 <item>
6737 <title>Proceedings 9th International Workshop on Theorem Proving Components for Educational Software. (arXiv:2010.15832v1 [cs.AI])</title>
6738 <link>http://fr.arxiv.org/abs/2010.15832</link>
6739 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Quaresma_P/0/1/0/all/0/1">Pedro Quaresma</a> (University of Coimbra, Portugal), <a href="http://fr.arxiv.org/find/cs/1/au:+Neuper_W/0/1/0/all/0/1">Walther Neuper</a> (JKU Johannes Kepler University, Linz, Austria), <a href="http://fr.arxiv.org/find/cs/1/au:+Marcos_J/0/1/0/all/0/1">Jo&#xe3;o Marcos</a> (UFRN, Brazil)</p>
6740
6741 <p>The 9th International Workshop on Theorem-Proving Components for Educational
6742 Software (ThEdu'20) was scheduled to happen on June 29 as a satellite of the
6743 IJCAR-FSCD 2020 joint meeting, in Paris. The COVID-19 pandemic came by
6744 surprise, though, and the main conference was virtualised. Fearing that an
6745 online meeting would not allow our community to fully reproduce the usual
6746 face-to-face networking opportunities of the ThEdu initiative, the Steering
6747 Committee of ThEdu decided to cancel our workshop. Given that many of us had
6748 already planned and worked for that moment, we decided that ThEdu'20 could
6749 still live in the form of an EPTCS volume. The EPTCS concurred with us,
6750 recognising this very singular situation, and accepted our proposal of
6751 organising a special issue with papers submitted to ThEdu'20. An open call for
6752 papers was then issued, and attracted five submissions, all of which have been
6753 accepted by our reviewers, who produced three careful reports on each of the
6754 contributions. The resulting revised papers are collected in the present
6755 volume. We, the volume editors, hope that this collection of papers will help
6756 further promoting the development of theorem-proving-based software, and that
6757 it will collaborate to improve the mutual understanding between computer
6758 mathematicians and stakeholders in education. With some luck, we would actually
6759 expect that the very special circumstances set up by the worst sanitary crisis
6760 in a century will happen to reinforce the need for the application of certified
6761 components and of verification methods for the production of educational
6762 software that would be available even when the traditional on-site learning
6763 experiences turn out not to be recommendable.
6764 </p>
6765 </description>
6766 <guid isPermaLink="false">oai:arXiv.org:2010.15832</guid>
6767 </item>
6768 <item>
6769 <title>Property Checking Without Invariant Generation. (arXiv:1602.05829v3 [cs.LO] UPDATED)</title>
6770 <link>http://fr.arxiv.org/abs/1602.05829</link>
6771 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goldberg_E/0/1/0/all/0/1">Eugene Goldberg</a></p>
6772
6773 <p>We introduce a procedure for proving safety properties. This procedure is
6774 based on a technique called Partial Quantifier Elimination (PQE). In contrast
6775 to complete quantifier elimination, in PQE, only a part of the formula is taken
6776 out of the scope of quantifiers. So, PQE can be dramatically more efficient
6777 than complete quantifier elimination. The appeal of our procedure is twofold.
6778 First, it can prove a property without generating an inductive invariant.
6779 Second, it employs depth-first search and so can be used to find deep bugs.
6780 </p>
6781 </description>
6782 <guid isPermaLink="false">oai:arXiv.org:1602.05829</guid>
6783 </item>
6784 <item>
6785 <title>Minimax Rate-Optimal Estimation of Divergences between Discrete Distributions. (arXiv:1605.09124v4 [cs.IT] UPDATED)</title>
6786 <link>http://fr.arxiv.org/abs/1605.09124</link>
6787 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Han_Y/0/1/0/all/0/1">Yanjun Han</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jiao_J/0/1/0/all/0/1">Jiantao Jiao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weissman_T/0/1/0/all/0/1">Tsachy Weissman</a></p>
6788
6789 <p>We study the minimax estimation of $\alpha$-divergences between discrete
6790 distributions for integer $\alpha\ge 1$, which include the Kullback--Leibler
6791 divergence and the $\chi^2$-divergences as special examples. Dropping the usual
6792 theoretical tricks to acquire independence, we construct the first minimax
6793 rate-optimal estimator which does not require any Poissonization, sample
6794 splitting, or explicit construction of approximating polynomials. The estimator
6795 uses a hybrid approach which solves a problem-independent linear program based
6796 on moment matching in the non-smooth regime, and applies a problem-dependent
6797 bias-corrected plug-in estimator in the smooth regime, with a soft decision
6798 boundary between these regimes.
6799 </p>
6800 </description>
6801 <guid isPermaLink="false">oai:arXiv.org:1605.09124</guid>
6802 </item>
6803 <item>
6804 <title>Sequence Graph Transform (SGT): A Feature Embedding Function for Sequence Data Mining. (arXiv:1608.03533v13 [stat.ML] UPDATED)</title>
6805 <link>http://fr.arxiv.org/abs/1608.03533</link>
6806 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Ranjan_C/0/1/0/all/0/1">Chitta Ranjan</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Ebrahimi_S/0/1/0/all/0/1">Samaneh Ebrahimi</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Paynabar_K/0/1/0/all/0/1">Kamran Paynabar</a></p>
6807
6808 <p>Sequence feature embedding is a challenging task due to un-structuredness of
6809 sequences -- arbitrary strings of arbitrary length. Existing methods are
6810 efficient in extracting short-term dependencies but typically suffer from
6811 computation issues for the long-term. Sequence Graph Transform (SGT), a feature
6812 embedding function, that can extract any amount of short- to long- term
6813 dependencies without increasing the computation -- proved theoretically -- is
6814 proposed. SGT features yield significantly superior results in sequence
6815 clustering and classification with higher accuracy and lower computation as
6816 compared to the existing methods, including the state-of-the-art
6817 sequence/string Kernels and LSTM.
6818 </p>
6819 </description>
6820 <guid isPermaLink="false">oai:arXiv.org:1608.03533</guid>
6821 </item>
6822 <item>
6823 <title>Time-Space Trade-Offs for Computing Euclidean Minimum Spanning Trees. (arXiv:1712.06431v3 [cs.CG] UPDATED)</title>
6824 <link>http://fr.arxiv.org/abs/1712.06431</link>
6825 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Banyassady_B/0/1/0/all/0/1">Bahareh Banyassady</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Barba_L/0/1/0/all/0/1">Luis Barba</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mulzer_W/0/1/0/all/0/1">Wolfgang Mulzer</a></p>
6826
6827 <p>We present time-space trade-offs for computing the Euclidean minimum spanning
6828 tree of a set $S$ of $n$ point-sites in the plane. More precisely, we assume
6829 that $S$ resides in a random-access memory that can only be read. The edges of
6830 the Euclidean minimum spanning tree $\text{EMST}(S)$ have to be reported
6831 sequentially, and they cannot be accessed or modified afterwards. There is a
6832 parameter $s \in \{1, \dots, n\}$ so that the algorithm may use $O(s)$ cells of
6833 read-write memory (called the workspace) for its computations. Our goal is to
6834 find an algorithm that has the best possible running time for any given $s$
6835 between $1$ and $n$.
6836 </p>
6837 <p>We show how to compute $\text{EMST}(S)$ in $O\big((n^3/s^2)\log s \big)$ time
6838 with $O(s)$ cells of workspace, giving a smooth trade-off between the two best
6839 known bounds $O(n^3)$ for $s = 1$ and $O(n \log n)$ for $s = n$. For this, we
6840 run Kruskal's algorithm on the relative neighborhood graph (RNG) of $S$. It is
6841 a classic fact that the minimum spanning tree of $\text{RNG}(S)$ is exactly
6842 $\text{EMST}(S)$. To implement Kruskal's algorithm with $O(s)$ cells of
6843 workspace, we define $s$-nets, a compact representation of planar graphs. This
6844 allows us to efficiently maintain and update the components of the current
6845 minimum spanning forest as the edges are being inserted.
6846 </p>
6847 </description>
6848 <guid isPermaLink="false">oai:arXiv.org:1712.06431</guid>
6849 </item>
6850 <item>
6851 <title>Type-two polynomial-time and restricted lookahead. (arXiv:1801.07485v2 [cs.CC] UPDATED)</title>
6852 <link>http://fr.arxiv.org/abs/1801.07485</link>
6853 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kapron_B/0/1/0/all/0/1">Bruce M. Kapron</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Steinberg_F/0/1/0/all/0/1">Florian Steinberg</a></p>
6854
6855 <p>This paper provides an alternate characterization of type-two polynomial-time
6856 computability, with the goal of making second-order complexity theory more
6857 approachable. We rely on the usual oracle machines to model programs with
6858 subroutine calls. In contrast to previous results, the use of higher-order
6859 objects as running times is avoided, either explicitly or implicitly. Instead,
6860 regular polynomials are used. This is achieved by refining the notion of
6861 oracle-polynomial-time introduced by Cook. We impose a further restriction on
6862 the oracle interactions to force feasibility. Both the restriction as well as
6863 its purpose are very simple: it is well-known that Cook's model allows
6864 polynomial depth iteration of functional inputs with no restrictions on size,
6865 and thus does not guarantee that polynomial-time computability is preserved. To
6866 mend this we restrict the number of lookahead revisions, that is the number of
6867 times a query can be asked that is bigger than any of the previous queries. We
6868 prove that this leads to a class of feasible functionals and that all feasible
6869 problems can be solved within this class if one is allowed to separate a task
6870 into efficiently solvable subtasks. Formally put: the closure of our class
6871 under lambda-abstraction and application includes all feasible operations. We
6872 also revisit the very similar class of strongly polynomial-time computable
6873 operators previously introduced by Kawamura and Steinberg. We prove it to be
6874 strictly included in our class and, somewhat surprisingly, to have the same
6875 closure property. This can be attributed to properties of the limited recursion
6876 operator: It is not strongly polynomial-time computable but decomposes into two
6877 such operations and lies in our class.
6878 </p>
6879 </description>
6880 <guid isPermaLink="false">oai:arXiv.org:1801.07485</guid>
6881 </item>
6882 <item>
6883 <title>Comparing Type Systems for Deadlock Freedom. (arXiv:1810.00635v3 [cs.LO] UPDATED)</title>
6884 <link>http://fr.arxiv.org/abs/1810.00635</link>
6885 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dardha_O/0/1/0/all/0/1">Ornela Dardha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Perez_J/0/1/0/all/0/1">Jorge A. P&#xe9;rez</a></p>
6886
6887 <p>Message-passing software systems exhibit non-trivial forms of concurrency and
6888 distribution; they are expected to follow intended protocols among
6889 communicating services, but also to never "get stuck". This intuitive
6890 requirement has been expressed by liveness properties such as progress or
6891 (dead)lock freedom and various type systems ensure these properties for
6892 concurrent processes. Unfortunately, very little is known about the precise
6893 relationship between these type systems and the classes of typed processes they
6894 induce.
6895 </p>
6896 <p>This paper puts forward the first comparative study of different type systems
6897 for message-passing processes that guarantee deadlock freedom. We compare two
6898 classes of deadlock-free typed processes, here denoted L and K. The class L
6899 stands out for its canonicity: it results from Curry-Howard interpretations of
6900 linear logic propositions as session types. The class K, obtained by encoding
6901 session types into Kobayashi's linear types with usages, includes processes not
6902 typable in other type systems. We show that L is strictly included in K, and
6903 identify the precise conditions under which they coincide. We also provide two
6904 type-preserving translations of processes in K into processes in L.
6905 </p>
6906 </description>
6907 <guid isPermaLink="false">oai:arXiv.org:1810.00635</guid>
6908 </item>
6909 <item>
6910 <title>AADS: Augmented Autonomous Driving Simulation using Data-driven Algorithms. (arXiv:1901.07849v3 [cs.CV] UPDATED)</title>
6911 <link>http://fr.arxiv.org/abs/1901.07849</link>
6912 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_W/0/1/0/all/0/1">Wei Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pan_C/0/1/0/all/0/1">Chengwei Pan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_R/0/1/0/all/0/1">Rong Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_J/0/1/0/all/0/1">Jiaping Ren</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_Y/0/1/0/all/0/1">Yuexin Ma</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fang_J/0/1/0/all/0/1">Jin Fang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yan_F/0/1/0/all/0/1">Feilong Yan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Geng_Q/0/1/0/all/0/1">Qichuan Geng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_X/0/1/0/all/0/1">Xinyu Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gong_H/0/1/0/all/0/1">Huajun Gong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_W/0/1/0/all/0/1">Weiwei Xu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_G/0/1/0/all/0/1">Guoping Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Manocha_D/0/1/0/all/0/1">Dinesh Manocha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_R/0/1/0/all/0/1">Ruigang Yang</a></p>
6913
6914 <p>Simulation systems have become an essential component in the development and
6915 validation of autonomous driving technologies. The prevailing state-of-the-art
6916 approach for simulation is to use game engines or high-fidelity computer
6917 graphics (CG) models to create driving scenarios. However, creating CG models
6918 and vehicle movements (e.g., the assets for simulation) remains a manual task
6919 that can be costly and time-consuming. In addition, the fidelity of CG images
6920 still lacks the richness and authenticity of real-world images and using these
6921 images for training leads to degraded performance.
6922 </p>
6923 <p>In this paper we present a novel approach to address these issues: Augmented
6924 Autonomous Driving Simulation (AADS). Our formulation augments real-world
6925 pictures with a simulated traffic flow to create photo-realistic simulation
6926 images and renderings. More specifically, we use LiDAR and cameras to scan
6927 street scenes. From the acquired trajectory data, we generate highly plausible
6928 traffic flows for cars and pedestrians and compose them into the background.
6929 The composite images can be re-synthesized with different viewpoints and sensor
6930 models. The resulting images are photo-realistic, fully annotated, and ready
6931 for end-to-end training and testing of autonomous driving systems from
6932 perception to planning. We explain our system design and validate our
6933 algorithms with a number of autonomous driving tasks from detection to
6934 segmentation and predictions.
6935 </p>
6936 <p>Compared to traditional approaches, our method offers unmatched scalability
6937 and realism. Scalability is particularly important for AD simulation and we
6938 believe the complexity and diversity of the real world cannot be realistically
6939 captured in a virtual environment. Our augmented approach combines the
6940 flexibility in a virtual environment (e.g., vehicle movements) with the
6941 richness of the real world to allow effective simulation of anywhere on earth.
6942 </p>
6943 </description>
6944 <guid isPermaLink="false">oai:arXiv.org:1901.07849</guid>
6945 </item>
6946 <item>
6947 <title>Mockingbird: Defending Against Deep-Learning-Based Website Fingerprinting Attacks with Adversarial Traces. (arXiv:1902.06626v5 [cs.CR] UPDATED)</title>
6948 <link>http://fr.arxiv.org/abs/1902.06626</link>
6949 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rahman_M/0/1/0/all/0/1">Mohammad Saidur Rahman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Imani_M/0/1/0/all/0/1">Mohsen Imani</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mathews_N/0/1/0/all/0/1">Nate Mathews</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wright_M/0/1/0/all/0/1">Matthew Wright</a></p>
6950
6951 <p>Website Fingerprinting (WF) is a type of traffic analysis attack that enables
6952 a local passive eavesdropper to infer the victim's activity, even when the
6953 traffic is protected by a VPN or an anonymity system like Tor. Leveraging a
6954 deep-learning classifier, a WF attacker can gain over 98% accuracy on Tor
6955 traffic. In this paper, we explore a novel defense, Mockingbird, based on the
6956 idea of adversarial examples that have been shown to undermine machine-learning
6957 classifiers in other domains. Since the attacker gets to design and train his
6958 attack classifier based on the defense, we first demonstrate that at a
6959 straightforward technique for generating adversarial-example based traces fails
6960 to protect against an attacker using adversarial training for robust
6961 classification. We then propose Mockingbird, a technique for generating traces
6962 that resists adversarial training by moving randomly in the space of viable
6963 traces and not following more predictable gradients. The technique drops the
6964 accuracy of the state-of-the-art attack hardened with adversarial training from
6965 98% to 42-58% while incurring only 58% bandwidth overhead. The attack accuracy
6966 is generally lower than state-of-the-art defenses, and much lower when
6967 considering Top-2 accuracy, while incurring lower bandwidth overheads.
6968 </p>
6969 </description>
6970 <guid isPermaLink="false">oai:arXiv.org:1902.06626</guid>
6971 </item>
6972 <item>
6973 <title>Global Optimality Guarantees For Policy Gradient Methods. (arXiv:1906.01786v2 [cs.LG] UPDATED)</title>
6974 <link>http://fr.arxiv.org/abs/1906.01786</link>
6975 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bhandari_J/0/1/0/all/0/1">Jalaj Bhandari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Russo_D/0/1/0/all/0/1">Daniel Russo</a></p>
6976
6977 <p>Policy gradients methods apply to complex, poorly understood, control
6978 problems by performing stochastic gradient descent over a parameterized class
6979 of polices. Unfortunately, even for simple control problems solvable by
6980 standard dynamic programming techniques, policy gradient algorithms face
6981 non-convex optimization problems and are widely understood to converge only to
6982 a stationary point. This work identifies structural properties -- shared by
6983 several classic control problems -- that ensure the policy gradient objective
6984 function has no suboptimal stationary points despite being non-convex. When
6985 these conditions are strengthened, this objective satisfies a
6986 Polyak-lojasiewicz (gradient dominance) condition that yields convergence
6987 rates. We also provide bounds on the optimality gap of any stationary point
6988 when some of these conditions are relaxed.
6989 </p>
6990 </description>
6991 <guid isPermaLink="false">oai:arXiv.org:1906.01786</guid>
6992 </item>
6993 <item>
6994 <title>ATRW: A Benchmark for Amur Tiger Re-identification in the Wild. (arXiv:1906.05586v4 [cs.CV] UPDATED)</title>
6995 <link>http://fr.arxiv.org/abs/1906.05586</link>
6996 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_S/0/1/0/all/0/1">Shuyuan Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_J/0/1/0/all/0/1">Jianguo Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_H/0/1/0/all/0/1">Hanlin Tang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Qian_R/0/1/0/all/0/1">Rui Qian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_W/0/1/0/all/0/1">Weiyao Lin</a></p>
6997
6998 <p>Monitoring the population and movements of endangered species is an important
6999 task to wildlife conversation. Traditional tagging methods do not scale to
7000 large populations, while applying computer vision methods to camera sensor data
7001 requires re-identification (re-ID) algorithms to obtain accurate counts and
7002 moving trajectory of wildlife. However, existing re-ID methods are largely
7003 targeted at persons and cars, which have limited pose variations and
7004 constrained capture environments. This paper tries to fill the gap by
7005 introducing a novel large-scale dataset, the Amur Tiger Re-identification in
7006 the Wild (ATRW) dataset. ATRW contains over 8,000 video clips from 92 Amur
7007 tigers, with bounding box, pose keypoint, and tiger identity annotations. In
7008 contrast to typical re-ID datasets, the tigers are captured in a diverse set of
7009 unconstrained poses and lighting conditions. We demonstrate with a set of
7010 baseline algorithms that ATRW is a challenging dataset for re-ID. Lastly, we
7011 propose a novel method for tiger re-identification, which introduces precise
7012 pose parts modeling in deep neural networks to handle large pose variation of
7013 tigers, and reaches notable performance improvement over existing re-ID
7014 methods. The dataset is public available at https://cvwc2019.github.io/ .
7015 </p>
7016 </description>
7017 <guid isPermaLink="false">oai:arXiv.org:1906.05586</guid>
7018 </item>
7019 <item>
7020 <title>A Simple Local Minimal Intensity Prior and An Improved Algorithm for Blind Image Deblurring. (arXiv:1906.06642v5 [eess.IV] UPDATED)</title>
7021 <link>http://fr.arxiv.org/abs/1906.06642</link>
7022 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Wen_F/0/1/0/all/0/1">Fei Wen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ying_R/0/1/0/all/0/1">Rendong Ying</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_Y/0/1/0/all/0/1">Yipeng Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_P/0/1/0/all/0/1">Peilin Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Truong_T/0/1/0/all/0/1">Trieu-Kien Truong</a></p>
7023
7024 <p>Blind image deblurring is a long standing challenging problem in image
7025 processing and low-level vision. Recently, sophisticated priors such as dark
7026 channel prior, extreme channel prior, and local maximum gradient prior, have
7027 shown promising effectiveness. However, these methods are computationally
7028 expensive. Meanwhile, since these priors involved subproblems cannot be solved
7029 explicitly, approximate solution is commonly used, which limits the best
7030 exploitation of their capability. To address these problems, this work firstly
7031 proposes a simplified sparsity prior of local minimal pixels, namely patch-wise
7032 minimal pixels (PMP). The PMP of clear images is much more sparse than that of
7033 blurred ones, and hence is very effective in discriminating between clear and
7034 blurred images. Then, a novel algorithm is designed to efficiently exploit the
7035 sparsity of PMP in deblurring. The new algorithm flexibly imposes sparsity
7036 inducing on the PMP under the MAP framework rather than directly uses the half
7037 quadratic splitting algorithm. By this, it avoids non-rigorous approximation
7038 solution in existing algorithms, while being much more computationally
7039 efficient. Extensive experiments demonstrate that the proposed algorithm can
7040 achieve better practical stability compared with state-of-the-arts. In terms of
7041 deblurring quality, robustness and computational efficiency, the new algorithm
7042 is superior to state-of-the-arts. Code for reproducing the results of the new
7043 method is available at https://github.com/FWen/deblur-pmp.git.
7044 </p>
7045 </description>
7046 <guid isPermaLink="false">oai:arXiv.org:1906.06642</guid>
7047 </item>
7048 <item>
7049 <title>Multi-type Resource Allocation with Partial Preferences. (arXiv:1906.06836v3 [cs.AI] UPDATED)</title>
7050 <link>http://fr.arxiv.org/abs/1906.06836</link>
7051 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_H/0/1/0/all/0/1">Haibin Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sikdar_S/0/1/0/all/0/1">Sujoy Sikdar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_X/0/1/0/all/0/1">Xiaoxi Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xia_L/0/1/0/all/0/1">Lirong Xia</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cao_Y/0/1/0/all/0/1">Yongzhi Cao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_H/0/1/0/all/0/1">Hanpin Wang</a></p>
7052
7053 <p>We propose multi-type probabilistic serial (MPS) and multi-type random
7054 priority (MRP) as extensions of the well known PS and RP mechanisms to the
7055 multi-type resource allocation problem (MTRA) with partial preferences. In our
7056 setting, there are multiple types of divisible items, and a group of agents who
7057 have partial order preferences over bundles consisting of one item of each
7058 type. We show that for the unrestricted domain of partial order preferences, no
7059 mechanism satisfies both sd-efficiency and sd-envy-freeness. Notwithstanding
7060 this impossibility result, our main message is positive: When agents'
7061 preferences are represented by acyclic CP-nets, MPS satisfies sd-efficiency,
7062 sd-envy-freeness, ordinal fairness, and upper invariance, while MRP satisfies
7063 ex-post-efficiency, sd-strategy-proofness, and upper invariance, recovering the
7064 properties of PS and RP.
7065 </p>
7066 </description>
7067 <guid isPermaLink="false">oai:arXiv.org:1906.06836</guid>
7068 </item>
7069 <item>
7070 <title>Dimensional Reweighting Graph Convolutional Networks. (arXiv:1907.02237v3 [cs.LG] UPDATED)</title>
7071 <link>http://fr.arxiv.org/abs/1907.02237</link>
7072 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zou_X/0/1/0/all/0/1">Xu Zou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jia_Q/0/1/0/all/0/1">Qiuye Jia</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jianwei Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_C/0/1/0/all/0/1">Chang Zhou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_H/0/1/0/all/0/1">Hongxia Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_J/0/1/0/all/0/1">Jie Tang</a></p>
7073
7074 <p>Graph Convolution Networks (GCNs) are becoming more and more popular for
7075 learning node representations on graphs. Though there exist various
7076 developments on sampling and aggregation to accelerate the training process and
7077 improve the performances, limited works focus on dealing with the dimensional
7078 information imbalance of node representations. To bridge the gap, we propose a
7079 method named Dimensional reweighting Graph Convolution Network (DrGCN). We
7080 theoretically prove that our DrGCN can guarantee to improve the stability of
7081 GCNs via mean field theory. Our dimensional reweighting method is very flexible
7082 and can be easily combined with most sampling and aggregation techniques for
7083 GCNs. Experimental results demonstrate its superior performances on several
7084 challenging transductive and inductive node classification benchmark datasets.
7085 Our DrGCN also outperforms existing models on an industrial-sized Alibaba
7086 recommendation dataset.
7087 </p>
7088 </description>
7089 <guid isPermaLink="false">oai:arXiv.org:1907.02237</guid>
7090 </item>
7091 <item>
7092 <title>Lexical Simplification with Pretrained Encoders. (arXiv:1907.06226v5 [cs.CL] UPDATED)</title>
7093 <link>http://fr.arxiv.org/abs/1907.06226</link>
7094 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qiang_J/0/1/0/all/0/1">Jipeng Qiang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1">Yun Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_Y/0/1/0/all/0/1">Yi Zhu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yuan_Y/0/1/0/all/0/1">Yunhao Yuan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_X/0/1/0/all/0/1">Xindong Wu</a></p>
7095
7096 <p>Lexical simplification (LS) aims to replace complex words in a given sentence
7097 with their simpler alternatives of equivalent meaning. Recently unsupervised
7098 lexical simplification approaches only rely on the complex word itself
7099 regardless of the given sentence to generate candidate substitutions, which
7100 will inevitably produce a large number of spurious candidates. We present a
7101 simple LS approach that makes use of the Bidirectional Encoder Representations
7102 from Transformers (BERT) which can consider both the given sentence and the
7103 complex word during generating candidate substitutions for the complex word.
7104 Specifically, we mask the complex word of the original sentence for feeding
7105 into the BERT to predict the masked token. The predicted results will be used
7106 as candidate substitutions. Despite being entirely unsupervised, experimental
7107 results show that our approach obtains obvious improvement compared with these
7108 baselines leveraging linguistic databases and parallel corpus, outperforming
7109 the state-of-the-art by more than 12 Accuracy points on three well-known
7110 benchmarks.
7111 </p>
7112 </description>
7113 <guid isPermaLink="false">oai:arXiv.org:1907.06226</guid>
7114 </item>
7115 <item>
7116 <title>Cover and variable degeneracy. (arXiv:1907.06630v3 [math.CO] UPDATED)</title>
7117 <link>http://fr.arxiv.org/abs/1907.06630</link>
7118 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Lu_F/0/1/0/all/0/1">Fangyao Lu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Wang_Q/0/1/0/all/0/1">Qianqian Wang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Wang_T/0/1/0/all/0/1">Tao Wang</a></p>
7119
7120 <p>Let $f$ be a nonnegative integer valued function on the vertex set of a
7121 graph. A graph is {\bf strictly $f$-degenerate} if each nonempty subgraph
7122 $\Gamma$ has a vertex $v$ such that $\mathrm{deg}_{\Gamma}(v) &lt; f(v)$. In this
7123 paper, we define a new concept, strictly $f$-degenerate transversal, which
7124 generalizes list coloring, signed coloring, DP-coloring, $L$-forested-coloring,
7125 and $(f_{1}, f_{2}, \dots, f_{s})$-partition. A {\bf cover} of a graph $G$ is a
7126 graph $H$ with vertex set $V(H) = \bigcup_{v \in V(G)} X_{v}$, where $X_{v} =
7127 \{(v, 1), (v, 2), \dots, (v, s)\}$; the edge set $\mathscr{M} = \bigcup_{uv \in
7128 E(G)}\mathscr{M}_{uv}$, where $\mathscr{M}_{uv}$ is a matching between $X_{u}$
7129 and $X_{v}$. A vertex set $R \subseteq V(H)$ is a {\bf transversal} of $H$ if
7130 $|R \cap X_{v}| = 1$ for each $v \in V(G)$. A transversal $R$ is a {\bf
7131 strictly $f$-degenerate transversal} if $H[R]$ is strictly $f$-degenerate. The
7132 main result of this paper is a degree type result, which generalizes Brooks'
7133 theorem, Gallai's theorem, degree-choosable result, signed degree-colorable
7134 result, and DP-degree-colorable result. Similar to Borodin, Kostochka and
7135 Toft's variable degeneracy, this degree type result is also self-strengthening.
7136 We also give some structural results on critical graphs with respect to
7137 strictly $f$-degenerate transversal. Using these results, we can uniformly
7138 prove many new and known results. In the final section, we pose some open
7139 problems.
7140 </p>
7141 </description>
7142 <guid isPermaLink="false">oai:arXiv.org:1907.06630</guid>
7143 </item>
7144 <item>
7145 <title>An Iterative Vertex Enumeration Method for Objective Space Based Vector Optimization Algorithms. (arXiv:1907.08813v2 [math.OC] UPDATED)</title>
7146 <link>http://fr.arxiv.org/abs/1907.08813</link>
7147 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Kaya_I/0/1/0/all/0/1">Irfan Caner Kaya</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Ulus_F/0/1/0/all/0/1">Firdevs Ulus</a></p>
7148
7149 <p>An application area of vertex enumeration problem (VEP) is the usage within
7150 objective space based linear/convex {vector} optimization algorithms whose aim
7151 is to generate (an approximation of) the Pareto frontier. In such algorithms,
7152 VEP, which is defined in the objective space, is solved in each iteration and
7153 it has a special structure. Namely, the recession cone of the polyhedron to be
7154 generated is the {ordering} cone. We {consider and give a detailed description
7155 of} a vertex enumeration procedure, which iterates by calling a modified
7156 `double description (DD) method' that works for such unbounded polyhedrons. We
7157 employ this procedure as a function of an existing objective space based
7158 {vector} optimization algorithm (Algorithm 1); and test the performance of it
7159 for randomly generated linear multiobjective optimization problems. We compare
7160 the efficiency of this procedure with another existing DD method as well as
7161 with the current vertex enumeration subroutine of Algorithm 1. We observe that
7162 the modified procedure excels the others especially as the dimension of the
7163 vertex enumeration problem (the number of objectives of the corresponding
7164 multiobjective problem) increases.
7165 </p>
7166 </description>
7167 <guid isPermaLink="false">oai:arXiv.org:1907.08813</guid>
7168 </item>
7169 <item>
7170 <title>Developing an Unsupervised Real-time Anomaly Detection Scheme for Time Series with Multi-seasonality. (arXiv:1908.01146v2 [cs.LG] UPDATED)</title>
7171 <link>http://fr.arxiv.org/abs/1908.01146</link>
7172 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_W/0/1/0/all/0/1">Wentai Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_L/0/1/0/all/0/1">Ligang He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_W/0/1/0/all/0/1">Weiwei Lin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Su_Y/0/1/0/all/0/1">Yi Su</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_Y/0/1/0/all/0/1">Yuhua Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maple_C/0/1/0/all/0/1">Carsten Maple</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jarvis_S/0/1/0/all/0/1">Stephen Jarvis</a></p>
7173
7174 <p>On-line detection of anomalies in time series is a key technique used in
7175 various event-sensitive scenarios such as robotic system monitoring, smart
7176 sensor networks and data center security. However, the increasing diversity of
7177 data sources and the variety of demands make this task more challenging than
7178 ever. Firstly, the rapid increase in unlabeled data means supervised learning
7179 is becoming less suitable in many cases. Secondly, a large portion of time
7180 series data have complex seasonality features. Thirdly, on-line anomaly
7181 detection needs to be fast and reliable. In light of this, we have developed a
7182 prediction-driven, unsupervised anomaly detection scheme, which adopts a
7183 backbone model combining the decomposition and the inference of time series
7184 data. Further, we propose a novel metric, Local Trend Inconsistency (LTI), and
7185 an efficient detection algorithm that computes LTI in a real-time manner and
7186 scores each data point robustly in terms of its probability of being anomalous.
7187 We have conducted extensive experimentation to evaluate our algorithm with
7188 several datasets from both public repositories and production environments. The
7189 experimental results show that our scheme outperforms existing representative
7190 anomaly detection algorithms in terms of the commonly used metric, Area Under
7191 Curve (AUC), while achieving the desired efficiency.
7192 </p>
7193 </description>
7194 <guid isPermaLink="false">oai:arXiv.org:1908.01146</guid>
7195 </item>
7196 <item>
7197 <title>Cluster-based Distributed Augmented Lagrangian Algorithm for a Class of Constrained Convex Optimization Problems. (arXiv:1908.06634v3 [cs.MA] UPDATED)</title>
7198 <link>http://fr.arxiv.org/abs/1908.06634</link>
7199 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Moradian_H/0/1/0/all/0/1">Hossein Moradian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kia_S/0/1/0/all/0/1">Solmaz S. Kia</a></p>
7200
7201 <p>We propose a distributed solution for a constrained convex optimization
7202 problem over a network of clustered agents each consisted of a set of
7203 subagents. The communication range of the clustered agents is such that they
7204 can form a connected undirected graph topology. The total cost in this
7205 optimization problem is the sum of the local convex costs of the subagents of
7206 each cluster. We seek a minimizer of this cost subject to a set of affine
7207 equality constraints, and a set of affine inequality constraints specifying the
7208 bounds on the decision variables if such bounds exist. We design our
7209 distributed algorithm in a cluster-based framework which results in a
7210 significant reduction in communication and computation costs. Our proposed
7211 distributed solution is a novel continuous-time algorithm that is linked to the
7212 augmented Lagrangian approach. It converges asymptotically when the local cost
7213 functions are convex and exponentially when they are strongly convex and have
7214 Lipschitz gradients. Moreover, we use an $\epsilon$-exact penalty function to
7215 address the inequality constraints and derive an explicit lower bound on the
7216 penalty function weight to guarantee convergence to $\epsilon$-neighborhood of
7217 the global minimum value of the cost. A numerical example demonstrates our
7218 results.
7219 </p>
7220 </description>
7221 <guid isPermaLink="false">oai:arXiv.org:1908.06634</guid>
7222 </item>
7223 <item>
7224 <title>Optimal Machine Intelligence at the Edge of Chaos. (arXiv:1909.05176v2 [cs.LG] UPDATED)</title>
7225 <link>http://fr.arxiv.org/abs/1909.05176</link>
7226 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Feng_L/0/1/0/all/0/1">Ling Feng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_L/0/1/0/all/0/1">Lin Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lai_C/0/1/0/all/0/1">Choy Heng Lai</a></p>
7227
7228 <p>It has long been suggested that the biological brain operates at some
7229 critical point between two different phases, possibly order and chaos. Despite
7230 many indirect empirical evidence from the brain and analytical indication on
7231 simple neural networks, the foundation of this hypothesis on generic non-linear
7232 systems remains unclear. Here we develop a general theory that reveals the
7233 exact edge of chaos is the boundary between the chaotic phase and the
7234 (pseudo)periodic phase arising from Neimark-Sacker bifurcation. This edge is
7235 analytically determined by the asymptotic Jacobian norm values of the
7236 non-linear operator and influenced by the dimensionality of the system. The
7237 optimality at the edge of chaos is associated with the highest information
7238 transfer between input and output at this point similar to that of the logistic
7239 map. As empirical validations, our experiments on the various deep learning
7240 models in computer vision demonstrate the optimality of the models near the
7241 edge of chaos, and we observe that the state-of-art training algorithms push
7242 the models towards such edge as they become more accurate. We further
7243 establishes the theoretical understanding of deep learning model generalization
7244 through asymptotic stability.
7245 </p>
7246 </description>
7247 <guid isPermaLink="false">oai:arXiv.org:1909.05176</guid>
7248 </item>
7249 <item>
7250 <title>Inverse Kinematics for Serial Kinematic Chains via Sum of Squares Optimization. (arXiv:1909.09318v3 [cs.RO] UPDATED)</title>
7251 <link>http://fr.arxiv.org/abs/1909.09318</link>
7252 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Maric_F/0/1/0/all/0/1">Filip Maric</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Giamou_M/0/1/0/all/0/1">Matthew Giamou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Khoubyarian_S/0/1/0/all/0/1">Soroush Khoubyarian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Petrovic_I/0/1/0/all/0/1">Ivan Petrovic</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kelly_J/0/1/0/all/0/1">Jonathan Kelly</a></p>
7253
7254 <p>Inverse kinematics is a fundamental problem for articulated robots: fast and
7255 accurate algorithms are needed for translating task-related workspace
7256 constraints and goals into feasible joint configurations. In general, inverse
7257 kinematics for serial kinematic chains is a difficult nonlinear problem, for
7258 which closed form solutions cannot be easily obtained. Therefore,
7259 computationally efficient numerical methods that can be adapted to a general
7260 class of manipulators are of great importance. % to motion planning and
7261 workspace generation tasks. In this paper, we use convex optimization
7262 techniques to solve the inverse kinematics problem with joint limit constraints
7263 for highly redundant serial kinematic chains with spherical joints in two and
7264 three dimensions. This is accomplished through a novel formulation of inverse
7265 kinematics as a nearest point problem, and with a fast sum of squares solver
7266 that exploits the sparsity of kinematic constraints for serial manipulators.
7267 Our method has the advantages of post-hoc certification of global optimality
7268 and a runtime that scales polynomialy with the number of degrees of freedom.
7269 Additionally, we prove that our convex relaxation leads to a globally optimal
7270 solution when certain conditions are met, and demonstrate empirically that
7271 these conditions are common and represent many practical instances. Finally, we
7272 provide an open source implementation of our algorithm.
7273 </p>
7274 </description>
7275 <guid isPermaLink="false">oai:arXiv.org:1909.09318</guid>
7276 </item>
7277 <item>
7278 <title>Noisy Batch Active Learning with Deterministic Annealing. (arXiv:1909.12473v2 [cs.LG] UPDATED)</title>
7279 <link>http://fr.arxiv.org/abs/1909.12473</link>
7280 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_G/0/1/0/all/0/1">Gaurav Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sahu_A/0/1/0/all/0/1">Anit Kumar Sahu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_W/0/1/0/all/0/1">Wan-Yi Lin</a></p>
7281
7282 <p>We study the problem of training machine learning models incrementally with
7283 batches of samples annotated with noisy oracles. We select each batch of
7284 samples that are important and also diverse via clustering and importance
7285 sampling. More importantly, we incorporate model uncertainty into the sampling
7286 probability to compensate for poor estimation of the importance scores when the
7287 training data is too small to build a meaningful model. Experiments on
7288 benchmark image classification datasets (MNIST, SVHN, CIFAR10, and EMNIST) show
7289 improvement over existing active learning strategies. We introduce an extra
7290 denoising layer to deep networks to make active learning robust to label noises
7291 and show significant improvements.
7292 </p>
7293 </description>
7294 <guid isPermaLink="false">oai:arXiv.org:1909.12473</guid>
7295 </item>
7296 <item>
7297 <title>Subspace Estimation from Unbalanced and Incomplete Data Matrices: $\ell_{2,\infty}$ Statistical Guarantees. (arXiv:1910.04267v4 [math.ST] UPDATED)</title>
7298 <link>http://fr.arxiv.org/abs/1910.04267</link>
7299 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Cai_C/0/1/0/all/0/1">Changxiao Cai</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Li_G/0/1/0/all/0/1">Gen Li</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Chi_Y/0/1/0/all/0/1">Yuejie Chi</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Poor_H/0/1/0/all/0/1">H. Vincent Poor</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Chen_Y/0/1/0/all/0/1">Yuxin Chen</a></p>
7300
7301 <p>This paper is concerned with estimating the column space of an unknown
7302 low-rank matrix $\boldsymbol{A}^{\star}\in\mathbb{R}^{d_{1}\times d_{2}}$,
7303 given noisy and partial observations of its entries. There is no shortage of
7304 scenarios where the observations -- while being too noisy to support faithful
7305 recovery of the entire matrix -- still convey sufficient information to enable
7306 reliable estimation of the column space of interest. This is particularly
7307 evident and crucial for the highly unbalanced case where the column dimension
7308 $d_{2}$ far exceeds the row dimension $d_{1}$, which is the focal point of the
7309 current paper. We investigate an efficient spectral method, which operates upon
7310 the sample Gram matrix with diagonal deletion. While this algorithmic idea has
7311 been studied before, we establish new statistical guarantees for this method in
7312 terms of both $\ell_{2}$ and $\ell_{2,\infty}$ estimation accuracy, which
7313 improve upon prior results if $d_{2}$ is substantially larger than $d_{1}$. To
7314 illustrate the effectiveness of our findings, we derive matching minimax lower
7315 bounds with respect to the noise levels, and develop consequences of our
7316 general theory for three applications of practical importance: (1) tensor
7317 completion from noisy data, (2) covariance estimation / principal component
7318 analysis with missing data, and (3) community recovery in bipartite graphs. Our
7319 theory leads to improved performance guarantees for all three cases.
7320 </p>
7321 </description>
7322 <guid isPermaLink="false">oai:arXiv.org:1910.04267</guid>
7323 </item>
7324 <item>
7325 <title>ProxIQA: A Proxy Approach to Perceptual Optimization of Learned Image Compression. (arXiv:1910.08845v2 [eess.IV] UPDATED)</title>
7326 <link>http://fr.arxiv.org/abs/1910.08845</link>
7327 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Chen_L/0/1/0/all/0/1">Li-Heng Chen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bampis_C/0/1/0/all/0/1">Christos G. Bampis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_Z/0/1/0/all/0/1">Zhi Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Norkin_A/0/1/0/all/0/1">Andrey Norkin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Bovik_A/0/1/0/all/0/1">Alan C. Bovik</a></p>
7328
7329 <p>The use of $\ell_p$ $(p=1,2)$ norms has largely dominated the measurement of
7330 loss in neural networks due to their simplicity and analytical properties.
7331 However, when used to assess the loss of visual information, these simple norms
7332 are not very consistent with human perception. Here, we describe a different
7333 "proximal" approach to optimize image analysis networks against quantitative
7334 perceptual models. Specifically, we construct a proxy network, broadly termed
7335 ProxIQA, which mimics the perceptual model while serving as a loss layer of the
7336 network. We experimentally demonstrate how this optimization framework can be
7337 applied to train an end-to-end optimized image compression network. By building
7338 on top of an existing deep image compression model, we are able to demonstrate
7339 a bitrate reduction of as much as $31\%$ over MSE optimization, given a
7340 specified perceptual quality (VMAF) level.
7341 </p>
7342 </description>
7343 <guid isPermaLink="false">oai:arXiv.org:1910.08845</guid>
7344 </item>
7345 <item>
7346 <title>Federated Learning over Wireless Networks: Convergence Analysis and Resource Allocation. (arXiv:1910.13067v4 [cs.LG] UPDATED)</title>
7347 <link>http://fr.arxiv.org/abs/1910.13067</link>
7348 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dinh_C/0/1/0/all/0/1">Canh T. Dinh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_N/0/1/0/all/0/1">Nguyen H. Tran</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nguyen_M/0/1/0/all/0/1">Minh N. H. Nguyen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hong_C/0/1/0/all/0/1">Choong Seon Hong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bao_W/0/1/0/all/0/1">Wei Bao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zomaya_A/0/1/0/all/0/1">Albert Y. Zomaya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gramoli_V/0/1/0/all/0/1">Vincent Gramoli</a></p>
7349
7350 <p>There is an increasing interest in a fast-growing machine learning technique
7351 called Federated Learning, in which the model training is distributed over
7352 mobile user equipments (UEs), exploiting UEs' local computation and training
7353 data. Despite its advantages in data privacy-preserving, Federated Learning
7354 (FL) still has challenges in heterogeneity across UEs' data and physical
7355 resources. We first propose a FL algorithm which can handle the heterogeneous
7356 UEs' data challenge without further assumptions except strongly convex and
7357 smooth loss functions. We provide the convergence rate characterizing the
7358 trade-off between local computation rounds of UE to update its local model and
7359 global communication rounds to update the FL global model. We then employ the
7360 proposed FL algorithm in wireless networks as a resource allocation
7361 optimization problem that captures the trade-off between the FL convergence
7362 wall clock time and energy consumption of UEs with heterogeneous computing and
7363 power resources. Even though the wireless resource allocation problem of FL is
7364 non-convex, we exploit this problem's structure to decompose it into three
7365 sub-problems and analyze their closed-form solutions as well as insights to
7366 problem design. Finally, we illustrate the theoretical analysis for the new
7367 algorithm with Tensorflow experiments and extensive numerical results for the
7368 wireless resource allocation sub-problems. The experiment results not only
7369 verify the theoretical convergence but also show that our proposed algorithm
7370 outperforms the vanilla FedAvg algorithm in terms of convergence rate and
7371 testing accuracy.
7372 </p>
7373 </description>
7374 <guid isPermaLink="false">oai:arXiv.org:1910.13067</guid>
7375 </item>
7376 <item>
7377 <title>Making the Best Use of Review Summary for Sentiment Analysis. (arXiv:1911.02711v2 [cs.CL] UPDATED)</title>
7378 <link>http://fr.arxiv.org/abs/1911.02711</link>
7379 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_S/0/1/0/all/0/1">Sen Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_L/0/1/0/all/0/1">Leyang Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xie_J/0/1/0/all/0/1">Jun Xie</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yue Zhang</a></p>
7380
7381 <p>Sentiment analysis provides a useful overview of customer review contents.
7382 Many review websites allow a user to enter a summary in addition to a full
7383 review. Intuitively, summary information may give additional benefit for review
7384 sentiment analysis. In this paper, we conduct a study to exploit methods for
7385 better use of summary information. We start by finding out that the sentimental
7386 signal distribution of a review and that of its corresponding summary are in
7387 fact complementary to each other. We thus explore various architectures to
7388 better guide the interactions between the two and propose a
7389 hierarchically-refined review-centric attention model. Empirical results show
7390 that our review-centric model can make better use of user-written summaries for
7391 review sentiment analysis, and is also more effective compared to existing
7392 methods when the user summary is replaced with summary generated by an
7393 automatic summarization system.
7394 </p>
7395 </description>
7396 <guid isPermaLink="false">oai:arXiv.org:1911.02711</guid>
7397 </item>
7398 <item>
7399 <title>Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy. (arXiv:1911.03849v5 [cs.LG] UPDATED)</title>
7400 <link>http://fr.arxiv.org/abs/1911.03849</link>
7401 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qu_X/0/1/0/all/0/1">Xinghua Qu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sun_Z/0/1/0/all/0/1">Zhu Sun</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ong_Y/0/1/0/all/0/1">Yew-Soon Ong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_A/0/1/0/all/0/1">Abhishek Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_P/0/1/0/all/0/1">Pengfei Wei</a></p>
7402
7403 <p>Recent studies have revealed that neural network-based policies can be easily
7404 fooled by adversarial examples. However, while most prior works analyze the
7405 effects of perturbing every pixel of every frame assuming white-box policy
7406 access, in this paper we take a more restrictive view towards adversary
7407 generation - with the goal of unveiling the limits of a model's vulnerability.
7408 In particular, we explore minimalistic attacks by defining three key settings:
7409 (1) black-box policy access: where the attacker only has access to the input
7410 (state) and output (action probability) of an RL policy; (2) fractional-state
7411 adversary: where only several pixels are perturbed, with the extreme case being
7412 a single-pixel adversary; and (3) tactically-chanced attack: where only
7413 significant frames are tactically chosen to be attacked. We formulate the
7414 adversarial attack by accommodating the three key settings and explore their
7415 potency on six Atari games by examining four fully trained state-of-the-art
7416 policies. In Breakout, for example, we surprisingly find that: (i) all policies
7417 showcase significant performance degradation by merely modifying 0.01% of the
7418 input state, and (ii) the policy trained by DQN is totally deceived by
7419 perturbation to only 1% frames.
7420 </p>
7421 </description>
7422 <guid isPermaLink="false">oai:arXiv.org:1911.03849</guid>
7423 </item>
7424 <item>
7425 <title>Rethinking Self-Attention: Towards Interpretability in Neural Parsing. (arXiv:1911.03875v3 [cs.CL] UPDATED)</title>
7426 <link>http://fr.arxiv.org/abs/1911.03875</link>
7427 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mrini_K/0/1/0/all/0/1">Khalil Mrini</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dernoncourt_F/0/1/0/all/0/1">Franck Dernoncourt</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tran_Q/0/1/0/all/0/1">Quan Tran</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bui_T/0/1/0/all/0/1">Trung Bui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chang_W/0/1/0/all/0/1">Walter Chang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nakashole_N/0/1/0/all/0/1">Ndapa Nakashole</a></p>
7428
7429 <p>Attention mechanisms have improved the performance of NLP tasks while
7430 allowing models to remain explainable. Self-attention is currently widely used,
7431 however interpretability is difficult due to the numerous attention
7432 distributions. Recent work has shown that model representations can benefit
7433 from label-specific information, while facilitating interpretation of
7434 predictions. We introduce the Label Attention Layer: a new form of
7435 self-attention where attention heads represent labels. We test our novel layer
7436 by running constituency and dependency parsing experiments and show our new
7437 model obtains new state-of-the-art results for both tasks on both the Penn
7438 Treebank (PTB) and Chinese Treebank. Additionally, our model requires fewer
7439 self-attention layers compared to existing work. Finally, we find that the
7440 Label Attention heads learn relations between syntactic categories and show
7441 pathways to analyze errors.
7442 </p>
7443 </description>
7444 <guid isPermaLink="false">oai:arXiv.org:1911.03875</guid>
7445 </item>
7446 <item>
7447 <title>Privacy-Preserving Gradient Boosting Decision Trees. (arXiv:1911.04209v3 [cs.LG] UPDATED)</title>
7448 <link>http://fr.arxiv.org/abs/1911.04209</link>
7449 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_Q/0/1/0/all/0/1">Qinbin Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_Z/0/1/0/all/0/1">Zhaomin Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wen_Z/0/1/0/all/0/1">Zeyi Wen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_B/0/1/0/all/0/1">Bingsheng He</a></p>
7450
7451 <p>The Gradient Boosting Decision Tree (GBDT) is a popular machine learning
7452 model for various tasks in recent years. In this paper, we study how to improve
7453 model accuracy of GBDT while preserving the strong guarantee of differential
7454 privacy. Sensitivity and privacy budget are two key design aspects for the
7455 effectiveness of differential private models. Existing solutions for GBDT with
7456 differential privacy suffer from the significant accuracy loss due to too loose
7457 sensitivity bounds and ineffective privacy budget allocations (especially
7458 across different trees in the GBDT model). Loose sensitivity bounds lead to
7459 more noise to obtain a fixed privacy level. Ineffective privacy budget
7460 allocations worsen the accuracy loss especially when the number of trees is
7461 large. Therefore, we propose a new GBDT training algorithm that achieves
7462 tighter sensitivity bounds and more effective noise allocations. Specifically,
7463 by investigating the property of gradient and the contribution of each tree in
7464 GBDTs, we propose to adaptively control the gradients of training data for each
7465 iteration and leaf node clipping in order to tighten the sensitivity bounds.
7466 Furthermore, we design a novel boosting framework to allocate the privacy
7467 budget between trees so that the accuracy loss can be further reduced. Our
7468 experiments show that our approach can achieve much better model accuracy than
7469 other baselines.
7470 </p>
7471 </description>
7472 <guid isPermaLink="false">oai:arXiv.org:1911.04209</guid>
7473 </item>
7474 <item>
7475 <title>A Continuous Teleoperation Subspace with Empirical and Algorithmic Mapping Algorithms for Non-Anthropomorphic Hands. (arXiv:1911.09565v5 [cs.RO] UPDATED)</title>
7476 <link>http://fr.arxiv.org/abs/1911.09565</link>
7477 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Meeker_C/0/1/0/all/0/1">Cassie Meeker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Haas_Heger_M/0/1/0/all/0/1">Maximilian Haas-Heger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ciocarlie_M/0/1/0/all/0/1">Matei Ciocarlie</a></p>
7478
7479 <p>Teleoperation is a valuable tool for robotic manipulators in highly
7480 unstructured environments. However, finding an intuitive mapping between a
7481 human hand and a non-anthropomorphic robot hand can be difficult, due to the
7482 hands' dissimilar kinematics. In this paper, we seek to create a mapping
7483 between the human hand and a fully actuated, non-anthropomorphic robot hand
7484 that is intuitive enough to enable effective real-time teleoperation, even for
7485 novice users. To accomplish this, we propose a low-dimensional teleoperation
7486 subspace which can be used as an intermediary for mapping between hand pose
7487 spaces. We present two different methods to define the teleoperation subspace:
7488 an empirical definition, which requires a person to define hand motions in an
7489 intuitive, hand-specific way, and an algorithmic definition, which is
7490 kinematically independent, and uses objects to define the subspace. We use each
7491 of these definitions to create a teleoperation mapping for different hands. One
7492 of the main contributions of this paper is the validation of both the empirical
7493 and algorithmic mappings with teleoperation experiments controlled by ten
7494 novices and performed on two kinematically distinct hands. The experiments show
7495 that the proposed subspace is relevant to teleoperation, intuitive enough to
7496 enable control by novices, and can generalize to non-anthropomorphic hands with
7497 different kinematics.
7498 </p>
7499 </description>
7500 <guid isPermaLink="false">oai:arXiv.org:1911.09565</guid>
7501 </item>
7502 <item>
7503 <title>QoS-Aware Joint Power Allocation and Task Offloading in a MEC/NFV-enabled C-RAN Network. (arXiv:1912.00187v2 [cs.NI] UPDATED)</title>
7504 <link>http://fr.arxiv.org/abs/1912.00187</link>
7505 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tajallifar_M/0/1/0/all/0/1">Mohsen Tajallifar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ebrahimi_S/0/1/0/all/0/1">Sina Ebrahimi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Javan_M/0/1/0/all/0/1">Mohammad Reza Javan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mokari_N/0/1/0/all/0/1">Nader Mokari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chiaraviglio_L/0/1/0/all/0/1">Luca Chiaraviglio</a></p>
7506
7507 <p>In this paper, we propose a novel resource management scheme that jointly
7508 allocates the transmission power and computational resources in a centralized
7509 radio access network architecture. The network comprises a set of computing
7510 nodes to which the requested tasks of different users are offloaded. The
7511 optimization problem takes the transmission, execution, and propagation delays
7512 of each task into account, with the aim to allocate the transmission power and
7513 computational resources such that the user's maximum tolerable latency is
7514 satisfied. Since the optimization problem is highly non-convex, we adopt the
7515 alternate search method (ASM) to divide it into smaller subproblems. A
7516 heuristic algorithm is proposed to jointly manage the allocated computational
7517 resources and placement of the tasks derived by ASM. We also propose an
7518 admission control mechanism for finding the set of tasks that can be served by
7519 the available resources. Furthermore, a disjoint method that separately
7520 allocates the transmission power and the computational resources is proposed as
7521 the baseline of comparison. The optimal solution of the optimization problem is
7522 also derived based on exhaustive search over offloading decisions and utilizing
7523 Karush-Kuhn-Tucker optimality conditions. The simulation results show that the
7524 joint method outperforms the disjoint task offloading and power allocation.
7525 Moreover, simulations show that the performance of the proposed method is
7526 almost equal to that of the optimal solution.
7527 </p>
7528 </description>
7529 <guid isPermaLink="false">oai:arXiv.org:1912.00187</guid>
7530 </item>
7531 <item>
7532 <title>Hierarchical Indian Buffet Neural Networks for Bayesian Continual Learning. (arXiv:1912.02290v4 [stat.ML] UPDATED)</title>
7533 <link>http://fr.arxiv.org/abs/1912.02290</link>
7534 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Kessler_S/0/1/0/all/0/1">Samuel Kessler</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Nguyen_V/0/1/0/all/0/1">Vu Nguyen</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Zohren_S/0/1/0/all/0/1">Stefan Zohren</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Roberts_S/0/1/0/all/0/1">Stephen Roberts</a></p>
7535
7536 <p>We place an Indian Buffet process (IBP) prior over the structure of a
7537 Bayesian Neural Network (BNN), thus allowing the complexity of the BNN to
7538 increase and decrease automatically. We further extend this model such that the
7539 prior on the structure of each hidden layer is shared globally across all
7540 layers, using a Hierarchical-IBP (H-IBP). We apply this model to the problem of
7541 resource allocation in Continual Learning (CL) where new tasks occur and the
7542 network requires extra resources. Our model uses online variational inference
7543 with reparameterisation of the Bernoulli and Beta distributions, which
7544 constitute the IBP and H-IBP priors. As we automatically learn the number of
7545 weights in each layer of the BNN, overfitting and underfitting problems are
7546 largely overcome. We show empirically that our approach offers a competitive
7547 edge over existing methods in CL.
7548 </p>
7549 </description>
7550 <guid isPermaLink="false">oai:arXiv.org:1912.02290</guid>
7551 </item>
7552 <item>
7553 <title>CoSimLex: A Resource for Evaluating Graded Word Similarity in Context. (arXiv:1912.05320v3 [cs.CL] UPDATED)</title>
7554 <link>http://fr.arxiv.org/abs/1912.05320</link>
7555 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Armendariz_C/0/1/0/all/0/1">Carlos Santos Armendariz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Purver_M/0/1/0/all/0/1">Matthew Purver</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ulcar_M/0/1/0/all/0/1">Matej Ul&#x10d;ar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pollak_S/0/1/0/all/0/1">Senja Pollak</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ljubesic_N/0/1/0/all/0/1">Nikola Ljube&#x161;i&#x107;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Robnik_Sikonja_M/0/1/0/all/0/1">Marko Robnik-&#x160;ikonja</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Granroth_Wilding_M/0/1/0/all/0/1">Mark Granroth-Wilding</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vaik_K/0/1/0/all/0/1">Kristiina Vaik</a></p>
7556
7557 <p>State of the art natural language processing tools are built on
7558 context-dependent word embeddings, but no direct method for evaluating these
7559 representations currently exists. Standard tasks and datasets for intrinsic
7560 evaluation of embeddings are based on judgements of similarity, but ignore
7561 context; standard tasks for word sense disambiguation take account of context
7562 but do not provide continuous measures of meaning similarity. This paper
7563 describes an effort to build a new dataset, CoSimLex, intended to fill this
7564 gap. Building on the standard pairwise similarity task of SimLex-999, it
7565 provides context-dependent similarity measures; covers not only discrete
7566 differences in word sense but more subtle, graded changes in meaning; and
7567 covers not only a well-resourced language (English) but a number of
7568 less-resourced languages. We define the task and evaluation metrics, outline
7569 the dataset collection methodology, and describe the status of the dataset so
7570 far.
7571 </p>
7572 </description>
7573 <guid isPermaLink="false">oai:arXiv.org:1912.05320</guid>
7574 </item>
7575 <item>
7576 <title>What it Thinks is Important is Important: Robustness Transfers through Input Gradients. (arXiv:1912.05699v3 [cs.LG] UPDATED)</title>
7577 <link>http://fr.arxiv.org/abs/1912.05699</link>
7578 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chan_A/0/1/0/all/0/1">Alvin Chan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tay_Y/0/1/0/all/0/1">Yi Tay</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ong_Y/0/1/0/all/0/1">Yew-Soon Ong</a></p>
7579
7580 <p>Adversarial perturbations are imperceptible changes to input pixels that can
7581 change the prediction of deep learning models. Learned weights of models robust
7582 to such perturbations are previously found to be transferable across different
7583 tasks but this applies only if the model architecture for the source and target
7584 tasks is the same. Input gradients characterize how small changes at each input
7585 pixel affect the model output. Using only natural images, we show here that
7586 training a student model's input gradients to match those of a robust teacher
7587 model can gain robustness close to a strong baseline that is robustly trained
7588 from scratch. Through experiments in MNIST, CIFAR-10, CIFAR-100 and
7589 Tiny-ImageNet, we show that our proposed method, input gradient adversarial
7590 matching, can transfer robustness across different tasks and even across
7591 different model architectures. This demonstrates that directly targeting the
7592 semantics of input gradients is a feasible way towards adversarial robustness.
7593 </p>
7594 </description>
7595 <guid isPermaLink="false">oai:arXiv.org:1912.05699</guid>
7596 </item>
7597 <item>
7598 <title>ORCA: a Benchmark for Data Web Crawlers. (arXiv:1912.08026v2 [cs.DB] UPDATED)</title>
7599 <link>http://fr.arxiv.org/abs/1912.08026</link>
7600 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Roder_M/0/1/0/all/0/1">Michael R&#xf6;der</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Souza_G/0/1/0/all/0/1">Geraldo de Souza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kuchelev_D/0/1/0/all/0/1">Denis Kuchelev</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Desouki_A/0/1/0/all/0/1">Abdelmoneim Amer Desouki</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ngomo_A/0/1/0/all/0/1">Axel-Cyrille Ngonga Ngomo</a></p>
7601
7602 <p>The number of RDF knowledge graphs available on the Web grows constantly.
7603 Gathering these graphs at large scale for downstream applications hence
7604 requires the use of crawlers. Although Data Web crawlers exist, and general Web
7605 crawlers could be adapted to focus on the Data Web, there is currently no
7606 benchmark to fairly evaluate their performance. Our work closes this gap by
7607 presenting the Orca benchmark. Orca generates a synthetic Data Web, which is
7608 decoupled from the original Web and enables a fair and repeatable comparison of
7609 Data Web crawlers. Our evaluations show that Orca can be used to reveal the
7610 different advantages and disadvantages of existing crawlers. The benchmark is
7611 open-source and available at https://github.com/dice-group/orca.
7612 </p>
7613 </description>
7614 <guid isPermaLink="false">oai:arXiv.org:1912.08026</guid>
7615 </item>
7616 <item>
7617 <title>Deep Automodulators. (arXiv:1912.10321v4 [cs.LG] UPDATED)</title>
7618 <link>http://fr.arxiv.org/abs/1912.10321</link>
7619 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Heljakka_A/0/1/0/all/0/1">Ari Heljakka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hou_Y/0/1/0/all/0/1">Yuxin Hou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kannala_J/0/1/0/all/0/1">Juho Kannala</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Solin_A/0/1/0/all/0/1">Arno Solin</a></p>
7620
7621 <p>We introduce a new category of generative autoencoders called automodulators.
7622 These networks can faithfully reproduce individual real-world input images like
7623 regular autoencoders, but also generate a fused sample from an arbitrary
7624 combination of several such images, allowing instantaneous 'style-mixing' and
7625 other new applications. An automodulator decouples the data flow of decoder
7626 operations from statistical properties thereof and uses the latent vector to
7627 modulate the former by the latter, with a principled approach for mutual
7628 disentanglement of decoder layers. Prior work has explored similar decoder
7629 architecture with GANs, but their focus has been on random sampling. A
7630 corresponding autoencoder could operate on real input images. For the first
7631 time, we show how to train such a general-purpose model with sharp outputs in
7632 high resolution, using novel training techniques, demonstrated on four image
7633 data sets. Besides style-mixing, we show state-of-the-art results in
7634 autoencoder comparison, and visual image quality nearly indistinguishable from
7635 state-of-the-art GANs. We expect the automodulator variants to become a useful
7636 building block for image applications and other data domains.
7637 </p>
7638 </description>
7639 <guid isPermaLink="false">oai:arXiv.org:1912.10321</guid>
7640 </item>
7641 <item>
7642 <title>Statistical Limits of Supervised Quantum Learning. (arXiv:2001.10477v3 [quant-ph] UPDATED)</title>
7643 <link>http://fr.arxiv.org/abs/2001.10477</link>
7644 <description><p>Authors: <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Ciliberto_C/0/1/0/all/0/1">Carlo Ciliberto</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Rocchetto_A/0/1/0/all/0/1">Andrea Rocchetto</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Rudi_A/0/1/0/all/0/1">Alessandro Rudi</a>, <a href="http://fr.arxiv.org/find/quant-ph/1/au:+Wossnig_L/0/1/0/all/0/1">Leonard Wossnig</a></p>
7645
7646 <p>Within the framework of statistical learning theory it is possible to bound
7647 the minimum number of samples required by a learner to reach a target accuracy.
7648 We show that if the bound on the accuracy is taken into account, quantum
7649 machine learning algorithms for supervised learning---for which statistical
7650 guarantees are available---cannot achieve polylogarithmic runtimes in the input
7651 dimension. We conclude that, when no further assumptions on the problem are
7652 made, quantum machine learning algorithms for supervised learning can have at
7653 most polynomial speedups over efficient classical algorithms, even in cases
7654 where quantum access to the data is naturally available.
7655 </p>
7656 </description>
7657 <guid isPermaLink="false">oai:arXiv.org:2001.10477</guid>
7658 </item>
7659 <item>
7660 <title>Can Graph Neural Networks Count Substructures?. (arXiv:2002.04025v4 [cs.LG] UPDATED)</title>
7661 <link>http://fr.arxiv.org/abs/2002.04025</link>
7662 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Z/0/1/0/all/0/1">Zhengdao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_L/0/1/0/all/0/1">Lei Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Villar_S/0/1/0/all/0/1">Soledad Villar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bruna_J/0/1/0/all/0/1">Joan Bruna</a></p>
7663
7664 <p>The ability to detect and count certain substructures in graphs is important
7665 for solving many tasks on graph-structured data, especially in the contexts of
7666 computational chemistry and biology as well as social network analysis.
7667 Inspired by this, we propose to study the expressive power of graph neural
7668 networks (GNNs) via their ability to count attributed graph substructures,
7669 extending recent works that examine their power in graph isomorphism testing
7670 and function approximation. We distinguish between two types of substructure
7671 counting: induced-subgraph-count and subgraph-count, and establish both
7672 positive and negative answers for popular GNN architectures. Specifically, we
7673 prove that Message Passing Neural Networks (MPNNs), 2-Weisfeiler-Lehman (2-WL)
7674 and 2-Invariant Graph Networks (2-IGNs) cannot perform induced-subgraph-count
7675 of substructures consisting of 3 or more nodes, while they can perform
7676 subgraph-count of star-shaped substructures. As an intermediary step, we prove
7677 that 2-WL and 2-IGNs are equivalent in distinguishing non-isomorphic graphs,
7678 partly answering an open problem raised in Maron et al. (2019). We also prove
7679 positive results for k-WL and k-IGNs as well as negative results for k-WL with
7680 a finite number of iterations. We then conduct experiments that support the
7681 theoretical results for MPNNs and 2-IGNs. Moreover, motivated by substructure
7682 counting and inspired by Murphy et al. (2019), we propose the Local Relational
7683 Pooling model and demonstrate that it is not only effective for substructure
7684 counting but also able to achieve competitive performance on molecular
7685 prediction tasks.
7686 </p>
7687 </description>
7688 <guid isPermaLink="false">oai:arXiv.org:2002.04025</guid>
7689 </item>
7690 <item>
7691 <title>An implicit function learning approach for parametric modal regression. (arXiv:2002.06195v2 [stat.ML] UPDATED)</title>
7692 <link>http://fr.arxiv.org/abs/2002.06195</link>
7693 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Pan_Y/0/1/0/all/0/1">Yangchen Pan</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Imani_E/0/1/0/all/0/1">Ehsan Imani</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+White_M/0/1/0/all/0/1">Martha White</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Farahmand_A/0/1/0/all/0/1">Amir-massoud Farahmand</a></p>
7694
7695 <p>For multi-valued functions---such as when the conditional distribution on
7696 targets given the inputs is multi-modal---standard regression approaches are
7697 not always desirable because they provide the conditional mean. Modal
7698 regression algorithms address this issue by instead finding the conditional
7699 mode(s). Most, however, are nonparametric approaches and so can be difficult to
7700 scale. Further, parametric approximators, like neural networks, facilitate
7701 learning complex relationships between inputs and targets. In this work, we
7702 propose a parametric modal regression algorithm. We use the implicit function
7703 theorem to develop an objective, for learning a joint function over inputs and
7704 targets. We empirically demonstrate on several synthetic problems that our
7705 method (i) can learn multi-valued functions and produce the conditional modes,
7706 (ii) scales well to high-dimensional inputs, and (iii) can even be more
7707 effective for certain uni-modal problems, particularly for high-frequency
7708 functions. We demonstrate that our method is competitive in a real-world modal
7709 regression problem and two regular regression datasets.
7710 </p>
7711 </description>
7712 <guid isPermaLink="false">oai:arXiv.org:2002.06195</guid>
7713 </item>
7714 <item>
7715 <title>Learning Global Transparent Models Consistent with Local Contrastive Explanations. (arXiv:2002.08247v4 [cs.LG] UPDATED)</title>
7716 <link>http://fr.arxiv.org/abs/2002.08247</link>
7717 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Pedapati_T/0/1/0/all/0/1">Tejaswini Pedapati</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Balakrishnan_A/0/1/0/all/0/1">Avinash Balakrishnan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shanmugam_K/0/1/0/all/0/1">Karthikeyan Shanmugam</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dhurandhar_A/0/1/0/all/0/1">Amit Dhurandhar</a></p>
7718
7719 <p>There is a rich and growing literature on producing local
7720 contrastive/counterfactual explanations for black-box models (e.g. neural
7721 networks).
7722 </p>
7723 <p>In these methods, for an input, an explanation is in the form of a contrast
7724 point differing in very few features from the original input and lying in a
7725 different class. Other works try to build globally interpretable models like
7726 decision trees and rule lists based on the data using actual labels or based on
7727 the black-box models predictions. Although these interpretable global models
7728 can be useful, they may not be consistent with local explanations from a
7729 specific black-box of choice. In this work, we explore the question: Can we
7730 produce a transparent global model that is simultaneously accurate and
7731 consistent with the local (contrastive) explanations of the black-box model? We
7732 introduce a natural local consistency metric that quantifies if the local
7733 explanations and predictions of the black-box model are also consistent with
7734 the proxy global transparent model. Based on a key insight we propose a novel
7735 method where we create custom boolean features from sparse local contrastive
7736 explanations of the black-box model and then train a globally transparent model
7737 on just these, and showcase empirically that such models have higher local
7738 consistency compared with other known strategies, while still being close in
7739 performance to models that are trained with access to the original data.
7740 </p>
7741 </description>
7742 <guid isPermaLink="false">oai:arXiv.org:2002.08247</guid>
7743 </item>
7744 <item>
7745 <title>A two-stage data-analysis method for total-reflection high-energy positron diffraction (TRHEPD). (arXiv:2002.12165v2 [cond-mat.mtrl-sci] UPDATED)</title>
7746 <link>http://fr.arxiv.org/abs/2002.12165</link>
7747 <description><p>Authors: <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Tanaka_K/0/1/0/all/0/1">Kazuyuki Tanaka</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Mochizuki_I/0/1/0/all/0/1">Izumi Mochizuki</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Hanada_T/0/1/0/all/0/1">Takashi Hanada</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Ichimiya_A/0/1/0/all/0/1">Ayahiko Ichimiya</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Hyodo_T/0/1/0/all/0/1">Toshio Hyodo</a>, <a href="http://fr.arxiv.org/find/cond-mat/1/au:+Hoshi_T/0/1/0/all/0/1">Takeo Hoshi</a></p>
7748
7749 <p>Total-reflection high-energy positron diffraction (TRHEPD) is a novel
7750 experimental method for the determination of surface structure, which has been
7751 extensively developed at the Slow Positron Facility, Institute of Materials
7752 Structure Science, High Energy Accelerator Research Organization (KEK). In this
7753 paper, a two-stage data-analysis method is proposed. The data analysis is based
7754 on an inverse problem in which the atomic positions of a surface structure are
7755 determined from the experimental diffraction data (rocking curves). The
7756 relevant forward problem is solved by the numerical solution of the partial
7757 differential equation for quantum scattering of the positron. In the present
7758 two-stage method, the first stage is a grid-based global search and the second
7759 stage is a local search for the unique candidate for the atomic arrangement.
7760 The numerical problem is solved on a supercomputer
7761 </p>
7762 </description>
7763 <guid isPermaLink="false">oai:arXiv.org:2002.12165</guid>
7764 </item>
7765 <item>
7766 <title>Curriculum By Smoothing. (arXiv:2003.01367v3 [cs.LG] UPDATED)</title>
7767 <link>http://fr.arxiv.org/abs/2003.01367</link>
7768 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sinha_S/0/1/0/all/0/1">Samarth Sinha</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Garg_A/0/1/0/all/0/1">Animesh Garg</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Larochelle_H/0/1/0/all/0/1">Hugo Larochelle</a></p>
7769
7770 <p>Convolutional Neural Networks (CNNs) have shown impressive performance in
7771 computer vision tasks such as image classification, detection, and
7772 segmentation. Moreover, recent work in Generative Adversarial Networks (GANs)
7773 has highlighted the importance of learning by progressively increasing the
7774 difficulty of a learning task [26]. When learning a network from scratch, the
7775 information propagated within the network during the earlier stages of training
7776 can contain distortion artifacts due to noise which can be detrimental to
7777 training. In this paper, we propose an elegant curriculum based scheme that
7778 smoothes the feature embedding of a CNN using anti-aliasing or low-pass
7779 filters. We propose to augment the train-ing of CNNs by controlling the amount
7780 of high frequency information propagated within the CNNs as training
7781 progresses, by convolving the output of a CNN feature map of each layer with a
7782 Gaussian kernel. By decreasing the variance of the Gaussian kernel, we
7783 gradually increase the amount of high-frequency information available within
7784 the network for inference. As the amount of information in the feature maps
7785 increases during training, the network is able to progressively learn better
7786 representations of the data. Our proposed augmented training scheme
7787 significantly improves the performance of CNNs on various vision tasks without
7788 either adding additional trainable parameters or an auxiliary regularization
7789 objective. The generality of our method is demonstrated through empirical
7790 performance gains in CNN architectures across four different tasks: transfer
7791 learning, cross-task transfer learning, and generative models.
7792 </p>
7793 </description>
7794 <guid isPermaLink="false">oai:arXiv.org:2003.01367</guid>
7795 </item>
7796 <item>
7797 <title>Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations. (arXiv:2003.02960v3 [cs.LG] UPDATED)</title>
7798 <link>http://fr.arxiv.org/abs/2003.02960</link>
7799 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Golatkar_A/0/1/0/all/0/1">Aditya Golatkar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Achille_A/0/1/0/all/0/1">Alessandro Achille</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Soatto_S/0/1/0/all/0/1">Stefano Soatto</a></p>
7800
7801 <p>We describe a procedure for removing dependency on a cohort of training data
7802 from a trained deep network that improves upon and generalizes previous methods
7803 to different readout functions and can be extended to ensure forgetting in the
7804 activations of the network. We introduce a new bound on how much information
7805 can be extracted per query about the forgotten cohort from a black-box network
7806 for which only the input-output behavior is observed. The proposed forgetting
7807 procedure has a deterministic part derived from the differential equations of a
7808 linearized version of the model, and a stochastic part that ensures information
7809 destruction by adding noise tailored to the geometry of the loss landscape. We
7810 exploit the connections between the activation and weight dynamics of a DNN
7811 inspired by Neural Tangent Kernels to compute the information in the
7812 activations.
7813 </p>
7814 </description>
7815 <guid isPermaLink="false">oai:arXiv.org:2003.02960</guid>
7816 </item>
7817 <item>
7818 <title>No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks. (arXiv:2003.03824v2 [eess.IV] UPDATED)</title>
7819 <link>http://fr.arxiv.org/abs/2003.03824</link>
7820 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_S/0/1/0/all/0/1">Siqi Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Setio_A/0/1/0/all/0/1">Arnaud Arindra Adiyoso Setio</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ghesu_F/0/1/0/all/0/1">Florin C. Ghesu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gibson_E/0/1/0/all/0/1">Eli Gibson</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Grbic_S/0/1/0/all/0/1">Sasa Grbic</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Georgescu_B/0/1/0/all/0/1">Bogdan Georgescu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Comaniciu_D/0/1/0/all/0/1">Dorin Comaniciu</a></p>
7821
7822 <p>Detecting malignant pulmonary nodules at an early stage can allow medical
7823 interventions which may increase the survival rate of lung cancer patients.
7824 Using computer vision techniques to detect nodules can improve the sensitivity
7825 and the speed of interpreting chest CT for lung cancer screening. Many studies
7826 have used CNNs to detect nodule candidates. Though such approaches have been
7827 shown to outperform the conventional image processing based methods regarding
7828 the detection accuracy, CNNs are also known to be limited to generalize on
7829 under-represented samples in the training set and prone to imperceptible noise
7830 perturbations. Such limitations can not be easily addressed by scaling up the
7831 dataset or the models. In this work, we propose to add adversarial synthetic
7832 nodules and adversarial attack samples to the training data to improve the
7833 generalization and the robustness of the lung nodule detection systems. To
7834 generate hard examples of nodules from a differentiable nodule synthesizer, we
7835 use projected gradient descent (PGD) to search the latent code within a bounded
7836 neighbourhood that would generate nodules to decrease the detector response. To
7837 make the network more robust to unanticipated noise perturbations, we use PGD
7838 to search for noise patterns that can trigger the network to give
7839 over-confident mistakes. By evaluating on two different benchmark datasets
7840 containing consensus annotations from three radiologists, we show that the
7841 proposed techniques can improve the detection performance on real CT data. To
7842 understand the limitations of both the conventional networks and the proposed
7843 augmented networks, we also perform stress-tests on the false positive
7844 reduction networks by feeding different types of artificially produced patches.
7845 We show that the augmented networks are more robust to both under-represented
7846 nodules as well as resistant to noise perturbations.
7847 </p>
7848 </description>
7849 <guid isPermaLink="false">oai:arXiv.org:2003.03824</guid>
7850 </item>
7851 <item>
7852 <title>Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule. (arXiv:2003.03977v4 [cs.LG] UPDATED)</title>
7853 <link>http://fr.arxiv.org/abs/2003.03977</link>
7854 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Iyer_N/0/1/0/all/0/1">Nikhil Iyer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Thejas_V/0/1/0/all/0/1">V Thejas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kwatra_N/0/1/0/all/0/1">Nipun Kwatra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramjee_R/0/1/0/all/0/1">Ramachandran Ramjee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sivathanu_M/0/1/0/all/0/1">Muthian Sivathanu</a></p>
7855
7856 <p>Several papers argue that wide minima generalize better than narrow minima.
7857 In this paper, through detailed experiments that not only corroborate the
7858 generalization properties of wide minima, we also provide empirical evidence
7859 for a new hypothesis that the density of wide minima is likely lower than the
7860 density of narrow minima. Further, motivated by this hypothesis, we design a
7861 novel explore-exploit learning rate schedule. On a variety of image and natural
7862 language datasets, compared to their original hand-tuned learning rate
7863 baselines, we show that our explore-exploit schedule can result in either up to
7864 0.84% higher absolute accuracy using the original training budget or up to 57%
7865 reduced training time while achieving the original reported accuracy. For
7866 example, we achieve state-of-the-art (SOTA) accuracy for IWSLT'14 (DE-EN) and
7867 WMT'14 (DE-EN) datasets by just modifying the learning rate schedule of a high
7868 performing model.
7869 </p>
7870 </description>
7871 <guid isPermaLink="false">oai:arXiv.org:2003.03977</guid>
7872 </item>
7873 <item>
7874 <title>Compressive Isogeometric Analysis. (arXiv:2003.06475v2 [math.NA] UPDATED)</title>
7875 <link>http://fr.arxiv.org/abs/2003.06475</link>
7876 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Brugiapaglia_S/0/1/0/all/0/1">Simone Brugiapaglia</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Tamellini_L/0/1/0/all/0/1">Lorenzo Tamellini</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Tani_M/0/1/0/all/0/1">Mattia Tani</a></p>
7877
7878 <p>This work is motivated by the difficulty in assembling the Galerkin matrix
7879 when solving Partial Differential Equations (PDEs) with Isogeometric Analysis
7880 (IGA) using B-splines of moderate-to-high polynomial degree. To mitigate this
7881 problem, we propose a novel methodology named CossIGA (COmpreSSive IsoGeometric
7882 Analysis), which combines the IGA principle with CORSING, a recently introduced
7883 sparse recovery approach for PDEs based on compressive sensing. CossIGA
7884 assembles only a small portion of a suitable IGA Petrov-Galerkin discretization
7885 and is effective whenever the PDE solution is sufficiently sparse or
7886 compressible, i.e., when most of its coefficients are zero or negligible. The
7887 sparsity of the solution is promoted by employing a multilevel dictionary of
7888 B-splines as opposed to a basis. Thanks to sparsity and the fact that only a
7889 fraction of the full discretization matrix is assembled, the proposed technique
7890 has the potential to lead to significant computational savings. We show the
7891 effectiveness of CossIGA for the solution of the 2D and 3D Poisson equation
7892 over nontrivial geometries by means of an extensive numerical investigation.
7893 </p>
7894 </description>
7895 <guid isPermaLink="false">oai:arXiv.org:2003.06475</guid>
7896 </item>
7897 <item>
7898 <title>Thermodynamic Cost of Edge Detection in Artificial Neural Network(ANN)-Based Processors. (arXiv:2003.08196v2 [eess.IV] UPDATED)</title>
7899 <link>http://fr.arxiv.org/abs/2003.08196</link>
7900 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Barisik_S/0/1/0/all/0/1">Se&#xe7;kin Bar&#x131;&#x15f;&#x131;k</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ercan_I/0/1/0/all/0/1">&#x130;lke Ercan</a></p>
7901
7902 <p>Architecture-based heat dissipation analyses allow us to reveal fundamental
7903 sources of inefficiency in a given processor and thereby provide us with
7904 road-maps to design less dissipative computing schemes independent of
7905 technology-base used to implement them. In this work, we study
7906 architectural-level contributions to energy dissipation in an Artificial Neural
7907 Network (ANN)-based processor that is trained to perform edge-detection task.
7908 We compare the training and information processing cost of ANN to that of
7909 conventional architectures and algorithms using 64-pixel binary image. Our
7910 results reveal the inherent efficiency advantages of an ANN network trained for
7911 specific tasks over general-purpose processors based on von Neumann
7912 architecture. We also compare the proposed performance improvements to that of
7913 Cellular Array Processors (CAPs) and illustrate the reduction in dissipation
7914 for special purpose processors. Lastly, we calculate the change in dissipation
7915 as a result of input data structure and show the effect of randomness on
7916 energetic cost of information processing. The results we obtained provide a
7917 basis for comparison for task-based fundamental energy efficiency analyses for
7918 a range of processors and therefore contribute to the study of
7919 architecture-level descriptions of processors and thermodynamic cost
7920 calculations based on physics of computation.
7921 </p>
7922 </description>
7923 <guid isPermaLink="false">oai:arXiv.org:2003.08196</guid>
7924 </item>
7925 <item>
7926 <title>On Calibration of Mixup Training for Deep Neural Networks. (arXiv:2003.09946v3 [cs.LG] UPDATED)</title>
7927 <link>http://fr.arxiv.org/abs/2003.09946</link>
7928 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Maronas_J/0/1/0/all/0/1">Juan Maro&#xf1;as</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ramos_D/0/1/0/all/0/1">Daniel Ramos</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Paredes_R/0/1/0/all/0/1">Roberto Paredes</a></p>
7929
7930 <p>Deep Neural Networks (DNN) represent the state of the art in many tasks.
7931 However, due to their overparameterization, their generalization capabilities
7932 are in doubt and still a field under study. Consequently, DNN can overfit and
7933 assign overconfident predictions -- effects that have been shown to affect the
7934 calibration of the confidences assigned to unseen data. Data Augmentation (DA)
7935 strategies have been proposed to regularize these models, being Mixup one of
7936 the most popular due to its ability to improve the accuracy, the uncertainty
7937 quantification and the calibration of DNN. In this work however we argue and
7938 provide empirical evidence that, due to its fundamentals, Mixup does not
7939 necessarily improve calibration. Based on our observations we propose a new
7940 loss function that improves the calibration, and also sometimes the accuracy,
7941 of DNN trained with this DA technique. Our loss is inspired by Bayes decision
7942 theory and introduces a new training framework for designing losses for
7943 probabilistic modelling. We provide state-of-the-art accuracy with consistent
7944 improvements in calibration performance. Appendix and code are provided here:
7945 https://github.com/jmaronas/calibration_MixupDNN_ARCLoss.pytorch.git
7946 </p>
7947 </description>
7948 <guid isPermaLink="false">oai:arXiv.org:2003.09946</guid>
7949 </item>
7950 <item>
7951 <title>Unique Chinese Linguistic Phenomena. (arXiv:2004.00499v3 [cs.CL] UPDATED)</title>
7952 <link>http://fr.arxiv.org/abs/2004.00499</link>
7953 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jia_S/0/1/0/all/0/1">Shengbin Jia</a></p>
7954
7955 <p>Linguistics holds unique characteristics of generality, stability, and
7956 nationality, which will affect the formulation of extraction strategies and
7957 should be incorporated into the relation extraction. Chinese open relation
7958 extraction is not well-established, because of the complexity of Chinese
7959 linguistics makes it harder to operate, and the methods for English are not
7960 compatible with that for Chinese. The diversities between Chinese and English
7961 linguistics are mainly reflected in morphology and syntax.
7962 </p>
7963 </description>
7964 <guid isPermaLink="false">oai:arXiv.org:2004.00499</guid>
7965 </item>
7966 <item>
7967 <title>Is Graph Structure Necessary for Multi-hop Question Answering?. (arXiv:2004.03096v2 [cs.CL] UPDATED)</title>
7968 <link>http://fr.arxiv.org/abs/2004.03096</link>
7969 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shao_N/0/1/0/all/0/1">Nan Shao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_Y/0/1/0/all/0/1">Yiming Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_T/0/1/0/all/0/1">Ting Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shijin Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_G/0/1/0/all/0/1">Guoping Hu</a></p>
7970
7971 <p>Recently, attempting to model texts as graph structure and introducing graph
7972 neural networks to deal with it has become a trend in many NLP research areas.
7973 In this paper, we investigate whether the graph structure is necessary for
7974 multi-hop question answering. Our analysis is centered on HotpotQA. We
7975 construct a strong baseline model to establish that, with the proper use of
7976 pre-trained models, graph structure may not be necessary for multi-hop question
7977 answering. We point out that both graph structure and adjacency matrix are
7978 task-related prior knowledge, and graph-attention can be considered as a
7979 special case of self-attention. Experiments and visualized analysis demonstrate
7980 that graph-attention or the entire graph structure can be replaced by
7981 self-attention or Transformers.
7982 </p>
7983 </description>
7984 <guid isPermaLink="false">oai:arXiv.org:2004.03096</guid>
7985 </item>
7986 <item>
7987 <title>Risk-Constrained Linear-Quadratic Regulators. (arXiv:2004.04685v2 [eess.SY] UPDATED)</title>
7988 <link>http://fr.arxiv.org/abs/2004.04685</link>
7989 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Tsiamis_A/0/1/0/all/0/1">Anastasios Tsiamis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kalogerias_D/0/1/0/all/0/1">Dionysios S. Kalogerias</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chamon_L/0/1/0/all/0/1">Luiz F. O. Chamon</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ribeiro_A/0/1/0/all/0/1">Alejandro Ribeiro</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pappas_G/0/1/0/all/0/1">George J. Pappas</a></p>
7990
7991 <p>We propose a new risk-constrained reformulation of the standard Linear
7992 Quadratic Regulator (LQR) problem. Our framework is motivated by the fact that
7993 the classical (risk-neutral) LQR controller, although optimal in expectation,
7994 might be ineffective under relatively infrequent, yet statistically significant
7995 (risky) events. To effectively trade between average and extreme event
7996 performance, we introduce a new risk constraint, which explicitly restricts the
7997 total expected predictive variance of the state penalty by a user-prescribed
7998 level. We show that, under rather minimal conditions on the process noise
7999 (i.e., finite fourth-order moments), the optimal risk-aware controller can be
8000 evaluated explicitly and in closed form. In fact, it is affine relative to the
8001 state, and is always internally stable regardless of parameter tuning. Our new
8002 risk-aware controller: i) pushes the state away from directions where the noise
8003 exhibits heavy tails, by exploiting the third-order moment (skewness) of the
8004 noise; ii) inflates the state penalty in riskier directions, where both the
8005 noise covariance and the state penalty are simultaneously large. The properties
8006 of the proposed risk-aware LQR framework are also illustrated via indicative
8007 numerical examples.
8008 </p>
8009 </description>
8010 <guid isPermaLink="false">oai:arXiv.org:2004.04685</guid>
8011 </item>
8012 <item>
8013 <title>Supervised Contrastive Learning. (arXiv:2004.11362v2 [cs.LG] UPDATED)</title>
8014 <link>http://fr.arxiv.org/abs/2004.11362</link>
8015 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Khosla_P/0/1/0/all/0/1">Prannay Khosla</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Teterwak_P/0/1/0/all/0/1">Piotr Teterwak</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_C/0/1/0/all/0/1">Chen Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sarna_A/0/1/0/all/0/1">Aaron Sarna</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_Y/0/1/0/all/0/1">Yonglong Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Isola_P/0/1/0/all/0/1">Phillip Isola</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Maschinot_A/0/1/0/all/0/1">Aaron Maschinot</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_C/0/1/0/all/0/1">Ce Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Krishnan_D/0/1/0/all/0/1">Dilip Krishnan</a></p>
8016
8017 <p>Contrastive learning applied to self-supervised representation learning has
8018 seen a resurgence in recent years, leading to state of the art performance in
8019 the unsupervised training of deep image models. Modern batch contrastive
8020 approaches subsume or significantly outperform traditional contrastive losses
8021 such as triplet, max-margin and the N-pairs loss. In this work, we extend the
8022 self-supervised batch contrastive approach to the fully-supervised setting,
8023 allowing us to effectively leverage label information. Clusters of points
8024 belonging to the same class are pulled together in embedding space, while
8025 simultaneously pushing apart clusters of samples from different classes. We
8026 analyze two possible versions of the supervised contrastive (SupCon) loss,
8027 identifying the best-performing formulation of the loss. On ResNet-200, we
8028 achieve top-1 accuracy of 81.4% on the ImageNet dataset, which is 0.8% above
8029 the best number reported for this architecture. We show consistent
8030 outperformance over cross-entropy on other datasets and two ResNet variants.
8031 The loss shows benefits for robustness to natural corruptions and is more
8032 stable to hyperparameter settings such as optimizers and data augmentations. In
8033 reduced data settings, it outperforms cross-entropy significantly. Our loss
8034 function is simple to implement, and reference TensorFlow code is released at
8035 https://t.ly/supcon.
8036 </p>
8037 </description>
8038 <guid isPermaLink="false">oai:arXiv.org:2004.11362</guid>
8039 </item>
8040 <item>
8041 <title>An Epidemiological Modelling Approach for Covid19 via Data Assimilation. (arXiv:2004.12130v3 [stat.AP] UPDATED)</title>
8042 <link>http://fr.arxiv.org/abs/2004.12130</link>
8043 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Nadler_P/0/1/0/all/0/1">Philip Nadler</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wang_S/0/1/0/all/0/1">Shuo Wang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Arcucci_R/0/1/0/all/0/1">Rossella Arcucci</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Yang_X/0/1/0/all/0/1">Xian Yang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Guo_Y/0/1/0/all/0/1">Yike Guo</a></p>
8044
8045 <p>The global pandemic of the 2019-nCov requires the evaluation of policy
8046 interventions to mitigate future social and economic costs of quarantine
8047 measures worldwide. We propose an epidemiological model for forecasting and
8048 policy evaluation which incorporates new data in real-time through variational
8049 data assimilation. We analyze and discuss infection rates in China, the US and
8050 Italy. In particular, we develop a custom compartmental SIR model fit to
8051 variables related to the epidemic in Chinese cities, named SITR model. We
8052 compare and discuss model results which conducts updates as new observations
8053 become available. A hybrid data assimilation approach is applied to make
8054 results robust to initial conditions. We use the model to do inference on
8055 infection numbers as well as parameters such as the disease transmissibility
8056 rate or the rate of recovery. The parameterisation of the model is parsimonious
8057 and extendable, allowing for the incorporation of additional data and
8058 parameters of interest. This allows for scalability and the extension of the
8059 model to other locations or the adaption of novel data sources.
8060 </p>
8061 </description>
8062 <guid isPermaLink="false">oai:arXiv.org:2004.12130</guid>
8063 </item>
8064 <item>
8065 <title>Holistic Privacy for Electricity, Water, and Natural Gas Metering in Next Generation Smart Homes. (arXiv:2004.13363v3 [eess.SY] UPDATED)</title>
8066 <link>http://fr.arxiv.org/abs/2004.13363</link>
8067 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Kement_C/0/1/0/all/0/1">Cihan Emre Kement</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Tavli_B/0/1/0/all/0/1">Bulent Tavli</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gultekin_H/0/1/0/all/0/1">Hakan Gultekin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Yanikomeroglu_H/0/1/0/all/0/1">Halim Yanikomeroglu</a></p>
8068
8069 <p>In smart electricity grids, high time granularity (HTG) power consumption
8070 data can be decomposed into individual appliance load signatures via
8071 Nonintrusive Appliance Load Monitoring techniques to expose appliance usage
8072 profiles. Various methods ranging from load shaping to noise addition and data
8073 aggregation have been proposed to mitigate this problem. However, with the
8074 growing scarcity of natural resources, utilities other than electricity (such
8075 as water and natural gas) have also begun to be subject to HTG metering, which
8076 creates privacy issues similar to that of electricity. Therefore, employing
8077 privacy protection countermeasures for only electricity usage is ineffective
8078 for appliances that utilize additional/other metered resources. As such,
8079 existing privacy countermeasures and metrics need to be reevaluated to address
8080 not only electricity, but also any other resource that is metered. Furthermore,
8081 a holistic privacy protection approach for all metered resources must be
8082 adopted as the information leak from any of the resources has a potential to
8083 render the privacy preserving countermeasures for all the other resources
8084 futile. This paper introduces the privacy preservation problem for multiple HTG
8085 metered resources and explores potential solutions for its mitigation.
8086 </p>
8087 </description>
8088 <guid isPermaLink="false">oai:arXiv.org:2004.13363</guid>
8089 </item>
8090 <item>
8091 <title>Geometric group testing. (arXiv:2004.14632v3 [cs.CG] UPDATED)</title>
8092 <link>http://fr.arxiv.org/abs/2004.14632</link>
8093 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Berendsohn_B/0/1/0/all/0/1">Benjamin Aram Berendsohn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kozma_L/0/1/0/all/0/1">L&#xe1;szl&#xf3; Kozma</a></p>
8094
8095 <p>Group testing is concerned with identifying $t$ defective items in a set of
8096 $m$ items, where each test reports whether a specific subset of items contains
8097 at least one defective. In non-adaptive group testing, the subsets to be tested
8098 are fixed in advance. By testing multiple items at once, the required number of
8099 tests can be made much smaller than $m$. In fact, for $t \in \mathcal{O}(1)$,
8100 the optimal number of (non-adaptive) tests is known to be $\Theta(\log{m})$.
8101 </p>
8102 <p>In this paper, we consider the problem of non-adaptive group testing in a
8103 geometric setting, where the items are points in $d$-dimensional Euclidean
8104 space and the tests are axis-parallel boxes (hyperrectangles). We present upper
8105 and lower bounds on the required number of tests under this geometric
8106 constraint. In contrast to the general, combinatorial case, the bounds in our
8107 geometric setting are polynomial in $m$. For instance, our results imply that
8108 identifying a defective pair in a set of $m$ points in the plane always
8109 requires $\Omega(m^{3/5})$ tests, and there exist configurations of $m$ points
8110 for which $\mathcal{O}(m^{2/3})$ tests are sufficient, whereas to identify a
8111 single defective point in the plane, $\Theta(m^{1/2})$ tests are always
8112 necessary and sometimes sufficient.
8113 </p>
8114 </description>
8115 <guid isPermaLink="false">oai:arXiv.org:2004.14632</guid>
8116 </item>
8117 <item>
8118 <title>Minimum Cuts in Geometric Intersection Graphs. (arXiv:2005.00858v2 [cs.CG] UPDATED)</title>
8119 <link>http://fr.arxiv.org/abs/2005.00858</link>
8120 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Cabello_S/0/1/0/all/0/1">Sergio Cabello</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mulzer_W/0/1/0/all/0/1">Wolfgang Mulzer</a></p>
8121
8122 <p>Let $\mathcal{D}$ be a set of $n$ disks in the plane. The disk graph
8123 $G_\mathcal{D}$ for $\mathcal{D}$ is the undirected graph with vertex set
8124 $\mathcal{D}$ in which two disks are joined by an edge if and only if they
8125 intersect. The directed transmission graph $G^{\rightarrow}_\mathcal{D}$ for
8126 $\mathcal{D}$ is the directed graph with vertex set $\mathcal{D}$ in which
8127 there is an edge from a disk $D_1 \in \mathcal{D}$ to a disk $D_2 \in
8128 \mathcal{D}$ if and only if $D_1$ contains the center of $D_2$.
8129 </p>
8130 <p>Given $\mathcal{D}$ and two non-intersecting disks $s, t \in \mathcal{D}$, we
8131 show that a minimum $s$-$t$ vertex cut in $G_\mathcal{D}$ or in
8132 $G^{\rightarrow}_\mathcal{D}$ can be found in $O(n^{3/2}\text{polylog} n)$
8133 expected time. To obtain our result, we combine an algorithm for the maximum
8134 flow problem in general graphs with dynamic geometric data structures to
8135 manipulate the disks.
8136 </p>
8137 <p>As an application, we consider the barrier resilience problem in a
8138 rectangular domain. In this problem, we have a vertical strip $S$ bounded by
8139 two vertical lines, $L_\ell$ and $L_r$, and a collection $\mathcal{D}$ of
8140 disks. Let $a$ be a point in $S$ above all disks of $\mathcal{D}$, and let $b$
8141 a point in $S$ below all disks of $\mathcal{D}$. The task is to find a curve
8142 from $a$ to $b$ that lies in $S$ and that intersects as few disks of
8143 $\mathcal{D}$ as possible. Using our improved algorithm for minimum cuts in
8144 disk graphs, we can solve the barrier resilience problem in
8145 $O(n^{3/2}\text{polylog} n)$ expected time.
8146 </p>
8147 </description>
8148 <guid isPermaLink="false">oai:arXiv.org:2005.00858</guid>
8149 </item>
8150 <item>
8151 <title>Model Creation and Equivalence Proofs of Cellular Automata and Artificial Neural Networks. (arXiv:2005.01192v3 [cs.NE] UPDATED)</title>
8152 <link>http://fr.arxiv.org/abs/2005.01192</link>
8153 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Christen_P/0/1/0/all/0/1">Patrik Christen</a></p>
8154
8155 <p>Computational methods and mathematical models have invaded arguably every
8156 scientific discipline forming its own field of research called computational
8157 science. Mathematical models are the theoretical foundation of computational
8158 science. Since Newton's time, differential equations in mathematical models
8159 have been widely and successfully used to describe the macroscopic or global
8160 behaviour of systems. With spatially inhomogeneous, time-varying, local
8161 element-specific, and often non-linear interactions, the dynamics of complex
8162 systems is in contrast more efficiently described by local rules and thus in an
8163 algorithmic and local or microscopic manner. The theory of mathematical
8164 modelling taking into account these characteristics of complex systems has to
8165 be established still. We recently presented a so-called allagmatic method
8166 including a system metamodel to provide a framework for describing, modelling,
8167 simulating, and interpreting complex systems. Implementations of cellular
8168 automata and artificial neural networks were described and created with that
8169 method. Guidance from philosophy were helpful in these first studies focusing
8170 on programming and feasibility. A rigorous mathematical formalism, however, is
8171 still missing. This would not only more precisely describe and define the
8172 system metamodel, it would also further generalise it and with that extend its
8173 reach to formal treatment in applied mathematics and theoretical aspects of
8174 computational science as well as extend its applicability to other mathematical
8175 and computational models such as agent-based models. Here, a mathematical
8176 definition of the system metamodel is provided. Based on the presented
8177 formalism, model creation and equivalence of cellular automata and artificial
8178 neural networks are proved. It thus provides a formal approach for studying the
8179 creation of mathematical models as well as their structural and operational
8180 comparison.
8181 </p>
8182 </description>
8183 <guid isPermaLink="false">oai:arXiv.org:2005.01192</guid>
8184 </item>
8185 <item>
8186 <title>Analysis of the Symmetric Join the Shortest Orbit Queue. (arXiv:2005.02683v2 [math.PR] UPDATED)</title>
8187 <link>http://fr.arxiv.org/abs/2005.02683</link>
8188 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Dimitriou_I/0/1/0/all/0/1">Ioannis Dimitriou</a></p>
8189
8190 <p>This work introduces the join the shortest queue policy in the retrial
8191 setting. We consider a Markovian single server retrial system with two infinite
8192 capacity orbits. An arriving job finding the server busy, it is forwarded to
8193 the least loaded orbit. Otherwise, it is forwarded to an orbit randomly.
8194 Orbiting jobs of either type retry to access the server independently. We
8195 investigate the stability condition, the stationary tail decay rate, and obtain
8196 the equilibrium distribution by using the compensation method.
8197 </p>
8198 </description>
8199 <guid isPermaLink="false">oai:arXiv.org:2005.02683</guid>
8200 </item>
8201 <item>
8202 <title>Anonymized GCN: A Novel Robust Graph Embedding Method via Hiding Node Position in Noise. (arXiv:2005.03482v2 [cs.LG] UPDATED)</title>
8203 <link>http://fr.arxiv.org/abs/2005.03482</link>
8204 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_A/0/1/0/all/0/1">Ao Liu</a></p>
8205
8206 <p>Graph convolution network (GCN) have achieved state-of-the-art performance in
8207 the task of node prediction in the graph structure. However, with the gradual
8208 various of graph attack methods, there are lack of research on the robustness
8209 of GCN. In this paper, we prove the reason why GCN is vulnerable to attack:
8210 only training another GCN model can find the vulnerability of the target GCN
8211 model. To solve that, we propose a GCN model which is robust to attacks. By
8212 hiding the node's position in the Gaussian noise, the attacker will not be able
8213 to modify the connection information of the graph node, thus immune to the
8214 attack. Considering attackers usually modify the connection to interfere the
8215 prediction results of the target node, so, by hiding the connection of the
8216 graph in the noise through adversarial training, accurate node prediction can
8217 be completed only by the node number rather than its specific position in the
8218 graph, thus let the nodes in the graph are no longer related to the graph
8219 itself, that is to say, make the node anonymous. Specifically, we first
8220 demonstrated the key to determine the embedding of a specific node: the row
8221 corresponding to the node of the eigenmatrix of the Laplace matrix, by target
8222 it as the output of the generator, we take the corresponding noise as input.
8223 The generator will try to find the correct position of the node in the graph.
8224 Then the encoder and decoder are spliced both in discriminator, so that after
8225 adversarial training, the generator and discriminator can cooperate to complete
8226 the node prediction. Finally, All node positions can generated by noise at the
8227 same time, that is to say, the generator will hides all the connection
8228 information of the graph structure. The evaluation shows that we only need to
8229 obtain the initial features and node numbers of the nodes to complete the node
8230 prediction, and the accuracy did not decrease, but increased by 0.0293.
8231 </p>
8232 </description>
8233 <guid isPermaLink="false">oai:arXiv.org:2005.03482</guid>
8234 </item>
8235 <item>
8236 <title>InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs. (arXiv:2005.09635v2 [cs.CV] UPDATED)</title>
8237 <link>http://fr.arxiv.org/abs/2005.09635</link>
8238 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_Y/0/1/0/all/0/1">Yujun Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_C/0/1/0/all/0/1">Ceyuan Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tang_X/0/1/0/all/0/1">Xiaoou Tang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhou_B/0/1/0/all/0/1">Bolei Zhou</a></p>
8239
8240 <p>Although Generative Adversarial Networks (GANs) have made significant
8241 progress in face synthesis, there lacks enough understanding of what GANs have
8242 learned in the latent representation to map a random code to a photo-realistic
8243 image. In this work, we propose a framework called InterFaceGAN to interpret
8244 the disentangled face representation learned by the state-of-the-art GAN models
8245 and study the properties of the facial semantics encoded in the latent space.
8246 We first find that GANs learn various semantics in some linear subspaces of the
8247 latent space. After identifying these subspaces, we can realistically
8248 manipulate the corresponding facial attributes without retraining the model. We
8249 then conduct a detailed study on the correlation between different semantics
8250 and manage to better disentangle them via subspace projection, resulting in
8251 more precise control of the attribute manipulation. Besides manipulating the
8252 gender, age, expression, and presence of eyeglasses, we can even alter the face
8253 pose and fix the artifacts accidentally made by GANs. Furthermore, we perform
8254 an in-depth face identity analysis and a layer-wise analysis to evaluate the
8255 editing results quantitatively. Finally, we apply our approach to real face
8256 editing by employing GAN inversion approaches and explicitly training
8257 feed-forward models based on the synthetic data established by InterFaceGAN.
8258 Extensive experimental results suggest that learning to synthesize faces
8259 spontaneously brings a disentangled and controllable face representation.
8260 </p>
8261 </description>
8262 <guid isPermaLink="false">oai:arXiv.org:2005.09635</guid>
8263 </item>
8264 <item>
8265 <title>Stochastic control liasons: Richard Sinkhorn meets Gaspard Monge on a Schroedinger bridge. (arXiv:2005.10963v2 [math.OC] UPDATED)</title>
8266 <link>http://fr.arxiv.org/abs/2005.10963</link>
8267 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Chen_Y/0/1/0/all/0/1">Yongxin Chen</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Georgiou_T/0/1/0/all/0/1">Tryphon T. Georgiou</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Pavon_M/0/1/0/all/0/1">Michele Pavon</a></p>
8268
8269 <p>In 1931/32, Schroedinger studied a hot gas Gedankenexperiment, an instance of
8270 large deviations of the empirical distribution and an early example of the
8271 so-called maximum entropy inference method. This so-called Schroedinger bridge
8272 problem (SBP) was recently recognized as a regularization of the
8273 Monge-Kantorovich Optimal Mass Transport (OMT), leading to effective
8274 computation of the latter. Specifically, OMT with quadratic cost may be viewed
8275 as a zero-temperature limit of SBP, which amounts to minimization of the
8276 Helmholtz's free energy over probability distributions constrained to possess
8277 given marginals. The problem features a delicate compromise, mediated by a
8278 temperature parameter, between minimizing the internal energy and maximizing
8279 the entropy. These concepts are central to a rapidly expanding area of modern
8280 science dealing with the so-called {\em Sinkhorn algorithm} which appears as a
8281 special case of an algorithm first studied by the French analyst Robert Fortet
8282 in 1938/40 specifically for Schroedinger bridges. Due to the constraint on
8283 end-point distributions, dynamic programming is not a suitable tool to attack
8284 these problems. Instead, Fortet's iterative algorithm and its discrete
8285 counterpart, the Sinkhorn iteration, permit computation by iteratively solving
8286 the so-called {\em Schroedinger system}. In both the continuous as well as the
8287 discrete-time and space settings, {\em stochastic control} provides a
8288 reformulation and dynamic versions of these problems. The formalism behind
8289 these control problems have attracted attention as they lead to a variety of
8290 new applications in spacecraft guidance, control of robot or biological swarms,
8291 sensing, active cooling, network routing as well as in computer and data
8292 science. This multifacet and versatile framework, intertwining SBP and OMT,
8293 provides the substrate for a historical and technical overview of the field
8294 taken up in this paper.
8295 </p>
8296 </description>
8297 <guid isPermaLink="false">oai:arXiv.org:2005.10963</guid>
8298 </item>
8299 <item>
8300 <title>Multivariate Quasi-tight Framelets with High Balancing Orders Derived from Any Compactly Supported Refinable Vector Functions. (arXiv:2005.12451v2 [math.FA] UPDATED)</title>
8301 <link>http://fr.arxiv.org/abs/2005.12451</link>
8302 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Han_B/0/1/0/all/0/1">Bin Han</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Lu_R/0/1/0/all/0/1">Ran Lu</a></p>
8303
8304 <p>Generalizing wavelets by adding desired redundancy and flexibility,framelets
8305 are of interest and importance in many applications such as image processing
8306 and numerical algorithms. Several key properties of framelets are high
8307 vanishing moments for sparse multiscale representation, fast framelet
8308 transforms for numerical efficiency, and redundancy for robustness. However, it
8309 is a challenging problem to study and construct multivariate nonseparable
8310 framelets, mainly due to their intrinsic connections to factorization and
8311 syzygy modules of multivariate polynomial matrices. In this paper, we
8312 circumvent the above difficulties through the approach of quasi-tight
8313 framelets, which behave almost identically to tight framelets. Employing the
8314 popular oblique extension principle (OEP), from an arbitrary compactly
8315 supported $\dm$-refinable vector function $\phi$ with multiplicity greater than
8316 one, we prove that we can always derive from $\phi$ a compactly supported
8317 multivariate quasi-tight framelet such that (i) all the framelet generators
8318 have the highest possible order of vanishing moments;(ii) its associated fast
8319 framelet transform is compact with the highest balancing order.For a refinable
8320 scalar function $\phi$, the above item (ii) often cannot be achieved
8321 intrinsically but we show that we can always construct a compactly supported
8322 OEP-based multivariate quasi-tight framelet derived from $\phi$ satisfying item
8323 (i).This paper provides a comprehensive investigation on OEP-based multivariate
8324 quasi-tight multiframelets and their associated framelet transforms with high
8325 balancing orders. This deepens our theoretical understanding of multivariate
8326 quasi-tight multiframelets and their associated fast multiframelet transforms.
8327 </p>
8328 </description>
8329 <guid isPermaLink="false">oai:arXiv.org:2005.12451</guid>
8330 </item>
8331 <item>
8332 <title>Refining Implicit Argument Annotation for UCCA. (arXiv:2005.12889v2 [cs.CL] UPDATED)</title>
8333 <link>http://fr.arxiv.org/abs/2005.12889</link>
8334 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Cui_R/0/1/0/all/0/1">Ruixiang Cui</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hershcovich_D/0/1/0/all/0/1">Daniel Hershcovich</a></p>
8335
8336 <p>Predicate-argument structure analysis is a central component in meaning
8337 representations of text. The fact that some arguments are not explicitly
8338 mentioned in a sentence gives rise to ambiguity in language understanding, and
8339 renders it difficult for machines to interpret text correctly. However, only
8340 few resources represent implicit roles for NLU, and existing studies in NLP
8341 only make coarse distinctions between categories of arguments omitted from
8342 linguistic form. This paper proposes a typology for fine-grained implicit
8343 argument annotation on top of Universal Conceptual Cognitive Annotation's
8344 foundational layer. The proposed implicit argument categorisation is driven by
8345 theories of implicit role interpretation and consists of six types: Deictic,
8346 Generic, Genre-based, Type-identifiable, Non-specific, and Iterated-set. We
8347 exemplify our design by revisiting part of the UCCA EWT corpus, providing a new
8348 dataset annotated with the refinement layer, and making a comparative analysis
8349 with other schemes.
8350 </p>
8351 </description>
8352 <guid isPermaLink="false">oai:arXiv.org:2005.12889</guid>
8353 </item>
8354 <item>
8355 <title>An Empirical Study of Bots in Software Development -- Characteristics and Challenges from a Practitioner's Perspective. (arXiv:2005.13969v2 [cs.SE] UPDATED)</title>
8356 <link>http://fr.arxiv.org/abs/2005.13969</link>
8357 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Erlenhov_L/0/1/0/all/0/1">Linda Erlenhov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Neto_F/0/1/0/all/0/1">Francisco Gomes de Oliveira Neto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leitner_P/0/1/0/all/0/1">Philipp Leitner</a></p>
8358
8359 <p>Software engineering bots - automated tools that handle tedious tasks - are
8360 increasingly used by industrial and open source projects to improve developer
8361 productivity. Current research in this area is held back by a lack of consensus
8362 of what software engineering bots (DevBots) actually are, what characteristics
8363 distinguish them from other tools, and what benefits and challenges are
8364 associated with DevBot usage. In this paper we report on a mixed-method
8365 empirical study of DevBot usage in industrial practice. We report on findings
8366 from interviewing 21 and surveying a total of 111 developers. We identify three
8367 different personas among DevBot users (focusing on autonomy, chat interfaces,
8368 and "smartness"), each with different definitions of what a DevBot is, why
8369 developers use them, and what they struggle with. We conclude that future
8370 DevBot research should situate their work within our framework, to clearly
8371 identify what type of bot the work targets, and what advantages practitioners
8372 can expect. Further, we find that there currently is a lack of general purpose
8373 "smart" bots that go beyond simple automation tools or chat interfaces. This is
8374 problematic, as we have seen that such bots, if available, can have a
8375 transformative effect on the projects that use them.
8376 </p>
8377 </description>
8378 <guid isPermaLink="false">oai:arXiv.org:2005.13969</guid>
8379 </item>
8380 <item>
8381 <title>Sub-Band Knowledge Distillation Framework for Speech Enhancement. (arXiv:2005.14435v2 [eess.AS] UPDATED)</title>
8382 <link>http://fr.arxiv.org/abs/2005.14435</link>
8383 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wen_S/0/1/0/all/0/1">Shixue Wen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Liu_Y/0/1/0/all/0/1">Yun Liu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gao_G/0/1/0/all/0/1">Guanglai Gao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_X/0/1/0/all/0/1">Xiaofei Li</a></p>
8384
8385 <p>In single-channel speech enhancement, methods based on full-band spectral
8386 features have been widely studied. However, only a few methods pay attention to
8387 non-full-band spectral features. In this paper, we explore a knowledge
8388 distillation framework based on sub-band spectral mapping for single-channel
8389 speech enhancement. Specifically, we divide the full frequency band into
8390 multiple sub-bands and pre-train an elite-level sub-band enhancement model
8391 (teacher model) for each sub-band. These teacher models are dedicated to
8392 processing their own sub-bands. Next, under the teacher models' guidance, we
8393 train a general sub-band enhancement model (student model) that works for all
8394 sub-bands. Without increasing the number of model parameters and computational
8395 complexity, the student model's performance is further improved. To evaluate
8396 our proposed method, we conducted a large number of experiments on an
8397 open-source data set. The final experimental results show that the guidance
8398 from the elite-level teacher models dramatically improves the student model's
8399 performance, which exceeds the full-band model by employing fewer parameters.
8400 </p>
8401 </description>
8402 <guid isPermaLink="false">oai:arXiv.org:2005.14435</guid>
8403 </item>
8404 <item>
8405 <title>SNR-Based Teachers-Student Technique for Speech Enhancement. (arXiv:2005.14441v2 [eess.AS] UPDATED)</title>
8406 <link>http://fr.arxiv.org/abs/2005.14441</link>
8407 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Hao_X/0/1/0/all/0/1">Xiang Hao</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Su_X/0/1/0/all/0/1">Xiangdong Su</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wang_Z/0/1/0/all/0/1">Zhiyu Wang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_Q/0/1/0/all/0/1">Qiang Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Xu_H/0/1/0/all/0/1">Huali Xu</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Gao_G/0/1/0/all/0/1">Guanglai Gao</a></p>
8408
8409 <p>It is very challenging for speech enhancement methods to achieves robust
8410 performance under both high signal-to-noise ratio (SNR) and low SNR
8411 simultaneously. In this paper, we propose a method that integrates an SNR-based
8412 teachers-student technique and time-domain U-Net to deal with this problem.
8413 Specifically, this method consists of multiple teacher models and a student
8414 model. We first train the teacher models under multiple small-range SNRs that
8415 do not coincide with each other so that they can perform speech enhancement
8416 well within the specific SNR range. Then, we choose different teacher models to
8417 supervise the training of the student model according to the SNR of the
8418 training data. Eventually, the student model can perform speech enhancement
8419 under both high SNR and low SNR. To evaluate the proposed method, we
8420 constructed a dataset with an SNR ranging from -20dB to 20dB based on the
8421 public dataset. We experimentally analyzed the effectiveness of the SNR-based
8422 teachers-student technique and compared the proposed method with several
8423 state-of-the-art methods.
8424 </p>
8425 </description>
8426 <guid isPermaLink="false">oai:arXiv.org:2005.14441</guid>
8427 </item>
8428 <item>
8429 <title>A mathematical model for automatic differentiation in machine learning. (arXiv:2006.02080v2 [cs.LG] UPDATED)</title>
8430 <link>http://fr.arxiv.org/abs/2006.02080</link>
8431 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bolte_J/0/1/0/all/0/1">Jerome Bolte</a> (TSE), <a href="http://fr.arxiv.org/find/cs/1/au:+Pauwels_E/0/1/0/all/0/1">Edouard Pauwels</a> (IRIT-ADRIA)</p>
8432
8433 <p>Automatic differentiation, as implemented today, does not have a simple
8434 mathematical model adapted to the needs of modern machine learning. In this
8435 work we articulate the relationships between differentiation of programs as
8436 implemented in practice and differentiation of nonsmooth functions. To this end
8437 we provide a simple class of functions, a nonsmooth calculus, and show how they
8438 apply to stochastic approximation methods. We also evidence the issue of
8439 artificial critical points created by algorithmic differentiation and show how
8440 usual methods avoid these points with probability one.
8441 </p>
8442 </description>
8443 <guid isPermaLink="false">oai:arXiv.org:2006.02080</guid>
8444 </item>
8445 <item>
8446 <title>Convolutional Neural Networks for Global Human Settlements Mapping from Sentinel-2 Satellite Imagery. (arXiv:2006.03267v2 [eess.IV] UPDATED)</title>
8447 <link>http://fr.arxiv.org/abs/2006.03267</link>
8448 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Corbane_C/0/1/0/all/0/1">Christina Corbane</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Syrris_V/0/1/0/all/0/1">Vasileios Syrris</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sabo_F/0/1/0/all/0/1">Filip Sabo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Politis_P/0/1/0/all/0/1">Panagiotis Politis</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Melchiorri_M/0/1/0/all/0/1">Michele Melchiorri</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Pesaresi_M/0/1/0/all/0/1">Martino Pesaresi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Soille_P/0/1/0/all/0/1">Pierre Soille</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kemper_T/0/1/0/all/0/1">Thomas Kemper</a></p>
8449
8450 <p>Spatially consistent and up-to-date maps of human settlements are crucial for
8451 addressing policies related to urbanization and sustainability, especially in
8452 the era of an increasingly urbanized world.The availability of open and free
8453 Sentinel-2 data of the Copernicus Earth Observation program offers a new
8454 opportunity for wall-to-wall mapping of human settlements at a global
8455 scale.This paper presents a deep-learning-based framework for a fully automated
8456 extraction of built-up areas at a spatial resolution of 10 m from a global
8457 composite of Sentinel-2 imagery.A multi-neuro modeling methodology building on
8458 a simple Convolution Neural Networks architecture for pixel-wise image
8459 classification of built-up areas is developed.The core features of the proposed
8460 model are the image patch of size 5 x 5 pixels adequate for describing built-up
8461 areas from Sentinel-2 imagery and the lightweight topology with a total number
8462 of 1,448,578 trainable parameters and 4 2D convolutional layers and 2 flattened
8463 layers.The deployment of the model on the global Sentinel-2 image composite
8464 provides the most detailed and complete map reporting about built-up areas for
8465 reference year 2018. The validation of the results with an independent
8466 reference data-set of building footprints covering 277 sites across the world
8467 establishes the reliability of the built-up layer produced by the proposed
8468 framework and the model robustness.
8469 </p>
8470 </description>
8471 <guid isPermaLink="false">oai:arXiv.org:2006.03267</guid>
8472 </item>
8473 <item>
8474 <title>3D Self-Supervised Methods for Medical Imaging. (arXiv:2006.03829v2 [cs.CV] UPDATED)</title>
8475 <link>http://fr.arxiv.org/abs/2006.03829</link>
8476 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Taleb_A/0/1/0/all/0/1">Aiham Taleb</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Loetzsch_W/0/1/0/all/0/1">Winfried Loetzsch</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Danz_N/0/1/0/all/0/1">Noel Danz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Severin_J/0/1/0/all/0/1">Julius Severin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gaertner_T/0/1/0/all/0/1">Thomas Gaertner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bergner_B/0/1/0/all/0/1">Benjamin Bergner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lippert_C/0/1/0/all/0/1">Christoph Lippert</a></p>
8477
8478 <p>Self-supervised learning methods have witnessed a recent surge of interest
8479 after proving successful in multiple application fields. In this work, we
8480 leverage these techniques, and we propose 3D versions for five different
8481 self-supervised methods, in the form of proxy tasks. Our methods facilitate
8482 neural network feature learning from unlabeled 3D images, aiming to reduce the
8483 required cost for expert annotation. The developed algorithms are 3D
8484 Contrastive Predictive Coding, 3D Rotation prediction, 3D Jigsaw puzzles,
8485 Relative 3D patch location, and 3D Exemplar networks. Our experiments show that
8486 pretraining models with our 3D tasks yields more powerful semantic
8487 representations, and enables solving downstream tasks more accurately and
8488 efficiently, compared to training the models from scratch and to pretraining
8489 them on 2D slices. We demonstrate the effectiveness of our methods on three
8490 downstream tasks from the medical imaging domain: i) Brain Tumor Segmentation
8491 from 3D MRI, ii) Pancreas Tumor Segmentation from 3D CT, and iii) Diabetic
8492 Retinopathy Detection from 2D Fundus images. In each task, we assess the gains
8493 in data-efficiency, performance, and speed of convergence. Interestingly, we
8494 also find gains when transferring the learned representations, by our methods,
8495 from a large unlabeled 3D corpus to a small downstream-specific dataset. We
8496 achieve results competitive to state-of-the-art solutions at a fraction of the
8497 computational expense. We publish our implementations for the developed
8498 algorithms (both 3D and 2D versions) as an open-source library, in an effort to
8499 allow other researchers to apply and extend our methods on their datasets.
8500 </p>
8501 </description>
8502 <guid isPermaLink="false">oai:arXiv.org:2006.03829</guid>
8503 </item>
8504 <item>
8505 <title>Truthful Data Acquisition via Peer Prediction. (arXiv:2006.03992v2 [cs.GT] UPDATED)</title>
8506 <link>http://fr.arxiv.org/abs/2006.03992</link>
8507 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1">Yiling Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_Y/0/1/0/all/0/1">Yiheng Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zheng_S/0/1/0/all/0/1">Shuran Zheng</a></p>
8508
8509 <p>We consider the problem of purchasing data for machine learning or
8510 statistical estimation. The data analyst has a budget to purchase datasets from
8511 multiple data providers. She does not have any test data that can be used to
8512 evaluate the collected data and can assign payments to data providers solely
8513 based on the collected datasets. We consider the problem in the standard
8514 Bayesian paradigm and in two settings: (1) data are only collected once; (2)
8515 data are collected repeatedly and each day's data are drawn independently from
8516 the same distribution. For both settings, our mechanisms guarantee that
8517 truthfully reporting one's dataset is always an equilibrium by adopting
8518 techniques from peer prediction: pay each provider the mutual information
8519 between his reported data and other providers' reported data. Depending on the
8520 data distribution, the mechanisms can also discourage misreports that would
8521 lead to inaccurate predictions. Our mechanisms also guarantee individual
8522 rationality and budget feasibility for certain underlying distributions in the
8523 first setting and for all distributions in the second setting.
8524 </p>
8525 </description>
8526 <guid isPermaLink="false">oai:arXiv.org:2006.03992</guid>
8527 </item>
8528 <item>
8529 <title>Self-consumption for energy communities in Spain: a regional analysis under the new legal framework. (arXiv:2006.06459v3 [eess.SY] UPDATED)</title>
8530 <link>http://fr.arxiv.org/abs/2006.06459</link>
8531 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Gallego_Castillo_C/0/1/0/all/0/1">Cristobal Gallego-Castillo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Heleno_M/0/1/0/all/0/1">Miguel Heleno</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Victoria_M/0/1/0/all/0/1">Marta Victoria</a></p>
8532
8533 <p>European climate polices acknowledge the role that energy communities can
8534 play in the energy transition. Self-consumption installations shared among
8535 those living in the same building are a good example of such energy
8536 communities. In this work, we perform a regional analysis of optimal
8537 self-consumption installations under the new legal framework recently passed in
8538 Spain. Results show that the optimal sizing of the installation leads to
8539 economic savings for self-consumers in all the territory, for both options with
8540 and without remuneration for energy surplus. A sensitivity analysis on
8541 technology costs revealed that batteries still require noticeably cost
8542 reductions to be cost-effective in a behind the meter self-consumption
8543 environment. In addition, solar compensation mechanisms make batteries less
8544 attractive in a scenario of low PV costs, since feeding PV surplus into the
8545 grid, yet less efficient, becomes more cost-effective. An improvement for the
8546 current energy surplus remuneration policy was proposed and analysed. It
8547 consists in the inclusion of the economic value of the avoided power losses in
8548 the remuneration.
8549 </p>
8550 </description>
8551 <guid isPermaLink="false">oai:arXiv.org:2006.06459</guid>
8552 </item>
8553 <item>
8554 <title>Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Prediction. (arXiv:2006.06648v3 [cs.LG] UPDATED)</title>
8555 <link>http://fr.arxiv.org/abs/2006.06648</link>
8556 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Baek_J/0/1/0/all/0/1">Jinheon Baek</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_D/0/1/0/all/0/1">Dong Bok Lee</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hwang_S/0/1/0/all/0/1">Sung Ju Hwang</a></p>
8557
8558 <p>Many practical graph problems, such as knowledge graph construction and
8559 drug-drug interaction prediction, require to handle multi-relational graphs.
8560 However, handling real-world multi-relational graphs with Graph Neural Networks
8561 (GNNs) is often challenging due to their evolving nature, as new entities
8562 (nodes) can emerge over time. Moreover, newly emerged entities often have few
8563 links, which makes the learning even more difficult. Motivated by this
8564 challenge, we introduce a realistic problem of few-shot out-of-graph link
8565 prediction, where we not only predict the links between the seen and unseen
8566 nodes as in a conventional out-of-knowledge link prediction task but also
8567 between the unseen nodes, with only few edges per node. We tackle this problem
8568 with a novel transductive meta-learning framework which we refer to as Graph
8569 Extrapolation Networks (GEN). GEN meta-learns both the node embedding network
8570 for inductive inference (seen-to-unseen) and the link prediction network for
8571 transductive inference (unseen-to-unseen). For transductive link prediction, we
8572 further propose a stochastic embedding layer to model uncertainty in the link
8573 prediction between unseen entities. We validate our model on multiple benchmark
8574 datasets for knowledge graph completion and drug-drug interaction prediction.
8575 The results show that our model significantly outperforms relevant baselines
8576 for out-of-graph link prediction tasks.
8577 </p>
8578 </description>
8579 <guid isPermaLink="false">oai:arXiv.org:2006.06648</guid>
8580 </item>
8581 <item>
8582 <title>Frontiers in Mortar Methods for Isogeometric Analysis. (arXiv:2006.06677v3 [cs.CE] UPDATED)</title>
8583 <link>http://fr.arxiv.org/abs/2006.06677</link>
8584 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hesch_C/0/1/0/all/0/1">Christian Hesch</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Khristenko_U/0/1/0/all/0/1">Ustim Khristenko</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Krause_R/0/1/0/all/0/1">Rolf Krause</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Popp_A/0/1/0/all/0/1">Alexander Popp</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Seitz_A/0/1/0/all/0/1">Alexander Seitz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wall_W/0/1/0/all/0/1">Wolfgang Wall</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wohlmuth_B/0/1/0/all/0/1">Barbara Wohlmuth</a></p>
8585
8586 <p>Complex geometries as common in industrial applications consist of multiple
8587 patches, if spline based parametrizations are used. The requirements for the
8588 generation of analysis-suitable models are increasing dramatically since
8589 isogeometric analysis is directly based on the spline parametrization and
8590 nowadays used for the calculation of higher-order partial differential
8591 equations. The computational, or more general, the engineering analysis
8592 necessitates suitable coupling techniques between the different patches. Mortar
8593 methods have been successfully applied for coupling of patches and for contact
8594 mechanics in recent years to resolve the arising issues within the interface.
8595 We present here current achievements in the design of mortar technologies in
8596 isogeometric analysis within the Priority Program SPP 1748, Reliable Simulation
8597 Techniques in Solid Mechanics. Development of Non-standard Discretisation
8598 Methods, Mechanical and Mathematical Analysis.
8599 </p>
8600 </description>
8601 <guid isPermaLink="false">oai:arXiv.org:2006.06677</guid>
8602 </item>
8603 <item>
8604 <title>Sparse and Continuous Attention Mechanisms. (arXiv:2006.07214v3 [cs.LG] UPDATED)</title>
8605 <link>http://fr.arxiv.org/abs/2006.07214</link>
8606 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Martins_A/0/1/0/all/0/1">Andr&#xe9; F. T. Martins</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Farinhas_A/0/1/0/all/0/1">Ant&#xf3;nio Farinhas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Treviso_M/0/1/0/all/0/1">Marcos Treviso</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Niculae_V/0/1/0/all/0/1">Vlad Niculae</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Aguiar_P/0/1/0/all/0/1">Pedro M. Q. Aguiar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Figueiredo_M/0/1/0/all/0/1">M&#xe1;rio A. T. Figueiredo</a></p>
8607
8608 <p>Exponential families are widely used in machine learning; they include many
8609 distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet,
8610 Poisson, and categorical distributions via the softmax transformation).
8611 Distributions in each of these families have fixed support. In contrast, for
8612 finite domains, there has been recent work on sparse alternatives to softmax
8613 (e.g. sparsemax and alpha-entmax), which have varying support, being able to
8614 assign zero probability to irrelevant categories. This paper expands that work
8615 in two directions: first, we extend alpha-entmax to continuous domains,
8616 revealing a link with Tsallis statistics and deformed exponential families.
8617 Second, we introduce continuous-domain attention mechanisms, deriving efficient
8618 gradient backpropagation algorithms for alpha in {1,2}. Experiments on
8619 attention-based text classification, machine translation, and visual question
8620 answering illustrate the use of continuous attention in 1D and 2D, showing that
8621 it allows attending to time intervals and compact regions.
8622 </p>
8623 </description>
8624 <guid isPermaLink="false">oai:arXiv.org:2006.07214</guid>
8625 </item>
8626 <item>
8627 <title>Neural Estimators for Conditional Mutual Information Using Nearest Neighbors Sampling. (arXiv:2006.07225v2 [cs.IT] UPDATED)</title>
8628 <link>http://fr.arxiv.org/abs/2006.07225</link>
8629 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Molavipour_S/0/1/0/all/0/1">Sina Molavipour</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bassi_G/0/1/0/all/0/1">Germ&#xe1;n Bassi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Skoglund_M/0/1/0/all/0/1">Mikael Skoglund</a></p>
8630
8631 <p>The estimation of mutual information (MI) or conditional mutual information
8632 (CMI) from a set of samples is a long-standing problem. A recent line of work
8633 in this area has leveraged the approximation power of artificial neural
8634 networks and has shown improvements over conventional methods. One important
8635 challenge in this new approach is the need to obtain, given the original
8636 dataset, a different set where the samples are distributed according to a
8637 specific product density function. This is particularly challenging when
8638 estimating CMI.
8639 </p>
8640 <p>In this paper, we introduce a new technique, based on k nearest neighbors
8641 (k-NN), to perform the resampling and derive high-confidence concentration
8642 bounds for the sample average. Then the technique is employed to train a neural
8643 network classifier and the CMI is estimated accordingly. We propose three
8644 estimators using this technique and prove their consistency, make a comparison
8645 between them and similar approaches in the literature, and experimentally show
8646 improvements in estimating the CMI in terms of accuracy and variance of the
8647 estimators.
8648 </p>
8649 </description>
8650 <guid isPermaLink="false">oai:arXiv.org:2006.07225</guid>
8651 </item>
8652 <item>
8653 <title>Learning Latent Space Energy-Based Prior Model. (arXiv:2006.08205v2 [stat.ML] UPDATED)</title>
8654 <link>http://fr.arxiv.org/abs/2006.08205</link>
8655 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Pang_B/0/1/0/all/0/1">Bo Pang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Han_T/0/1/0/all/0/1">Tian Han</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Nijkamp_E/0/1/0/all/0/1">Erik Nijkamp</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Zhu_S/0/1/0/all/0/1">Song-Chun Zhu</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wu_Y/0/1/0/all/0/1">Ying Nian Wu</a></p>
8656
8657 <p>We propose to learn energy-based model (EBM) in the latent space of a
8658 generator model, so that the EBM serves as a prior model that stands on the
8659 top-down network of the generator model. Both the latent space EBM and the
8660 top-down network can be learned jointly by maximum likelihood, which involves
8661 short-run MCMC sampling from both the prior and posterior distributions of the
8662 latent vector. Due to the low dimensionality of the latent space and the
8663 expressiveness of the top-down network, a simple EBM in latent space can
8664 capture regularities in the data effectively, and MCMC sampling in latent space
8665 is efficient and mixes well. We show that the learned model exhibits strong
8666 performances in terms of image and text generation and anomaly detection. The
8667 one-page code can be found in supplementary materials.
8668 </p>
8669 </description>
8670 <guid isPermaLink="false">oai:arXiv.org:2006.08205</guid>
8671 </item>
8672 <item>
8673 <title>Iterative regularization for convex regularizers. (arXiv:2006.09859v2 [stat.ML] UPDATED)</title>
8674 <link>http://fr.arxiv.org/abs/2006.09859</link>
8675 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Molinari_C/0/1/0/all/0/1">Cesare Molinari</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Massias_M/0/1/0/all/0/1">Mathurin Massias</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Rosasco_L/0/1/0/all/0/1">Lorenzo Rosasco</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Villa_S/0/1/0/all/0/1">Silvia Villa</a></p>
8676
8677 <p>We study iterative regularization for linear models, when the bias is convex
8678 but not necessarily strongly convex. We characterize the stability properties
8679 of a primal-dual gradient based approach, analyzing its convergence in the
8680 presence of worst case deterministic noise. As a main example, we specialize
8681 and illustrate the results for the problem of robust sparse recovery. Key to
8682 our analysis is a combination of ideas from regularization theory and
8683 optimization in the presence of errors. Theoretical results are complemented by
8684 experiments showing that state-of-the-art performances can be achieved with
8685 considerable computational speed-ups.
8686 </p>
8687 </description>
8688 <guid isPermaLink="false">oai:arXiv.org:2006.09859</guid>
8689 </item>
8690 <item>
8691 <title>Socially Fair k-Means Clustering. (arXiv:2006.10085v2 [cs.LG] UPDATED)</title>
8692 <link>http://fr.arxiv.org/abs/2006.10085</link>
8693 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ghadiri_M/0/1/0/all/0/1">Mehrdad Ghadiri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Samadi_S/0/1/0/all/0/1">Samira Samadi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vempala_S/0/1/0/all/0/1">Santosh Vempala</a></p>
8694
8695 <p>We show that the popular k-means clustering algorithm (Lloyd's heuristic),
8696 used for a variety of scientific data, can result in outcomes that are
8697 unfavorable to subgroups of data (e.g., demographic groups). Such biased
8698 clusterings can have deleterious implications for human-centric applications
8699 such as resource allocation. We present a fair k-means objective and algorithm
8700 to choose cluster centers that provide equitable costs for different groups.
8701 The algorithm, Fair-Lloyd, is a modification of Lloyd's heuristic for k-means,
8702 inheriting its simplicity, efficiency, and stability. In comparison with
8703 standard Lloyd's, we find that on benchmark datasets, Fair-Lloyd exhibits
8704 unbiased performance by ensuring that all groups have equal costs in the output
8705 k-clustering, while incurring a negligible increase in running time, thus
8706 making it a viable fair option wherever k-means is currently used.
8707 </p>
8708 </description>
8709 <guid isPermaLink="false">oai:arXiv.org:2006.10085</guid>
8710 </item>
8711 <item>
8712 <title>Neutralizing Self-Selection Bias in Sampling for Sortition. (arXiv:2006.10498v2 [cs.GT] UPDATED)</title>
8713 <link>http://fr.arxiv.org/abs/2006.10498</link>
8714 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Flanigan_B/0/1/0/all/0/1">Bailey Flanigan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Golz_P/0/1/0/all/0/1">Paul G&#xf6;lz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Gupta_A/0/1/0/all/0/1">Anupam Gupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Procaccia_A/0/1/0/all/0/1">Ariel Procaccia</a></p>
8715
8716 <p>Sortition is a political system in which decisions are made by panels of
8717 randomly selected citizens. The process for selecting a sortition panel is
8718 traditionally thought of as uniform sampling without replacement, which has
8719 strong fairness properties. In practice, however, sampling without replacement
8720 is not possible since only a fraction of agents is willing to participate in a
8721 panel when invited, and different demographic groups participate at different
8722 rates. In order to still produce panels whose composition resembles that of the
8723 population, we develop a sampling algorithm that restores close-to-equal
8724 representation probabilities for all agents while satisfying meaningful
8725 demographic quotas. As part of its input, our algorithm requires probabilities
8726 indicating how likely each volunteer in the pool was to participate. Since
8727 these participation probabilities are not directly observable, we show how to
8728 learn them, and demonstrate our approach using data on a real sortition panel
8729 combined with information on the general population in the form of publicly
8730 available survey data.
8731 </p>
8732 </description>
8733 <guid isPermaLink="false">oai:arXiv.org:2006.10498</guid>
8734 </item>
8735 <item>
8736 <title>ContraGAN: Contrastive Learning for Conditional Image Generation. (arXiv:2006.12681v2 [cs.CV] UPDATED)</title>
8737 <link>http://fr.arxiv.org/abs/2006.12681</link>
8738 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kang_M/0/1/0/all/0/1">Minguk Kang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Park_J/0/1/0/all/0/1">Jaesik Park</a></p>
8739
8740 <p>Conditional image generation is the task of generating diverse images using
8741 class label information. Although many conditional Generative Adversarial
8742 Networks (GAN) have shown realistic results, such methods consider pairwise
8743 relations between the embedding of an image and the embedding of the
8744 corresponding label (data-to-class relations) as the conditioning losses. In
8745 this paper, we propose ContraGAN that considers relations between multiple
8746 image embeddings in the same batch (data-to-data relations) as well as the
8747 data-to-class relations by using a conditional contrastive loss. The
8748 discriminator of ContraGAN discriminates the authenticity of given samples and
8749 minimizes a contrastive objective to learn the relations between training
8750 images. Simultaneously, the generator tries to generate realistic images that
8751 deceive the authenticity and have a low contrastive loss. The experimental
8752 results show that ContraGAN outperforms state-of-the-art-models by 7.3% and
8753 7.7% on Tiny ImageNet and ImageNet datasets, respectively. Besides, we
8754 experimentally demonstrate that ContraGAN helps to relieve the overfitting of
8755 the discriminator. For a fair comparison, we re-implement twelve
8756 state-of-the-art GANs using the PyTorch library. The software package is
8757 available at https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.
8758 </p>
8759 </description>
8760 <guid isPermaLink="false">oai:arXiv.org:2006.12681</guid>
8761 </item>
8762 <item>
8763 <title>Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization. (arXiv:2006.13258v2 [cs.LG] UPDATED)</title>
8764 <link>http://fr.arxiv.org/abs/2006.13258</link>
8765 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Barde_P/0/1/0/all/0/1">Paul Barde</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Roy_J/0/1/0/all/0/1">Julien Roy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jeon_W/0/1/0/all/0/1">Wonseok Jeon</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pineau_J/0/1/0/all/0/1">Joelle Pineau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pal_C/0/1/0/all/0/1">Christopher Pal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nowrouzezahrai_D/0/1/0/all/0/1">Derek Nowrouzezahrai</a></p>
8766
8767 <p>Adversarial Imitation Learning alternates between learning a discriminator --
8768 which tells apart expert's demonstrations from generated ones -- and a
8769 generator's policy to produce trajectories that can fool this discriminator.
8770 This alternated optimization is known to be delicate in practice since it
8771 compounds unstable adversarial training with brittle and sample-inefficient
8772 reinforcement learning. We propose to remove the burden of the policy
8773 optimization steps by leveraging a novel discriminator formulation.
8774 Specifically, our discriminator is explicitly conditioned on two policies: the
8775 one from the previous generator's iteration and a learnable policy. When
8776 optimized, this discriminator directly learns the optimal generator's policy.
8777 Consequently, our discriminator's update solves the generator's optimization
8778 problem for free: learning a policy that imitates the expert does not require
8779 an additional optimization loop. This formulation effectively cuts by half the
8780 implementation and computational burden of Adversarial Imitation Learning
8781 algorithms by removing the Reinforcement Learning phase altogether. We show on
8782 a variety of tasks that our simpler approach is competitive to prevalent
8783 Imitation Learning methods.
8784 </p>
8785 </description>
8786 <guid isPermaLink="false">oai:arXiv.org:2006.13258</guid>
8787 </item>
8788 <item>
8789 <title>Relative Deviation Margin Bounds. (arXiv:2006.14950v2 [cs.LG] UPDATED)</title>
8790 <link>http://fr.arxiv.org/abs/2006.14950</link>
8791 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Cortes_C/0/1/0/all/0/1">Corinna Cortes</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohri_M/0/1/0/all/0/1">Mehryar Mohri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Suresh_A/0/1/0/all/0/1">Ananda Theertha Suresh</a></p>
8792
8793 <p>We present a series of new and more favorable margin-based learning
8794 guarantees that depend on the empirical margin loss of a predictor. We give two
8795 types of learning bounds, both distribution-dependent and valid for general
8796 families, in terms of the Rademacher complexity or the empirical $\ell_\infty$
8797 covering number of the hypothesis set used. Furthermore, using our relative
8798 deviation margin bounds, we derive distribution-dependent generalization bounds
8799 for unbounded loss functions under the assumption of a finite moment. We also
8800 briefly highlight several applications of these bounds and discuss their
8801 connection with existing results.
8802 </p>
8803 </description>
8804 <guid isPermaLink="false">oai:arXiv.org:2006.14950</guid>
8805 </item>
8806 <item>
8807 <title>Weighted hypersoft configuration model. (arXiv:2007.00124v2 [physics.soc-ph] UPDATED)</title>
8808 <link>http://fr.arxiv.org/abs/2007.00124</link>
8809 <description><p>Authors: <a href="http://fr.arxiv.org/find/physics/1/au:+Voitalov_I/0/1/0/all/0/1">Ivan Voitalov</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Hoorn_P/0/1/0/all/0/1">Pim van der Hoorn</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Kitsak_M/0/1/0/all/0/1">Maksim Kitsak</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Papadopoulos_F/0/1/0/all/0/1">Fragkiskos Papadopoulos</a>, <a href="http://fr.arxiv.org/find/physics/1/au:+Krioukov_D/0/1/0/all/0/1">Dmitri Krioukov</a></p>
8810
8811 <p>Maximum entropy null models of networks come in different flavors that depend
8812 on the type of constraints under which entropy is maximized. If the constraints
8813 are on degree sequences or distributions, we are dealing with configuration
8814 models. If the degree sequence is constrained exactly, the corresponding
8815 microcanonical ensemble of random graphs with a given degree sequence is the
8816 configuration model per se. If the degree sequence is constrained only on
8817 average, the corresponding grand-canonical ensemble of random graphs with a
8818 given expected degree sequence is the soft configuration model. If the degree
8819 sequence is not fixed at all but randomly drawn from a fixed distribution, the
8820 corresponding hypercanonical ensemble of random graphs with a given degree
8821 distribution is the hypersoft configuration model, a more adequate description
8822 of dynamic real-world networks in which degree sequences are never fixed but
8823 degree distributions often stay stable. Here, we introduce the hypersoft
8824 configuration model of weighted networks. The main contribution is a particular
8825 version of the model with power-law degree and strength distributions, and
8826 superlinear scaling of strengths with degrees, mimicking the properties of some
8827 real-world networks. As a byproduct, we generalize the notions of sparse
8828 graphons and their entropy to weighted networks.
8829 </p>
8830 </description>
8831 <guid isPermaLink="false">oai:arXiv.org:2007.00124</guid>
8832 </item>
8833 <item>
8834 <title>Robustness against Relational Adversary. (arXiv:2007.00772v2 [cs.LG] UPDATED)</title>
8835 <link>http://fr.arxiv.org/abs/2007.00772</link>
8836 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yizhen Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Meng_X/0/1/0/all/0/1">Xiaozhu Meng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_K/0/1/0/all/0/1">Ke Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Christodorescu_M/0/1/0/all/0/1">Mihai Christodorescu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jha_S/0/1/0/all/0/1">Somesh Jha</a></p>
8837
8838 <p>Test-time adversarial attacks have posed serious challenges to the robustness
8839 of machine-learning models, and in many settings the adversarial perturbation
8840 need not be bounded by small $\ell_p$-norms. Motivated by the
8841 semantics-preserving attacks in vision and security domain, we investigate
8842 $\textit{relational adversaries}$, a broad class of attackers who create
8843 adversarial examples that are in a reflexive-transitive closure of a logical
8844 relation. We analyze the conditions for robustness and propose
8845 $\textit{normalize-and-predict}$ -- a learning framework with provable
8846 robustness guarantee. We compare our approach with adversarial training and
8847 derive an unified framework that provides benefits of both approaches. Guided
8848 by our theoretical findings, we apply our framework to image classification and
8849 malware detection. Results of both tasks show that attacks using relational
8850 adversaries frequently fool existing models, but our unified framework can
8851 significantly enhance their robustness.
8852 </p>
8853 </description>
8854 <guid isPermaLink="false">oai:arXiv.org:2007.00772</guid>
8855 </item>
8856 <item>
8857 <title>Information Theoretic Lower Bounds for Feed-Forward Fully-Connected Deep Networks. (arXiv:2007.00796v2 [stat.ML] UPDATED)</title>
8858 <link>http://fr.arxiv.org/abs/2007.00796</link>
8859 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Yang_X/0/1/0/all/0/1">Xiaochen Yang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Honorio_J/0/1/0/all/0/1">Jean Honorio</a></p>
8860
8861 <p>In this paper, we study the sample complexity lower bounds for the exact
8862 recovery of parameters and for a positive excess risk of a feed-forward,
8863 fully-connected neural network for binary classification, using
8864 information-theoretic tools. We prove these lower bounds by the existence of a
8865 generative network characterized by a backwards data generating process, where
8866 the input is generated based on the binary output, and the network is
8867 parametrized by weight parameters for the hidden layers. The sample complexity
8868 lower bound for the exact recovery of parameters is $\Omega(d r \log(r) + p )$
8869 and for a positive excess risk is $\Omega(r \log(r) + p )$, where $p$ is the
8870 dimension of the input, $r$ reflects the rank of the weight matrices and $d$ is
8871 the number of hidden layers. To the best of our knowledge, our results are the
8872 first information theoretic lower bounds.
8873 </p>
8874 </description>
8875 <guid isPermaLink="false">oai:arXiv.org:2007.00796</guid>
8876 </item>
8877 <item>
8878 <title>Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning. (arXiv:2007.01293v2 [cs.LG] UPDATED)</title>
8879 <link>http://fr.arxiv.org/abs/2007.01293</link>
8880 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_Z/0/1/0/all/0/1">Zhongzheng Ren</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yeh_R/0/1/0/all/0/1">Raymond A. Yeh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schwing_A/0/1/0/all/0/1">Alexander G. Schwing</a></p>
8881
8882 <p>Existing semi-supervised learning (SSL) algorithms use a single weight to
8883 balance the loss of labeled and unlabeled examples, i.e., all unlabeled
8884 examples are equally weighted. But not all unlabeled data are equal. In this
8885 paper we study how to use a different weight for every unlabeled example.
8886 Manual tuning of all those weights -- as done in prior work -- is no longer
8887 possible. Instead, we adjust those weights via an algorithm based on the
8888 influence function, a measure of a model's dependency on one training example.
8889 To make the approach efficient, we propose a fast and effective approximation
8890 of the influence function. We demonstrate that this technique outperforms
8891 state-of-the-art methods on semi-supervised image and language classification
8892 tasks.
8893 </p>
8894 </description>
8895 <guid isPermaLink="false">oai:arXiv.org:2007.01293</guid>
8896 </item>
8897 <item>
8898 <title>A Framework for Modelling, Verification and Transformation of Concurrent Imperative Programs. (arXiv:2007.02261v2 [cs.LO] UPDATED)</title>
8899 <link>http://fr.arxiv.org/abs/2007.02261</link>
8900 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bortin_M/0/1/0/all/0/1">Maksym Bortin</a></p>
8901
8902 <p>The paper gives a comprehensive presentation of a framework, embedded into
8903 the simply typed higher-order logic, and aimed at providing a sound assistance
8904 in formal reasoning about models of imperative programs with interleaved
8905 computations. As a case study, a model of the Peterson's mutual exclusion
8906 algorithm will be scrutinised in the course of the paper illustrating
8907 applicability of the framework.
8908 </p>
8909 </description>
8910 <guid isPermaLink="false">oai:arXiv.org:2007.02261</guid>
8911 </item>
8912 <item>
8913 <title>Self-Supervised Graph Transformer on Large-Scale Molecular Data. (arXiv:2007.02835v2 [q-bio.BM] UPDATED)</title>
8914 <link>http://fr.arxiv.org/abs/2007.02835</link>
8915 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-bio/1/au:+Rong_Y/0/1/0/all/0/1">Yu Rong</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Bian_Y/0/1/0/all/0/1">Yatao Bian</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Xu_T/0/1/0/all/0/1">Tingyang Xu</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Xie_W/0/1/0/all/0/1">Weiyang Xie</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Wei_Y/0/1/0/all/0/1">Ying Wei</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Huang_W/0/1/0/all/0/1">Wenbing Huang</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Huang_J/0/1/0/all/0/1">Junzhou Huang</a></p>
8916
8917 <p>How to obtain informative representations of molecules is a crucial
8918 prerequisite in AI-driven drug design and discovery. Recent researches abstract
8919 molecules as graphs and employ Graph Neural Networks (GNNs) for molecular
8920 representation learning. Nevertheless, two issues impede the usage of GNNs in
8921 real scenarios: (1) insufficient labeled molecules for supervised training; (2)
8922 poor generalization capability to new-synthesized molecules. To address them
8923 both, we propose a novel framework, GROVER, which stands for Graph
8924 Representation frOm self-superVised mEssage passing tRansformer. With carefully
8925 designed self-supervised tasks in node-, edge- and graph-level, GROVER can
8926 learn rich structural and semantic information of molecules from enormous
8927 unlabelled molecular data. Rather, to encode such complex information, GROVER
8928 integrates Message Passing Networks into the Transformer-style architecture to
8929 deliver a class of more expressive encoders of molecules. The flexibility of
8930 GROVER allows it to be trained efficiently on large-scale molecular dataset
8931 without requiring any supervision, thus being immunized to the two issues
8932 mentioned above. We pre-train GROVER with 100 million parameters on 10 million
8933 unlabelled molecules -- the biggest GNN and the largest training dataset in
8934 molecular representation learning. We then leverage the pre-trained GROVER for
8935 molecular property prediction followed by task-specific fine-tuning, where we
8936 observe a huge improvement (more than 6% on average) from current
8937 state-of-the-art methods on 11 challenging benchmarks. The insights we gained
8938 are that well-designed self-supervision losses and largely-expressive
8939 pre-trained models enjoy the significant potential on performance boosting.
8940 </p>
8941 </description>
8942 <guid isPermaLink="false">oai:arXiv.org:2007.02835</guid>
8943 </item>
8944 <item>
8945 <title>BoxE: A Box Embedding Model for Knowledge Base Completion. (arXiv:2007.06267v2 [cs.AI] UPDATED)</title>
8946 <link>http://fr.arxiv.org/abs/2007.06267</link>
8947 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Abboud_R/0/1/0/all/0/1">Ralph Abboud</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ceylan_I/0/1/0/all/0/1">&#x130;smail &#x130;lkan Ceylan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lukasiewicz_T/0/1/0/all/0/1">Thomas Lukasiewicz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Salvatori_T/0/1/0/all/0/1">Tommaso Salvatori</a></p>
8948
8949 <p>Knowledge base completion (KBC) aims to automatically infer missing facts by
8950 exploiting information already present in a knowledge base (KB). A promising
8951 approach for KBC is to embed knowledge into latent spaces and make predictions
8952 from learned embeddings. However, existing embedding models are subject to at
8953 least one of the following limitations: (1) theoretical inexpressivity, (2)
8954 lack of support for prominent inference patterns (e.g., hierarchies), (3) lack
8955 of support for KBC over higher-arity relations, and (4) lack of support for
8956 incorporating logical rules. Here, we propose a spatio-translational embedding
8957 model, called BoxE, that simultaneously addresses all these limitations. BoxE
8958 embeds entities as points, and relations as a set of hyper-rectangles (or
8959 boxes), which spatially characterize basic logical properties. This seemingly
8960 simple abstraction yields a fully expressive model offering a natural encoding
8961 for many desired logical properties. BoxE can both capture and inject rules
8962 from rich classes of rule languages, going well beyond individual inference
8963 patterns. By design, BoxE naturally applies to higher-arity KBs. We conduct a
8964 detailed experimental analysis, and show that BoxE achieves state-of-the-art
8965 performance, both on benchmark knowledge graphs and on more general KBs, and we
8966 empirically show the power of integrating logical rules.
8967 </p>
8968 </description>
8969 <guid isPermaLink="false">oai:arXiv.org:2007.06267</guid>
8970 </item>
8971 <item>
8972 <title>RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning. (arXiv:2007.06271v2 [cs.CV] UPDATED)</title>
8973 <link>http://fr.arxiv.org/abs/2007.06271</link>
8974 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chiaro_R/0/1/0/all/0/1">Riccardo Del Chiaro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Twardowski_B/0/1/0/all/0/1">Bart&#x142;omiej Twardowski</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bagdanov_A/0/1/0/all/0/1">Andrew D. Bagdanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weijer_J/0/1/0/all/0/1">Joost van de Weijer</a></p>
8975
8976 <p>Research on continual learning has led to a variety of approaches to
8977 mitigating catastrophic forgetting in feed-forward classification networks.
8978 Until now surprisingly little attention has been focused on continual learning
8979 of recurrent models applied to problems like image captioning. In this paper we
8980 take a systematic look at continual learning of LSTM-based models for image
8981 captioning. We propose an attention-based approach that explicitly accommodates
8982 the transient nature of vocabularies in continual image captioning tasks --
8983 i.e. that task vocabularies are not disjoint. We call our method Recurrent
8984 Attention to Transient Tasks (RATT), and also show how to adapt continual
8985 learning approaches based on weight egularization and knowledge distillation to
8986 recurrent continual learning problems. We apply our approaches to incremental
8987 image captioning problem on two new continual learning benchmarks we define
8988 using the MS-COCO and Flickr30 datasets. Our results demonstrate that RATT is
8989 able to sequentially learn five captioning tasks while incurring no forgetting
8990 of previously learned ones.
8991 </p>
8992 </description>
8993 <guid isPermaLink="false">oai:arXiv.org:2007.06271</guid>
8994 </item>
8995 <item>
8996 <title>Graph Neural Networks for Scalable Radio Resource Management: Architecture Design and Theoretical Analysis. (arXiv:2007.07632v2 [cs.IT] UPDATED)</title>
8997 <link>http://fr.arxiv.org/abs/2007.07632</link>
8998 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shen_Y/0/1/0/all/0/1">Yifei Shen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Shi_Y/0/1/0/all/0/1">Yuanming Shi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jun Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Letaief_K/0/1/0/all/0/1">Khaled B. Letaief</a></p>
8999
9000 <p>Deep learning has recently emerged as a disruptive technology to solve
9001 challenging radio resource management problems in wireless networks. However,
9002 the neural network architectures adopted by existing works suffer from poor
9003 scalability, generalization, and lack of interpretability. A long-standing
9004 approach to improve scalability and generalization is to incorporate the
9005 structures of the target task into the neural network architecture. In this
9006 paper, we propose to apply graph neural networks (GNNs) to solve large-scale
9007 radio resource management problems, supported by effective neural network
9008 architecture design and theoretical analysis. Specifically, we first
9009 demonstrate that radio resource management problems can be formulated as graph
9010 optimization problems that enjoy a universal permutation equivariance property.
9011 We then identify a class of neural networks, named \emph{message passing graph
9012 neural networks} (MPGNNs). It is demonstrated that they not only satisfy the
9013 permutation equivariance property, but also can generalize to large-scale
9014 problems while enjoying a high computational efficiency. For interpretablity
9015 and theoretical guarantees, we prove the equivalence between MPGNNs and a class
9016 of distributed optimization algorithms, which is then used to analyze the
9017 performance and generalization of MPGNN-based methods. Extensive simulations,
9018 with power control and beamforming as two examples, will demonstrate that the
9019 proposed method, trained in an unsupervised manner with unlabeled samples,
9020 matches or even outperforms classic optimization-based algorithms without
9021 domain-specific knowledge. Remarkably, the proposed method is highly scalable
9022 and can solve the beamforming problem in an interference channel with $1000$
9023 transceiver pairs within $6$ milliseconds on a single GPU.
9024 </p>
9025 </description>
9026 <guid isPermaLink="false">oai:arXiv.org:2007.07632</guid>
9027 </item>
9028 <item>
9029 <title>Temporal Pointwise Convolutional Networks for Length of Stay Prediction in the Intensive Care Unit. (arXiv:2007.09483v2 [cs.LG] UPDATED)</title>
9030 <link>http://fr.arxiv.org/abs/2007.09483</link>
9031 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rocheteau_E/0/1/0/all/0/1">Emma Rocheteau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lio_P/0/1/0/all/0/1">Pietro Li&#xf2;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hyland_S/0/1/0/all/0/1">Stephanie Hyland</a></p>
9032
9033 <p>The pressure of ever-increasing patient demand and budget restrictions make
9034 hospital bed management a daily challenge for clinical staff. Most critical is
9035 the efficient allocation of resource-heavy Intensive Care Unit (ICU) beds to
9036 the patients who need life support. Central to solving this problem is knowing
9037 for how long the current set of ICU patients are likely to stay in the unit. In
9038 this work, we propose a new deep learning model based on the combination of
9039 temporal convolution and pointwise (1x1) convolution, to solve the length of
9040 stay prediction task on the eICU critical care dataset. The model - which we
9041 refer to as Temporal Pointwise Convolution (TPC) - is specifically designed to
9042 mitigate for common challenges with Electronic Health Records, such as
9043 skewness, irregular sampling and missing data. In doing so, we have achieved
9044 significant performance benefits of 18-51% (metric dependent) over the commonly
9045 used Long-Short Term Memory (LSTM) network, and the multi-head self-attention
9046 network known as the Transformer.
9047 </p>
9048 </description>
9049 <guid isPermaLink="false">oai:arXiv.org:2007.09483</guid>
9050 </item>
9051 <item>
9052 <title>CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks. (arXiv:2007.10497v3 [cs.HC] UPDATED)</title>
9053 <link>http://fr.arxiv.org/abs/2007.10497</link>
9054 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hassantabar_S/0/1/0/all/0/1">Shayan Hassantabar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stefano_N/0/1/0/all/0/1">Novati Stefano</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ghanakota_V/0/1/0/all/0/1">Vishweshwar Ghanakota</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ferrari_A/0/1/0/all/0/1">Alessandra Ferrari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Nicola_G/0/1/0/all/0/1">Gregory N. Nicola</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bruno_R/0/1/0/all/0/1">Raffaele Bruno</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Marino_I/0/1/0/all/0/1">Ignazio R. Marino</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hamidouche_K/0/1/0/all/0/1">Kenza Hamidouche</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jha_N/0/1/0/all/0/1">Niraj K. Jha</a></p>
9055
9056 <p>The novel coronavirus (SARS-CoV-2) has led to a pandemic. The current testing
9057 regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2
9058 has been unable to keep up with testing demands, and also suffers from a
9059 relatively low positive detection rate in the early stages of the resultant
9060 COVID-19 disease. Hence, there is a need for an alternative approach for
9061 repeated large-scale testing of SARS-CoV-2/COVID-19. We propose a framework
9062 called CovidDeep that combines efficient DNNs with commercially available WMSs
9063 for pervasive testing of the virus. We collected data from 87 individuals,
9064 spanning three cohorts including healthy, asymptomatic, and symptomatic
9065 patients. We trained DNNs on various subsets of the features automatically
9066 extracted from six WMS and questionnaire categories to perform ablation studies
9067 to determine which subsets are most efficacious in terms of test accuracy for a
9068 three-way classification. The highest test accuracy obtained was 98.1%. We also
9069 augmented the real training dataset with a synthetic training dataset drawn
9070 from the same probability distribution to impose a prior on DNN weights and
9071 leveraged a grow-and-prune synthesis paradigm to learn both DNN architecture
9072 and weights. This boosted the accuracy of the various DNNs further and
9073 simultaneously reduced their size and floating-point operations.
9074 </p>
9075 </description>
9076 <guid isPermaLink="false">oai:arXiv.org:2007.10497</guid>
9077 </item>
9078 <item>
9079 <title>The Complete Lasso Tradeoff Diagram. (arXiv:2007.11078v4 [math.ST] UPDATED)</title>
9080 <link>http://fr.arxiv.org/abs/2007.11078</link>
9081 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Wang_H/0/1/0/all/0/1">Hua Wang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Yang_Y/0/1/0/all/0/1">Yachong Yang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Bu_Z/0/1/0/all/0/1">Zhiqi Bu</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Su_W/0/1/0/all/0/1">Weijie J. Su</a></p>
9082
9083 <p>A fundamental problem in the high-dimensional regression is to understand the
9084 tradeoff between type I and type II errors or, equivalently, false discovery
9085 rate (FDR) and power in variable selection. To address this important problem,
9086 we offer the first complete tradeoff diagram that distinguishes all pairs of
9087 FDR and power that can be asymptotically realized by the Lasso with some choice
9088 of its penalty parameter from the remaining pairs, in a regime of linear
9089 sparsity under random designs. The tradeoff between the FDR and power
9090 characterized by our diagram holds no matter how strong the signals are. In
9091 particular, our results improve on the earlier Lasso tradeoff diagram of
9092 <a href="/abs/1511.01957">arXiv:1511.01957</a> by recognizing two simple but fundamental constraints on the
9093 pairs of FDR and power. The improvement is more substantial when the regression
9094 problem is above the Donoho--Tanner phase transition. Finally, we present
9095 extensive simulation studies to confirm the sharpness of the complete Lasso
9096 tradeoff diagram.
9097 </p>
9098 </description>
9099 <guid isPermaLink="false">oai:arXiv.org:2007.11078</guid>
9100 </item>
9101 <item>
9102 <title>Sifting Convolution on the Sphere. (arXiv:2007.12153v2 [cs.IT] UPDATED)</title>
9103 <link>http://fr.arxiv.org/abs/2007.12153</link>
9104 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Roddy_P/0/1/0/all/0/1">Patrick J. Roddy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+McEwen_J/0/1/0/all/0/1">Jason D. McEwen</a></p>
9105
9106 <p>A novel spherical convolution is defined through the sifting property of the
9107 Dirac delta on the sphere. The so-called sifting convolution is defined by the
9108 inner product of one function with a translated version of another, but with
9109 the adoption of an alternative translation operator on the sphere. This
9110 translation operator follows by analogy with the Euclidean translation when
9111 viewed in harmonic space. The sifting convolution satisfies a variety of
9112 desirable properties that are lacking in alternate definitions, namely: it
9113 supports directional kernels; it has an output which remains on the sphere; and
9114 is efficient to compute. An illustration of the sifting convolution on a
9115 topographic map of the Earth demonstrates that it supports directional kernels
9116 to perform anisotropic filtering, while its output remains on the sphere.
9117 </p>
9118 </description>
9119 <guid isPermaLink="false">oai:arXiv.org:2007.12153</guid>
9120 </item>
9121 <item>
9122 <title>Revisiting Locality in Binary-Integer Representations. (arXiv:2007.12159v2 [cs.NE] UPDATED)</title>
9123 <link>http://fr.arxiv.org/abs/2007.12159</link>
9124 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shastri_H/0/1/0/all/0/1">Hrishee Shastri</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Frachtenberg_E/0/1/0/all/0/1">Eitan Frachtenberg</a></p>
9125
9126 <p>Mutation and recombination operators play a key role in determining the speed
9127 and quality of Genetic and Evolutionary Algorithms (GEAs). Prior work has
9128 analyzed the effects of these operators on genotypic variation, often using
9129 locality metrics that measure the sensitivity and stability of
9130 genotype-phenotype representations to these operators.
9131 </p>
9132 <p>In this paper, we focus on an important subset of representations, namely
9133 nonredundant bitstring-to-integer representations, and analyze them through the
9134 lens of Rothlauf's widely used locality metrics. We first define locality
9135 metrics equivalent to Rothlauf's that are tailored to our domain: the
9136 \textit{point locality} for single-bit mutation and \textit{general locality}
9137 for recombination. With these definitions, we derive tight bounds and a closed
9138 form expected value for point locality. For general locality we show that it is
9139 asymptotically equivalent across all representations and operators. We also
9140 recreate three established GEA experiments to understand the predictive power
9141 of point locality on GEA performance, focusing on two popular and often
9142 juxtaposed representations: standard binary and binary reflected Gray.
9143 </p>
9144 <p>We show that standard binary has provably no worse locality than any Gray
9145 encoding, including binary reflected Gray. We discuss this result in the
9146 context of previous studies that found binary reflected Gray to outperform
9147 standard binary, and we argue that locality cannot be the explanation for
9148 strong performance. Finally, we provide empirical evidence that weak point
9149 locality representations can be beneficial to performance in the exploration
9150 phase of the GEA, while strong point locality representations are more
9151 beneficial in the exploitation phase.
9152 </p>
9153 </description>
9154 <guid isPermaLink="false">oai:arXiv.org:2007.12159</guid>
9155 </item>
9156 <item>
9157 <title>YOLOpeds: Efficient Real-Time Single-Shot Pedestrian Detection for Smart Camera Applications. (arXiv:2007.13404v2 [cs.CV] UPDATED)</title>
9158 <link>http://fr.arxiv.org/abs/2007.13404</link>
9159 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kyrkou_C/0/1/0/all/0/1">Christos Kyrkou</a></p>
9160
9161 <p>Deep Learning-based object detectors can enhance the capabilities of smart
9162 camera systems in a wide spectrum of machine vision applications including
9163 video surveillance, autonomous driving, robots and drones, smart factory, and
9164 health monitoring. Pedestrian detection plays a key role in all these
9165 applications and deep learning can be used to construct accurate
9166 state-of-the-art detectors. However, such complex paradigms do not scale easily
9167 and are not traditionally implemented in resource-constrained smart cameras for
9168 on-device processing which offers significant advantages in situations when
9169 real-time monitoring and robustness are vital. Efficient neural networks can
9170 not only enable mobile applications and on-device experiences but can also be a
9171 key enabler of privacy and security allowing a user to gain the benefits of
9172 neural networks without needing to send their data to the server to be
9173 evaluated. This work addresses the challenge of achieving a good trade-off
9174 between accuracy and speed for efficient deployment of deep-learning-based
9175 pedestrian detection in smart camera applications. A computationally efficient
9176 architecture is introduced based on separable convolutions and proposes
9177 integrating dense connections across layers and multi-scale feature fusion to
9178 improve representational capacity while decreasing the number of parameters and
9179 operations. In particular, the contributions of this work are the following: 1)
9180 An efficient backbone combining multi-scale feature operations, 2) a more
9181 elaborate loss function for improved localization, 3) an anchor-less approach
9182 for detection, The proposed approach called YOLOpeds is evaluated using the
9183 PETS2009 surveillance dataset on 320x320 images. Overall, YOLOpeds provides
9184 real-time sustained operation of over 30 frames per second with detection rates
9185 in the range of 86% outperforming existing deep learning models.
9186 </p>
9187 </description>
9188 <guid isPermaLink="false">oai:arXiv.org:2007.13404</guid>
9189 </item>
9190 <item>
9191 <title>Regularization by Denoising via Fixed-Point Projection (RED-PRO). (arXiv:2008.00226v2 [eess.IV] UPDATED)</title>
9192 <link>http://fr.arxiv.org/abs/2008.00226</link>
9193 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Cohen_R/0/1/0/all/0/1">Regev Cohen</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Elad_M/0/1/0/all/0/1">Michael Elad</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Milanfar_P/0/1/0/all/0/1">Peyman Milanfar</a></p>
9194
9195 <p>Inverse problems in image processing are typically cast as optimization
9196 tasks, consisting of data-fidelity and stabilizing regularization terms. A
9197 recent regularization strategy of great interest utilizes the power of
9198 denoising engines. Two such methods are the Plug-and-Play Prior (PnP) and
9199 Regularization by Denoising (RED). While both have shown state-of-the-art
9200 results in various recovery tasks, their theoretical justification is
9201 incomplete. In this paper, we aim to bridge between RED and PnP, enriching the
9202 understanding of both frameworks. Towards that end, we reformulate RED as a
9203 convex optimization problem utilizing a projection (RED-PRO) onto the
9204 fixed-point set of demicontractive denoisers. We offer a simple iterative
9205 solution to this problem, by which we show that PnP proximal gradient method is
9206 a special case of RED-PRO, while providing guarantees for the convergence of
9207 both frameworks to globally optimal solutions. In addition, we present
9208 relaxations of RED-PRO that allow for handling denoisers with limited
9209 fixed-point sets. Finally, we demonstrate RED-PRO for the tasks of image
9210 deblurring and super-resolution, showing improved results with respect to the
9211 original RED framework.
9212 </p>
9213 </description>
9214 <guid isPermaLink="false">oai:arXiv.org:2008.00226</guid>
9215 </item>
9216 <item>
9217 <title>A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices. (arXiv:2008.02464v2 [stat.ML] UPDATED)</title>
9218 <link>http://fr.arxiv.org/abs/2008.02464</link>
9219 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Qiu_J/0/1/0/all/0/1">Jiezhong Qiu</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Wang_C/0/1/0/all/0/1">Chi Wang</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Liao_B/0/1/0/all/0/1">Ben Liao</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Peng_R/0/1/0/all/0/1">Richard Peng</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Tang_J/0/1/0/all/0/1">Jie Tang</a></p>
9220
9221 <p>We prove a Chernoff-type bound for sums of matrix-valued random variables
9222 sampled via a regular (aperiodic and irreducible) finite Markov chain.
9223 Specially, consider a random walk on a regular Markov chain and a Hermitian
9224 matrix-valued function on its state space. Our result gives exponentially
9225 decreasing bounds on the tail distributions of the extreme eigenvalues of the
9226 sample mean matrix. Our proof is based on the matrix expander (regular
9227 undirected graph) Chernoff bound [Garg et al. STOC '18] and scalar
9228 Chernoff-Hoeffding bounds for Markov chains [Chung et al. STACS '12].
9229 </p>
9230 <p>Our matrix Chernoff bound for Markov chains can be applied to analyze the
9231 behavior of co-occurrence statistics for sequential data, which have been
9232 common and important data signals in machine learning. We show that given a
9233 regular Markov chain with $n$ states and mixing time $\tau$, we need a
9234 trajectory of length $O(\tau (\log{(n)}+\log{(\tau)})/\epsilon^2)$ to achieve
9235 an estimator of the co-occurrence matrix with error bound $\epsilon$. We
9236 conduct several experiments and the experimental results are consistent with
9237 the exponentially fast convergence rate from theoretical analysis. Our result
9238 gives the first bound on the convergence rate of the co-occurrence matrix and
9239 the first sample complexity analysis in graph representation learning.
9240 </p>
9241 </description>
9242 <guid isPermaLink="false">oai:arXiv.org:2008.02464</guid>
9243 </item>
9244 <item>
9245 <title>Integration of the 3D Environment for UAV Onboard Visual Object Tracking. (arXiv:2008.02834v3 [cs.CV] UPDATED)</title>
9246 <link>http://fr.arxiv.org/abs/2008.02834</link>
9247 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Vujasinovic_S/0/1/0/all/0/1">St&#xe9;phane Vujasinovi&#x107;</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Becker_S/0/1/0/all/0/1">Stefan Becker</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Breuer_T/0/1/0/all/0/1">Timo Breuer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bullinger_S/0/1/0/all/0/1">Sebastian Bullinger</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Scherer_Negenborn_N/0/1/0/all/0/1">Norbert Scherer-Negenborn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Arens_M/0/1/0/all/0/1">Michael Arens</a></p>
9248
9249 <p>Single visual object tracking from an unmanned aerial vehicle (UAV) poses
9250 fundamental challenges such as object occlusion, small-scale objects,
9251 background clutter, and abrupt camera motion. To tackle these difficulties, we
9252 propose to integrate the 3D structure of the observed scene into a
9253 detection-by-tracking algorithm. We introduce a pipeline that combines a
9254 model-free visual object tracker, a sparse 3D reconstruction, and a state
9255 estimator. The 3D reconstruction of the scene is computed with an image-based
9256 Structure-from-Motion (SfM) component that enables us to leverage a state
9257 estimator in the corresponding 3D scene during tracking. By representing the
9258 position of the target in 3D space rather than in image space, we stabilize the
9259 tracking during ego-motion and improve the handling of occlusions, background
9260 clutter, and small-scale objects. We evaluated our approach on prototypical
9261 image sequences, captured from a UAV with low-altitude oblique views. For this
9262 purpose, we adapted an existing dataset for visual object tracking and
9263 reconstructed the observed scene in 3D. The experimental results demonstrate
9264 that the proposed approach outperforms methods using plain visual cues as well
9265 as approaches leveraging image-space-based state estimations. We believe that
9266 our approach can be beneficial for traffic monitoring, video surveillance, and
9267 navigation.
9268 </p>
9269 </description>
9270 <guid isPermaLink="false">oai:arXiv.org:2008.02834</guid>
9271 </item>
9272 <item>
9273 <title>Lifted Multiplicity Codes. (arXiv:2008.04717v2 [cs.IT] UPDATED)</title>
9274 <link>http://fr.arxiv.org/abs/2008.04717</link>
9275 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Holzbaur_L/0/1/0/all/0/1">Lukas Holzbaur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Polyanskaya_R/0/1/0/all/0/1">Rina Polyanskaya</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Polyanskii_N/0/1/0/all/0/1">Nikita Polyanskii</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vorobyev_I/0/1/0/all/0/1">Ilya Vorobyev</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yaakobi_E/0/1/0/all/0/1">Eitan Yaakobi</a></p>
9276
9277 <p>Lifted Reed-Solomon codes and multiplicity codes are two classes of
9278 evaluation codes that allow for the design of high-rate codes that can recover
9279 every codeword or information symbol from many disjoint sets. Recently, the
9280 underlying approaches have been combined to construct lifted bi-variate
9281 multiplicity codes, that can further improve on the rate. We continue the study
9282 of these codes by providing lower bounds on the rate and distance for lifted
9283 multiplicity codes obtained from polynomials in an arbitrary number of
9284 variables. Specifically, we investigate a subcode of a lifted multiplicity code
9285 formed by the linear span of $m$-variate monomials whose restriction to an
9286 arbitrary line in $\mathbb{F}_q^m$ is equivalent to a low-degree uni-variate
9287 polynomial. We find the tight asymptotic behavior of the fraction of such
9288 monomials when the number of variables $m$ is fixed and the alphabet size
9289 $q=2^\ell$ is large. For some parameter regimes, lifted multiplicity codes are
9290 then shown to have a better trade-off between redundancy and the number of
9291 disjoint recovering sets for every codeword or information symbol than
9292 previously known constructions. Additionally, we present a local
9293 self-correction algorithm for lifted multiplicity codes.
9294 </p>
9295 </description>
9296 <guid isPermaLink="false">oai:arXiv.org:2008.04717</guid>
9297 </item>
9298 <item>
9299 <title>A Composable Specification Language for Reinforcement Learning Tasks. (arXiv:2008.09293v2 [cs.LG] UPDATED)</title>
9300 <link>http://fr.arxiv.org/abs/2008.09293</link>
9301 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jothimurugan_K/0/1/0/all/0/1">Kishor Jothimurugan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Alur_R/0/1/0/all/0/1">Rajeev Alur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bastani_O/0/1/0/all/0/1">Osbert Bastani</a></p>
9302
9303 <p>Reinforcement learning is a promising approach for learning control policies
9304 for robot tasks. However, specifying complex tasks (e.g., with multiple
9305 objectives and safety constraints) can be challenging, since the user must
9306 design a reward function that encodes the entire task. Furthermore, the user
9307 often needs to manually shape the reward to ensure convergence of the learning
9308 algorithm. We propose a language for specifying complex control tasks, along
9309 with an algorithm that compiles specifications in our language into a reward
9310 function and automatically performs reward shaping. We implement our approach
9311 in a tool called SPECTRL, and show that it outperforms several state-of-the-art
9312 baselines.
9313 </p>
9314 </description>
9315 <guid isPermaLink="false">oai:arXiv.org:2008.09293</guid>
9316 </item>
9317 <item>
9318 <title>Gravilon: Applications of a New Gradient Descent Method to Machine Learning. (arXiv:2008.11370v2 [cs.LG] UPDATED)</title>
9319 <link>http://fr.arxiv.org/abs/2008.11370</link>
9320 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kelterborn_C/0/1/0/all/0/1">Chad Kelterborn</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mazur_M/0/1/0/all/0/1">Marcin Mazur</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Petrenko_B/0/1/0/all/0/1">Bogdan V. Petrenko</a></p>
9321
9322 <p>Gradient descent algorithms have been used in countless applications since
9323 the inception of Newton's method. The explosion in the number of applications
9324 of neural networks has re-energized efforts in recent years to improve the
9325 standard gradient descent method in both efficiency and accuracy. These methods
9326 modify the effect of the gradient in updating the values of the parameters.
9327 These modifications often incorporate hyperparameters: additional variables
9328 whose values must be specified at the outset of the program. We provide, below,
9329 a novel gradient descent algorithm, called Gravilon, that uses the geometry of
9330 the hypersurface to modify the length of the step in the direction of the
9331 gradient. Using neural networks, we provide promising experimental results
9332 comparing the accuracy and efficiency of the Gravilon method against commonly
9333 used gradient descent algorithms on MNIST digit classification.
9334 </p>
9335 </description>
9336 <guid isPermaLink="false">oai:arXiv.org:2008.11370</guid>
9337 </item>
9338 <item>
9339 <title>On the model-based stochastic value gradient for continuous reinforcement learning. (arXiv:2008.12775v2 [cs.LG] UPDATED)</title>
9340 <link>http://fr.arxiv.org/abs/2008.12775</link>
9341 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Amos_B/0/1/0/all/0/1">Brandon Amos</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Stanton_S/0/1/0/all/0/1">Samuel Stanton</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yarats_D/0/1/0/all/0/1">Denis Yarats</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wilson_A/0/1/0/all/0/1">Andrew Gordon Wilson</a></p>
9342
9343 <p>Model-based reinforcement learning approaches add explicit domain knowledge
9344 to agents in hopes of improving the sample-efficiency in comparison to
9345 model-free agents. However, in practice model-based methods are unable to
9346 achieve the same asymptotic performance on challenging continuous control tasks
9347 due to the complexity of learning and controlling an explicit world model. In
9348 this paper we investigate the stochastic value gradient (SVG), which is a
9349 well-known family of methods for controlling continuous systems which includes
9350 model-based approaches that distill a model-based value expansion into a
9351 model-free policy. We consider a variant of the model-based SVG that scales to
9352 larger systems and uses 1) an entropy regularization to help with exploration,
9353 2) a learned deterministic world model to improve the short-horizon value
9354 estimate, and 3) a learned model-free value estimate after the model's rollout.
9355 This SVG variation captures the model-free soft actor-critic method as an
9356 instance when the model rollout horizon is zero, and otherwise uses
9357 short-horizon model rollouts to improve the value estimate for the policy
9358 update. We surpass the asymptotic performance of other model-based methods on
9359 the proprioceptive MuJoCo locomotion tasks from the OpenAI gym, including a
9360 humanoid. We notably achieve these results with a simple deterministic world
9361 model without requiring an ensemble.
9362 </p>
9363 </description>
9364 <guid isPermaLink="false">oai:arXiv.org:2008.12775</guid>
9365 </item>
9366 <item>
9367 <title>Introduction to logistic regression. (arXiv:2008.13567v2 [stat.ME] UPDATED)</title>
9368 <link>http://fr.arxiv.org/abs/2008.13567</link>
9369 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Chung_M/0/1/0/all/0/1">Moo K. Chung</a></p>
9370
9371 <p>For random field theory based multiple comparison corrections In brain
9372 imaging, it is often necessary to compute the distribution of the supremum of a
9373 random field. Unfortunately, computing the distribution of the supremum of the
9374 random field is not easy and requires satisfying many distributional
9375 assumptions that may not be true in real data. Thus, there is a need to come up
9376 with a different framework that does not use the traditional statistical
9377 hypothesis testing paradigm that requires to compute p-values. With this as a
9378 motivation, we can use a different approach called the logistic regression that
9379 does not require computing the p-value and still be able to localize the
9380 regions of brain network differences. Unlike other discriminant and
9381 classification techniques that tried to classify preselected feature vectors,
9382 the method here does not require any preselected feature vectors and performs
9383 the classification at each edge level.
9384 </p>
9385 </description>
9386 <guid isPermaLink="false">oai:arXiv.org:2008.13567</guid>
9387 </item>
9388 <item>
9389 <title>Individuation and Adaptation in Complex Systems. (arXiv:2009.00110v2 [cs.NE] UPDATED)</title>
9390 <link>http://fr.arxiv.org/abs/2009.00110</link>
9391 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Fabbro_O/0/1/0/all/0/1">Olivier Del Fabbro</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Christen_P/0/1/0/all/0/1">Patrik Christen</a></p>
9392
9393 <p>Complex systems have certain characteristics such as network structures of a
9394 large number of individual elements, adaptation, and emergence. While these
9395 characteristics have been studied and described, it is often not so clear where
9396 they exactly come from. There is a focus on concrete system states rather than
9397 the emergence of the computer models themselves used to study these systems. To
9398 better understand typical characteristics of complex systems and their
9399 emergence, we recently presented a system metamodel based on which computer
9400 models can be created from abstract building blocks. In this study we extend
9401 our system metamodel with the concept of adaption in order to integrate
9402 adaptive computation in our so-called allagmatic method - a framework
9403 consisting of the system metamodel but also a way to study the creation of the
9404 computer model itself. Running experiments with cellular automata and
9405 artificial neural networks, we find that the system metamodel integrates
9406 adaptation with an additional operation called adaptation function that
9407 operates on the update function, which encodes the system's dynamics. It allows
9408 the creation of adaptive computations by providing an abstract template for
9409 adaptation and guidance for implementation. Further, the object-oriented and
9410 template meta-programming leads to a creation of computer models comparable to
9411 the individuation of observed systems. It therefore allows to study not only
9412 the behaviour of a running model but also its creation. The development of the
9413 system metamodel was first inspired by concepts of the philosophy of
9414 individuation of Gilbert Simondon. The theoretical background for the concept
9415 of adaptation is taken from the philosophy of organism of Alfred North
9416 Whitehead. In general, through the possibility to follow individuation, the
9417 allagmatic method allows to better understand the emergence of typical
9418 characteristics of complex systems.
9419 </p>
9420 </description>
9421 <guid isPermaLink="false">oai:arXiv.org:2009.00110</guid>
9422 </item>
9423 <item>
9424 <title>Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning. (arXiv:2009.00142v4 [cs.LG] UPDATED)</title>
9425 <link>http://fr.arxiv.org/abs/2009.00142</link>
9426 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_P/0/1/0/all/0/1">Pan Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yanbang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_H/0/1/0/all/0/1">Hongwei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leskovec_J/0/1/0/all/0/1">Jure Leskovec</a></p>
9427
9428 <p>Learning representations of sets of nodes in a graph is crucial for
9429 applications ranging from node-role discovery to link prediction and molecule
9430 classification. Graph Neural Networks (GNNs) have achieved great success in
9431 graph representation learning. However, expressive power of GNNs is limited by
9432 the 1-Weisfeiler-Lehman (WL) test and thus GNNs generate identical
9433 representations for graph substructures that may in fact be very different.
9434 More powerful GNNs, proposed recently by mimicking higher-order-WL tests, only
9435 focus on representing entire graphs and they are computationally inefficient as
9436 they cannot utilize sparsity of the underlying graph. Here we propose and
9437 mathematically analyze a general class of structure-related features, termed
9438 Distance Encoding (DE). DE assists GNNs in representing any set of nodes, while
9439 providing strictly more expressive power than the 1-WL test. DE captures the
9440 distance between the node set whose representation is to be learned and each
9441 node in the graph. To capture the distance DE can apply various graph-distance
9442 measures such as shortest path distance or generalized PageRank scores. We
9443 propose two ways for GNNs to use DEs (1) as extra node features, and (2) as
9444 controllers of message aggregation in GNNs. Both approaches can utilize the
9445 sparse structure of the underlying graph, which leads to computational
9446 efficiency and scalability. We also prove that DE can distinguish node sets
9447 embedded in almost all regular graphs where traditional GNNs always fail. We
9448 evaluate DE on three tasks over six real networks: structural role prediction,
9449 link prediction, and triangle prediction. Results show that our models
9450 outperform GNNs without DE by up-to 15\% in accuracy and AUROC. Furthermore,
9451 our models also significantly outperform other state-of-the-art methods
9452 especially designed for the above tasks.
9453 </p>
9454 </description>
9455 <guid isPermaLink="false">oai:arXiv.org:2009.00142</guid>
9456 </item>
9457 <item>
9458 <title>Accelerated reactive transport simulations in heterogeneous porous media using Reaktoro and Firedrake. (arXiv:2009.01194v2 [cs.CE] UPDATED)</title>
9459 <link>http://fr.arxiv.org/abs/2009.01194</link>
9460 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kyas_S/0/1/0/all/0/1">Svetlana Kyas</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Volpatto_D/0/1/0/all/0/1">Diego Volpatto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Saar_M/0/1/0/all/0/1">Martin O. Saar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Leal_A/0/1/0/all/0/1">Allan M. M. Leal</a></p>
9461
9462 <p>This work investigates the performance of the on-demand machine learning
9463 (ODML) algorithm introduced in Leal et al. (2020) when applied to different
9464 reactive transport problems in heterogeneous porous media. ODML was devised to
9465 accelerate the computationally expensive geochemical reaction calculations in
9466 reactive transport simulations. We demonstrate that the ODML algorithm speeds
9467 up these calculations by one to three orders of magnitude. Such acceleration,
9468 in turn, significantly accelerates the entire reactive transport simulation.
9469 The numerical experiments are performed by implementing the coupling of two
9470 open-source software packages: Reaktoro (Leal, 2015) and Firedrake (Rathgeber
9471 et al., 2016).
9472 </p>
9473 </description>
9474 <guid isPermaLink="false">oai:arXiv.org:2009.01194</guid>
9475 </item>
9476 <item>
9477 <title>Analysis of Uplink IRS-Assisted NOMA under Nakagami-m Fading via Moments Matching. (arXiv:2009.03133v2 [cs.IT] UPDATED)</title>
9478 <link>http://fr.arxiv.org/abs/2009.03133</link>
9479 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tahir_B/0/1/0/all/0/1">Bashar Tahir</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Schwarz_S/0/1/0/all/0/1">Stefan Schwarz</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rupp_M/0/1/0/all/0/1">Markus Rupp</a></p>
9480
9481 <p>This letter investigates the uplink outage performance of intelligent
9482 reflecting surface (IRS)-assisted non-orthogonal multiple access (NOMA). We
9483 consider the general case where all users have both direct and reflection
9484 links, and all links undergo Nakagami-m fading. We approximate the received
9485 powers of the NOMA users as Gamma random variables via moments matching. This
9486 allows for tractable expressions of the outage under interference cancellation
9487 (IC), while being flexible in modeling various propagation environments. Our
9488 analysis shows that under certain conditions, the presence of an IRS might
9489 degrade the performance of users that have dominant line-of-sight (LOS) to the
9490 base station (BS), while users dominated by non-line-of-sight (NLOS) will
9491 always benefit from it.
9492 </p>
9493 </description>
9494 <guid isPermaLink="false">oai:arXiv.org:2009.03133</guid>
9495 </item>
9496 <item>
9497 <title>Physically Embedded Planning Problems: New Challenges for Reinforcement Learning. (arXiv:2009.05524v2 [cs.AI] UPDATED)</title>
9498 <link>http://fr.arxiv.org/abs/2009.05524</link>
9499 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Mirza_M/0/1/0/all/0/1">Mehdi Mirza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jaegle_A/0/1/0/all/0/1">Andrew Jaegle</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hunt_J/0/1/0/all/0/1">Jonathan J. Hunt</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Guez_A/0/1/0/all/0/1">Arthur Guez</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tunyasuvunakool_S/0/1/0/all/0/1">Saran Tunyasuvunakool</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Muldal_A/0/1/0/all/0/1">Alistair Muldal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Weber_T/0/1/0/all/0/1">Th&#xe9;ophane Weber</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Karkus_P/0/1/0/all/0/1">Peter Karkus</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Racaniere_S/0/1/0/all/0/1">S&#xe9;bastien Racani&#xe8;re</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Buesing_L/0/1/0/all/0/1">Lars Buesing</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lillicrap_T/0/1/0/all/0/1">Timothy Lillicrap</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Heess_N/0/1/0/all/0/1">Nicolas Heess</a></p>
9500
9501 <p>Recent work in deep reinforcement learning (RL) has produced algorithms
9502 capable of mastering challenging games such as Go, chess, or shogi. In these
9503 works the RL agent directly observes the natural state of the game and controls
9504 that state directly with its actions. However, when humans play such games,
9505 they do not just reason about the moves but also interact with their physical
9506 environment. They understand the state of the game by looking at the physical
9507 board in front of them and modify it by manipulating pieces using touch and
9508 fine-grained motor control. Mastering complicated physical systems with
9509 abstract goals is a central challenge for artificial intelligence, but it
9510 remains out of reach for existing RL algorithms. To encourage progress towards
9511 this goal we introduce a set of physically embedded planning problems and make
9512 them publicly available. We embed challenging symbolic tasks (Sokoban,
9513 tic-tac-toe, and Go) in a physics engine to produce a set of tasks that require
9514 perception, reasoning, and motor control over long time horizons. Although
9515 existing RL algorithms can tackle the symbolic versions of these tasks, we find
9516 that they struggle to master even the simplest of their physically embedded
9517 counterparts. As a first step towards characterizing the space of solution to
9518 these tasks, we introduce a strong baseline that uses a pre-trained expert game
9519 player to provide hints in the abstract space to an RL agent's policy while
9520 training it on the full sensorimotor control task. The resulting agent solves
9521 many of the tasks, underlining the need for methods that bridge the gap between
9522 abstract planning and embodied control. See illustrating video at
9523 https://youtu.be/RwHiHlym_1k.
9524 </p>
9525 </description>
9526 <guid isPermaLink="false">oai:arXiv.org:2009.05524</guid>
9527 </item>
9528 <item>
9529 <title>Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses. (arXiv:2009.07165v3 [cs.LG] UPDATED)</title>
9530 <link>http://fr.arxiv.org/abs/2009.07165</link>
9531 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Rawal_K/0/1/0/all/0/1">Kaivalya Rawal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lakkaraju_H/0/1/0/all/0/1">Himabindu Lakkaraju</a></p>
9532
9533 <p>As predictive models are increasingly being deployed in high-stakes
9534 decision-making, there has been a lot of interest in developing algorithms
9535 which can provide recourses to affected individuals. While developing such
9536 tools is important, it is even more critical to analyse and interpret a
9537 predictive model, and vet it thoroughly to ensure that the recourses it offers
9538 are meaningful and non-discriminatory before it is deployed in the real world.
9539 To this end, we propose a novel model agnostic framework called Actionable
9540 Recourse Summaries (AReS) to construct global counterfactual explanations which
9541 provide an interpretable and accurate summary of recourses for the entire
9542 population. We formulate a novel objective which simultaneously optimizes for
9543 correctness of the recourses and interpretability of the explanations, while
9544 minimizing overall recourse costs across the entire population. More
9545 specifically, our objective enables us to learn, with optimality guarantees on
9546 recourse correctness, a small number of compact rule sets each of which capture
9547 recourses for well defined subpopulations within the data. We also demonstrate
9548 theoretically that several of the prior approaches proposed to generate
9549 recourses for individuals are special cases of our framework. Experimental
9550 evaluation with real world datasets and user studies demonstrate that our
9551 framework can provide decision makers with a comprehensive overview of
9552 recourses corresponding to any black box model, and consequently help detect
9553 undesirable model biases and discrimination.
9554 </p>
9555 </description>
9556 <guid isPermaLink="false">oai:arXiv.org:2009.07165</guid>
9557 </item>
9558 <item>
9559 <title>CorDEL: A Contrastive Deep Learning Approach for Entity Linkage. (arXiv:2009.07203v2 [cs.DB] UPDATED)</title>
9560 <link>http://fr.arxiv.org/abs/2009.07203</link>
9561 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1">Zhengyang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sisman_B/0/1/0/all/0/1">Bunyamin Sisman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wei_H/0/1/0/all/0/1">Hao Wei</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_X/0/1/0/all/0/1">Xin Luna Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ji_S/0/1/0/all/0/1">Shuiwang Ji</a></p>
9562
9563 <p>Entity linkage (EL) is a critical problem in data cleaning and integration.
9564 In the past several decades, EL has typically been done by rule-based systems
9565 or traditional machine learning models with hand-curated features, both of
9566 which heavily depend on manual human inputs. With the ever-increasing growth of
9567 new data, deep learning (DL) based approaches have been proposed to alleviate
9568 the high cost of EL associated with the traditional models. Existing
9569 exploration of DL models for EL strictly follows the well-known twin-network
9570 architecture. However, we argue that the twin-network architecture is
9571 sub-optimal to EL, leading to inherent drawbacks of existing models. In order
9572 to address the drawbacks, we propose a novel and generic contrastive DL
9573 framework for EL. The proposed framework is able to capture both syntactic and
9574 semantic matching signals and pays attention to subtle but critical
9575 differences. Based on the framework, we develop a contrastive DL approach for
9576 EL, called CorDEL, with three powerful variants. We evaluate CorDEL with
9577 extensive experiments conducted on both public benchmark datasets and a
9578 real-world dataset. CorDEL outperforms previous state-of-the-art models by 5.2%
9579 on public benchmark datasets. Moreover, CorDEL yields a 2.4% improvement over
9580 the current best DL model on the real-world dataset, while reducing the number
9581 of training parameters by 97.6%.
9582 </p>
9583 </description>
9584 <guid isPermaLink="false">oai:arXiv.org:2009.07203</guid>
9585 </item>
9586 <item>
9587 <title>Autoregressive Knowledge Distillation through Imitation Learning. (arXiv:2009.07253v2 [cs.CL] UPDATED)</title>
9588 <link>http://fr.arxiv.org/abs/2009.07253</link>
9589 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lin_A/0/1/0/all/0/1">Alexander Lin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wohlwend_J/0/1/0/all/0/1">Jeremy Wohlwend</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_H/0/1/0/all/0/1">Howard Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lei_T/0/1/0/all/0/1">Tao Lei</a></p>
9590
9591 <p>The performance of autoregressive models on natural language generation tasks
9592 has dramatically improved due to the adoption of deep, self-attentive
9593 architectures. However, these gains have come at the cost of hindering
9594 inference speed, making state-of-the-art models cumbersome to deploy in
9595 real-world, time-sensitive settings. We develop a compression technique for
9596 autoregressive models that is driven by an imitation learning perspective on
9597 knowledge distillation. The algorithm is designed to address the exposure bias
9598 problem. On prototypical language generation tasks such as translation and
9599 summarization, our method consistently outperforms other distillation
9600 algorithms, such as sequence-level knowledge distillation. Student models
9601 trained with our method attain 1.4 to 4.8 BLEU/ROUGE points higher than those
9602 trained from scratch, while increasing inference speed by up to 14 times in
9603 comparison to the teacher model.
9604 </p>
9605 </description>
9606 <guid isPermaLink="false">oai:arXiv.org:2009.07253</guid>
9607 </item>
9608 <item>
9609 <title>Video based real-time positional tracker. (arXiv:2009.08276v3 [cs.CV] UPDATED)</title>
9610 <link>http://fr.arxiv.org/abs/2009.08276</link>
9611 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Albarracin_D/0/1/0/all/0/1">David Albarrac&#xed;n</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hormigo_J/0/1/0/all/0/1">Jes&#xfa;s Hormigo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fernandez_J/0/1/0/all/0/1">Jos&#xe9; David Fern&#xe1;ndez</a></p>
9612
9613 <p>We propose a system that uses video as the input to track the position of
9614 objects relative to their surrounding environment in real-time. The neural
9615 network employed is trained on a 100% synthetic dataset coming from our own
9616 automated generator. The positional tracker relies on a range of 1 to n video
9617 cameras placed around an arena of choice.
9618 </p>
9619 <p>The system returns the positions of the tracked objects relative to the
9620 broader world by understanding the overlapping matrices formed by the cameras
9621 and therefore these can be extrapolated into real world coordinates.
9622 </p>
9623 <p>In most cases, we achieve a higher update rate and positioning precision than
9624 any of the existing GPS-based systems, in particular for indoor objects or
9625 those occluded from clear sky.
9626 </p>
9627 </description>
9628 <guid isPermaLink="false">oai:arXiv.org:2009.08276</guid>
9629 </item>
9630 <item>
9631 <title>An Embedded Index Code Construction Using Sub-packetization. (arXiv:2009.11329v2 [cs.IT] UPDATED)</title>
9632 <link>http://fr.arxiv.org/abs/2009.11329</link>
9633 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Sasi_S/0/1/0/all/0/1">Shanuja Sasi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Aggarwal_V/0/1/0/all/0/1">Vaneet Aggarwal</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rajan_B/0/1/0/all/0/1">B. Sundar Rajan</a></p>
9634
9635 <p>A variant of the index coding problem (ICP), the embedded index coding
9636 problem (EICP) was introduced in [A. Porter and M. Wootters, "Embedded index
9637 coding," ITW, Sweden, 2019] which was motivated by its application in
9638 distributed computing where every user can act as sender for other users and an
9639 algorithm for code construction was reported. The constructions depends on the
9640 computation of minrank of a matrix, which is computationally intensive. In [A.
9641 Mahesh, N. Sageer Karat and B. S. Rajan, "Min-rank of Embedded Index Coding
9642 Problems," ISIT, 2020], for EICP, a notion of side-information matrix was
9643 introduced and it was proved that the length of an optimal scalar linear index
9644 code is equal to the min-rank of the side-information matrix. The authors have
9645 provided an explicit code construction for a class of EICP -
9646 \textit{Consecutive and Symmetric Embedded Index Coding Problem (CS-EICP)}. We
9647 introduce the idea of sub-packetization of the messages in index coding
9648 problems to provide a novel code construction for CS-EICP in contrast to the
9649 scalar linear solutions provided in the prior works. For CS-EICP, the
9650 normalized rate, which is defined as the number of bits transmitted by all the
9651 users together normalized by the total number of bits of all the messages, for
9652 our construction is lesser than the normalized rate achieved by Mahesh et
9653 al.,for scalar linear codes.
9654 </p>
9655 </description>
9656 <guid isPermaLink="false">oai:arXiv.org:2009.11329</guid>
9657 </item>
9658 <item>
9659 <title>Multi-scale Deep Neural Network (MscaleDNN) Methods for Oscillatory Stokes Flows in Complex Domains. (arXiv:2009.12729v2 [math.NA] UPDATED)</title>
9660 <link>http://fr.arxiv.org/abs/2009.12729</link>
9661 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Wang_B/0/1/0/all/0/1">Bo Wang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Zhang_W/0/1/0/all/0/1">Wenzhong Zhang</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Cai_W/0/1/0/all/0/1">Wei Cai</a></p>
9662
9663 <p>In this paper, we study a multi-scale deep neural network (MscaleDNN) as a
9664 meshless numerical method for computing oscillatory Stokes flows in complex
9665 domains. The MscaleDNN employs a multi-scale structure in the design of its DNN
9666 using radial scalings to convert the approximation of high frequency components
9667 of the highly oscillatory Stokes solution to one of lower frequencies. The
9668 MscaleDNN solution to the Stokes problem is obtained by minimizing a loss
9669 function in terms of L2 normof the residual of the Stokes equation. Three forms
9670 of loss functions are investigated based on vorticity-velocity-pressure,
9671 velocity-stress-pressure, and velocity-gradient of velocity-pressure
9672 formulations of the Stokes equation. We first conduct a systematic study of the
9673 MscaleDNN methods with various loss functions on the Kovasznay flow in
9674 comparison with normal fully connected DNNs. Then, Stokes flows with highly
9675 oscillatory solutions in a 2-D domain with six randomly placed holes are
9676 simulated by the MscaleDNN. The results show that MscaleDNN has faster
9677 convergence and consistent error decays in the simulation of Kovasznay flow for
9678 all four tested loss functions. More importantly, the MscaleDNN is capable of
9679 learning highly oscillatory solutions when the normal DNNs fail to converge.
9680 </p>
9681 </description>
9682 <guid isPermaLink="false">oai:arXiv.org:2009.12729</guid>
9683 </item>
9684 <item>
9685 <title>Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization. (arXiv:2009.12829v3 [cs.CV] UPDATED)</title>
9686 <link>http://fr.arxiv.org/abs/2009.12829</link>
9687 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_H/0/1/0/all/0/1">Haoliang Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">YuFei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wan_R/0/1/0/all/0/1">Renjie Wan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1">Shiqi Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_T/0/1/0/all/0/1">Tie-Qiang Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kot_A/0/1/0/all/0/1">Alex C. Kot</a></p>
9688
9689 <p>Recently, we have witnessed great progress in the field of medical imaging
9690 classification by adopting deep neural networks. However, the recent advanced
9691 models still require accessing sufficiently large and representative datasets
9692 for training, which is often unfeasible in clinically realistic environments.
9693 When trained on limited datasets, the deep neural network is lack of
9694 generalization capability, as the trained deep neural network on data within a
9695 certain distribution (e.g. the data captured by a certain device vendor or
9696 patient population) may not be able to generalize to the data with another
9697 distribution.
9698 </p>
9699 <p>In this paper, we introduce a simple but effective approach to improve the
9700 generalization capability of deep neural networks in the field of medical
9701 imaging classification. Motivated by the observation that the domain
9702 variability of the medical images is to some extent compact, we propose to
9703 learn a representative feature space through variational encoding with a novel
9704 linear-dependency regularization term to capture the shareable information
9705 among medical data collected from different domains. As a result, the trained
9706 neural network is expected to equip with better generalization capability to
9707 the "unseen" medical data. Experimental results on two challenging medical
9708 imaging classification tasks indicate that our method can achieve better
9709 cross-domain generalization capability compared with state-of-the-art
9710 baselines.
9711 </p>
9712 </description>
9713 <guid isPermaLink="false">oai:arXiv.org:2009.12829</guid>
9714 </item>
9715 <item>
9716 <title>Dual Attention Model for Citation Recommendation. (arXiv:2010.00182v4 [cs.IR] UPDATED)</title>
9717 <link>http://fr.arxiv.org/abs/2010.00182</link>
9718 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yang Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ma_Q/0/1/0/all/0/1">Qiang Ma</a></p>
9719
9720 <p>Based on an exponentially increasing number of academic articles, discovering
9721 and citing comprehensive and appropriate resources has become a non-trivial
9722 task. Conventional citation recommender methods suffer from severe information
9723 loss. For example, they do not consider the section of the paper that the user
9724 is writing and for which they need to find a citation, the relatedness between
9725 the words in the local context (the text span that describes a citation), or
9726 the importance on each word from the local context. These shortcomings make
9727 such methods insufficient for recommending adequate citations to academic
9728 manuscripts. In this study, we propose a novel embedding-based neural network
9729 called "dual attention model for citation recommendation (DACR)" to recommend
9730 citations during manuscript preparation. Our method adapts embedding of three
9731 dimensions of semantic information: words in the local context, structural
9732 contexts, and the section on which a user is working. A neural network is
9733 designed to maximize the similarity between the embedding of the three input
9734 (local context words, section and structural contexts) and the target citation
9735 appearing in the context. The core of the neural network is composed of
9736 self-attention and additive attention, where the former aims to capture the
9737 relatedness between the contextual words and structural context, and the latter
9738 aims to learn the importance of them. The experiments on real-world datasets
9739 demonstrate the effectiveness of the proposed approach.
9740 </p>
9741 </description>
9742 <guid isPermaLink="false">oai:arXiv.org:2010.00182</guid>
9743 </item>
9744 <item>
9745 <title>Pretrained Language Model Embryology: The Birth of ALBERT. (arXiv:2010.02480v2 [cs.CL] UPDATED)</title>
9746 <link>http://fr.arxiv.org/abs/2010.02480</link>
9747 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chiang_C/0/1/0/all/0/1">Cheng-Han Chiang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Huang_S/0/1/0/all/0/1">Sung-Feng Huang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Lee_H/0/1/0/all/0/1">Hung-yi Lee</a></p>
9748
9749 <p>While behaviors of pretrained language models (LMs) have been thoroughly
9750 examined, what happened during pretraining is rarely studied. We thus
9751 investigate the developmental process from a set of randomly initialized
9752 parameters to a totipotent language model, which we refer to as the embryology
9753 of a pretrained language model. Our results show that ALBERT learns to
9754 reconstruct and predict tokens of different parts of speech (POS) in different
9755 learning speeds during pretraining. We also find that linguistic knowledge and
9756 world knowledge do not generally improve as pretraining proceeds, nor do
9757 downstream tasks' performance. These findings suggest that knowledge of a
9758 pretrained model varies during pretraining, and having more pretrain steps does
9759 not necessarily provide a model with more comprehensive knowledge. We will
9760 provide source codes and pretrained models to reproduce our results at
9761 https://github.com/d223302/albert-embryology.
9762 </p>
9763 </description>
9764 <guid isPermaLink="false">oai:arXiv.org:2010.02480</guid>
9765 </item>
9766 <item>
9767 <title>Investigating African-American Vernacular English in Transformer-Based Text Generation. (arXiv:2010.02510v2 [cs.CL] UPDATED)</title>
9768 <link>http://fr.arxiv.org/abs/2010.02510</link>
9769 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Groenwold_S/0/1/0/all/0/1">Sophie Groenwold</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ou_L/0/1/0/all/0/1">Lily Ou</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Parekh_A/0/1/0/all/0/1">Aesha Parekh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Honnavalli_S/0/1/0/all/0/1">Samhita Honnavalli</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Levy_S/0/1/0/all/0/1">Sharon Levy</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mirza_D/0/1/0/all/0/1">Diba Mirza</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_W/0/1/0/all/0/1">William Yang Wang</a></p>
9770
9771 <p>The growth of social media has encouraged the written use of African American
9772 Vernacular English (AAVE), which has traditionally been used only in oral
9773 contexts. However, NLP models have historically been developed using dominant
9774 English varieties, such as Standard American English (SAE), due to text corpora
9775 availability. We investigate the performance of GPT-2 on AAVE text by creating
9776 a dataset of intent-equivalent parallel AAVE/SAE tweet pairs, thereby isolating
9777 syntactic structure and AAVE- or SAE-specific language for each pair. We
9778 evaluate each sample and its GPT-2 generated text with pretrained sentiment
9779 classifiers and find that while AAVE text results in more classifications of
9780 negative sentiment than SAE, the use of GPT-2 generally increases occurrences
9781 of positive sentiment for both. Additionally, we conduct human evaluation of
9782 AAVE and SAE text generated with GPT-2 to compare contextual rigor and overall
9783 quality.
9784 </p>
9785 </description>
9786 <guid isPermaLink="false">oai:arXiv.org:2010.02510</guid>
9787 </item>
9788 <item>
9789 <title>Improved Analysis of Clipping Algorithms for Non-convex Optimization. (arXiv:2010.02519v2 [cs.LG] UPDATED)</title>
9790 <link>http://fr.arxiv.org/abs/2010.02519</link>
9791 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_B/0/1/0/all/0/1">Bohang Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Jin_J/0/1/0/all/0/1">Jikai Jin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fang_C/0/1/0/all/0/1">Cong Fang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_L/0/1/0/all/0/1">Liwei Wang</a></p>
9792
9793 <p>Gradient clipping is commonly used in training deep neural networks partly
9794 due to its practicability in relieving the exploding gradient problem.
9795 Recently, \citet{zhang2019gradient} show that clipped (stochastic) Gradient
9796 Descent (GD) converges faster than vanilla GD/SGD via introducing a new
9797 assumption called $(L_0, L_1)$-smoothness, which characterizes the violent
9798 fluctuation of gradients typically encountered in deep neural networks.
9799 However, their iteration complexities on the problem-dependent parameters are
9800 rather pessimistic, and theoretical justification of clipping combined with
9801 other crucial techniques, e.g. momentum acceleration, are still lacking. In
9802 this paper, we bridge the gap by presenting a general framework to study the
9803 clipping algorithms, which also takes momentum methods into consideration. We
9804 provide convergence analysis of the framework in both deterministic and
9805 stochastic setting, and demonstrate the tightness of our results by comparing
9806 them with existing lower bounds. Our results imply that the efficiency of
9807 clipping methods will not degenerate even in highly non-smooth regions of the
9808 landscape. Experiments confirm the superiority of clipping-based methods in
9809 deep learning tasks.
9810 </p>
9811 </description>
9812 <guid isPermaLink="false">oai:arXiv.org:2010.02519</guid>
9813 </item>
9814 <item>
9815 <title>Improving Local Identifiability in Probabilistic Box Embeddings. (arXiv:2010.04831v2 [cs.LG] UPDATED)</title>
9816 <link>http://fr.arxiv.org/abs/2010.04831</link>
9817 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dasgupta_S/0/1/0/all/0/1">Shib Sankar Dasgupta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Boratko_M/0/1/0/all/0/1">Michael Boratko</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_D/0/1/0/all/0/1">Dongxu Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vilnis_L/0/1/0/all/0/1">Luke Vilnis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_X/0/1/0/all/0/1">Xiang Lorraine Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+McCallum_A/0/1/0/all/0/1">Andrew McCallum</a></p>
9818
9819 <p>Geometric embeddings have recently received attention for their natural
9820 ability to represent transitive asymmetric relations via containment. Box
9821 embeddings, where objects are represented by n-dimensional hyperrectangles, are
9822 a particularly promising example of such an embedding as they are closed under
9823 intersection and their volume can be calculated easily, allowing them to
9824 naturally represent calibrated probability distributions. The benefits of
9825 geometric embeddings also introduce a problem of local identifiability,
9826 however, where whole neighborhoods of parameters result in equivalent loss
9827 which impedes learning. Prior work addressed some of these issues by using an
9828 approximation to Gaussian convolution over the box parameters, however, this
9829 intersection operation also increases the sparsity of the gradient. In this
9830 work, we model the box parameters with min and max Gumbel distributions, which
9831 were chosen such that space is still closed under the operation of the
9832 intersection. The calculation of the expected intersection volume involves all
9833 parameters, and we demonstrate experimentally that this drastically improves
9834 the ability of such models to learn.
9835 </p>
9836 </description>
9837 <guid isPermaLink="false">oai:arXiv.org:2010.04831</guid>
9838 </item>
9839 <item>
9840 <title>Neural-Symbolic Reasoning on Knowledge Graphs. (arXiv:2010.05446v3 [cs.AI] UPDATED)</title>
9841 <link>http://fr.arxiv.org/abs/2010.05446</link>
9842 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jing Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1">Bo Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_L/0/1/0/all/0/1">Lingxi Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ke_X/0/1/0/all/0/1">Xirui Ke</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ding_H/0/1/0/all/0/1">Haipeng Ding</a></p>
9843
9844 <p>Knowledge graph reasoning is the fundamental component to support machine
9845 learning applications such as information extraction, information retrieval and
9846 recommendation. Since knowledge graph can be viewed as the discrete symbolic
9847 representations of knowledge, reasoning on knowledge graphs can naturally
9848 leverage the symbolic techniques. However, symbolic reasoning is intolerant of
9849 the ambiguous and noisy data. On the contrary, the recent advances of deep
9850 learning promote neural reasoning on knowledge graphs, which is robust to the
9851 ambiguous and noisy data, but lacks interpretability compared to symbolic
9852 reasoning. Considering the advantages and disadvantages of both methodologies,
9853 recent efforts have been made on combining the two reasoning methods. In this
9854 survey, we take a thorough look at the development of the symbolic reasoning,
9855 neural reasoning and the neural-symbolic reasoning on knowledge graphs. We
9856 survey two specific reasoning tasks, knowledge graph completion and question
9857 answering on knowledge graphs, and explain them in a unified reasoning
9858 framework. We also briefly discuss the future directions for knowledge graph
9859 reasoning.
9860 </p>
9861 </description>
9862 <guid isPermaLink="false">oai:arXiv.org:2010.05446</guid>
9863 </item>
9864 <item>
9865 <title>On lattice point counting in $\Delta$-modular polyhedra. (arXiv:2010.05768v2 [cs.CC] UPDATED)</title>
9866 <link>http://fr.arxiv.org/abs/2010.05768</link>
9867 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gribanov_D/0/1/0/all/0/1">D.V. Gribanov</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zolotykh_N/0/1/0/all/0/1">N.Yu. Zolotykh</a></p>
9868
9869 <p>Let a polyhedron $P$ be defined by one of the following ways:
9870 </p>
9871 <p>(i) $P = \{x \in R^n \colon A x \leq b\}$, where $A \in Z^{(n+k) \times n}$,
9872 $b \in Z^{(n+k)}$ and $rank\, A = n$;
9873 </p>
9874 <p>(ii) $P = \{x \in R_+^n \colon A x = b\}$, where $A \in Z^{k \times n}$, $b
9875 \in Z^{k}$ and $rank\, A = k$.
9876 </p>
9877 <p>And let all rank order minors of $A$ be bounded by $\Delta$ in absolute
9878 values. We show that the short rational generating function for the power
9879 series $$ \sum\limits_{m \in P \cap Z^n} x^m $$ can be computed with the
9880 arithmetic complexity $ O\left(T_{SNF}(d) \cdot d^{k} \cdot d^{\log_2
9881 \Delta}\right), $ where $k$ and $\Delta$ are fixed, $d = \dim P$, and
9882 $T_{SNF}(m)$ is the complexity to compute the Smith Normal Form for $m \times
9883 m$ integer matrix. In particular, $d = n$ for the case (i) and $d = n-k$ for
9884 the case (ii).
9885 </p>
9886 <p>The simplest examples of polyhedra that meet conditions (i) or (ii) are the
9887 simplicies, the subset sum polytope and the knapsack or multidimensional
9888 knapsack polytopes.
9889 </p>
9890 <p>We apply these results to parametric polytopes, and show that the step
9891 polynomial representation of the function $c_P(y) = |P_{y} \cap Z^n|$, where
9892 $P_{y}$ is parametric polytope, can be computed by a polynomial time even in
9893 varying dimension if $P_{y}$ has a close structure to the cases (i) or (ii). As
9894 another consequence, we show that the coefficients $e_i(P,m)$ of the Ehrhart
9895 quasi-polynomial $$ \left| mP \cap Z^n\right| = \sum\limits_{j = 0}^n
9896 e_i(P,m)m^j $$ can be computed by a polynomial time algorithm for fixed $k$ and
9897 $\Delta$.
9898 </p>
9899 </description>
9900 <guid isPermaLink="false">oai:arXiv.org:2010.05768</guid>
9901 </item>
9902 <item>
9903 <title>CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations. (arXiv:2010.06351v3 [cs.CL] UPDATED)</title>
9904 <link>http://fr.arxiv.org/abs/2010.06351</link>
9905 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Luo_F/0/1/0/all/0/1">Fuli Luo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yang_P/0/1/0/all/0/1">Pengcheng Yang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_S/0/1/0/all/0/1">Shicheng Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ren_X/0/1/0/all/0/1">Xuancheng Ren</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Sun_X/0/1/0/all/0/1">Xu Sun</a></p>
9906
9907 <p>Pre-trained self-supervised models such as BERT have achieved striking
9908 success in learning sequence representations, especially for natural language
9909 processing. These models typically corrupt the given sequences with certain
9910 types of noise, such as masking, shuffling, or substitution, and then try to
9911 recover the original input. However, such pre-training approaches are prone to
9912 learning representations that are covariant with the noise, leading to the
9913 discrepancy between the pre-training and fine-tuning stage. To remedy this, we
9914 present ContrAstive Pre-Training (CAPT) to learn noise invariant sequence
9915 representations. The proposed CAPT encourages the consistency between
9916 representations of the original sequence and its corrupted version via
9917 unsupervised instance-wise training signals. In this way, it not only
9918 alleviates the pretrain-finetune discrepancy induced by the noise of
9919 pre-training, but also aids the pre-trained model in better capturing global
9920 semantics of the input via more effective sentence-level supervision. Different
9921 from most prior work that focuses on a particular modality, comprehensive
9922 empirical evidence on 11 natural language understanding and cross-modal tasks
9923 illustrates that CAPT is applicable for both language and vision-language
9924 tasks, and obtains surprisingly consistent improvement, including 0.6% absolute
9925 gain on GLUE benchmarks and 0.8% absolute increment on NLVR.
9926 </p>
9927 </description>
9928 <guid isPermaLink="false">oai:arXiv.org:2010.06351</guid>
9929 </item>
9930 <item>
9931 <title>Spherical Knowledge Distillation. (arXiv:2010.07485v2 [cs.LG] UPDATED)</title>
9932 <link>http://fr.arxiv.org/abs/2010.07485</link>
9933 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Guo_J/0/1/0/all/0/1">Jia Guo</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_M/0/1/0/all/0/1">Minghao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hu_Y/0/1/0/all/0/1">Yao Hu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhu_C/0/1/0/all/0/1">Chen Zhu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_X/0/1/0/all/0/1">Xiaofei He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cai_D/0/1/0/all/0/1">Deng Cai</a></p>
9934
9935 <p>Knowledge distillation aims at obtaining a small but effective deep model by
9936 transferring knowledge from a much larger one. The previous approaches try to
9937 reach this goal by simply "logit-supervised" information transferring between
9938 the teacher and student, which somehow can be subsequently decomposed as the
9939 transfer of normalized logits and $l^2$ norm. We argue that the norm of logits
9940 is actually interference, which damages the efficiency in the transfer process.
9941 To address this problem, we propose Spherical Knowledge Distillation (SKD).
9942 Specifically, we project the teacher and the student's logits into a unit
9943 sphere, and then we can efficiently perform knowledge distillation on the
9944 sphere. We verify our argument via theoretical analysis and ablation study.
9945 Extensive experiments have demonstrated the superiority and scalability of our
9946 method over the SOTAs.
9947 </p>
9948 </description>
9949 <guid isPermaLink="false">oai:arXiv.org:2010.07485</guid>
9950 </item>
9951 <item>
9952 <title>Measuring the Dynamic Impact of High-Speed Railways on Urban Interactions in China. (arXiv:2010.08182v3 [cs.SI] UPDATED)</title>
9953 <link>http://fr.arxiv.org/abs/2010.08182</link>
9954 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gong_J/0/1/0/all/0/1">Junfang Gong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Li_S/0/1/0/all/0/1">Shengwen Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ye_X/0/1/0/all/0/1">Xinyue Ye</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Peng_Q/0/1/0/all/0/1">Qiong Peng</a></p>
9955
9956 <p>High-speed rail (HSR) has become an important mode of inter-city
9957 transportation between large cities. Inter-city interaction facilitated by HSR
9958 tends to play a more prominent role in promoting urban and regional economic
9959 integration and development. Quantifying the impact of HSR's interaction on
9960 cities and people is therefore crucial for long-term urban and regional
9961 development planning and policy making. We develop an evaluation framework
9962 using toponym information from social media as a proxy to estimate the dynamics
9963 of such interactions. This paper adopts two types of spatial information:
9964 toponyms from social media posts, and the geographical location information
9965 embedded in social media posts. The framework highlights the asymmetric nature
9966 of social interaction among cities, and proposes a series of metrics to
9967 quantify such impact from multiple perspectives, including interaction
9968 strength, spatial decay, and channel effect. The results show that HSRs not
9969 only greatly expand the uneven distribution of inter-city connections, but also
9970 significantly reshape the interactions that occur along HSR routes through the
9971 channel effect.
9972 </p>
9973 </description>
9974 <guid isPermaLink="false">oai:arXiv.org:2010.08182</guid>
9975 </item>
9976 <item>
9977 <title>Learning Accurate Entropy Model with Global Reference for Image Compression. (arXiv:2010.08321v2 [eess.IV] UPDATED)</title>
9978 <link>http://fr.arxiv.org/abs/2010.08321</link>
9979 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Qian_Y/0/1/0/all/0/1">Yichen Qian</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Tan_Z/0/1/0/all/0/1">Zhiyu Tan</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sun_X/0/1/0/all/0/1">Xiuyu Sun</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Lin_M/0/1/0/all/0/1">Ming Lin</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_D/0/1/0/all/0/1">Dongyang Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Sun_Z/0/1/0/all/0/1">Zhenhong Sun</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_H/0/1/0/all/0/1">Hao Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Jin_R/0/1/0/all/0/1">Rong Jin</a></p>
9980
9981 <p>In recent deep image compression neural networks, the entropy model plays a
9982 critical role in estimating the prior distribution of deep image encodings.
9983 Existing methods combine hyperprior with local context in the entropy
9984 estimation function. This greatly limits their performance due to the absence
9985 of a global vision. In this work, we propose a novel Global Reference Model for
9986 image compression to effectively leverage both the local and the global context
9987 information, leading to an enhanced compression rate. The proposed method scans
9988 decoded latents and then finds the most relevant latent to assist the
9989 distribution estimating of the current latent. A by-product of this work is the
9990 innovation of a mean-shifting GDN module that further improves the performance.
9991 Experimental results demonstrate that the proposed model outperforms the
9992 rate-distortion performance of most of the state-of-the-art methods in the
9993 industry.
9994 </p>
9995 </description>
9996 <guid isPermaLink="false">oai:arXiv.org:2010.08321</guid>
9997 </item>
9998 <item>
9999 <title>A Grid-based Representation for Human Action Recognition. (arXiv:2010.08841v2 [cs.CV] UPDATED)</title>
10000 <link>http://fr.arxiv.org/abs/2010.08841</link>
10001 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Lamghari_S/0/1/0/all/0/1">Soufiane Lamghari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bilodeau_G/0/1/0/all/0/1">Guillaume-Alexandre Bilodeau</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Saunier_N/0/1/0/all/0/1">Nicolas Saunier</a></p>
10002
10003 <p>Human action recognition (HAR) in videos is a fundamental research topic in
10004 computer vision. It consists mainly in understanding actions performed by
10005 humans based on a sequence of visual observations. In recent years, HAR have
10006 witnessed significant progress, especially with the emergence of deep learning
10007 models. However, most of existing approaches for action recognition rely on
10008 information that is not always relevant for this task, and are limited in the
10009 way they fuse the temporal information. In this paper, we propose a novel
10010 method for human action recognition that encodes efficiently the most
10011 discriminative appearance information of an action with explicit attention on
10012 representative pose features, into a new compact grid representation. Our GRAR
10013 (Grid-based Representation for Action Recognition) method is tested on several
10014 benchmark datasets demonstrating that our model can accurately recognize human
10015 actions, despite intra-class appearance variations and occlusion challenges.
10016 </p>
10017 </description>
10018 <guid isPermaLink="false">oai:arXiv.org:2010.08841</guid>
10019 </item>
10020 <item>
10021 <title>What breach? Measuring online awareness of security incidents by studying real-world browsing behavior. (arXiv:2010.09843v2 [cs.CR] UPDATED)</title>
10022 <link>http://fr.arxiv.org/abs/2010.09843</link>
10023 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Bhagavatula_S/0/1/0/all/0/1">Sruti Bhagavatula</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bauer_L/0/1/0/all/0/1">Lujo Bauer</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kapadia_A/0/1/0/all/0/1">Apu Kapadia</a></p>
10024
10025 <p>Awareness about security and privacy risks is important for developing good
10026 security habits. Learning about real-world security incidents and data breaches
10027 can alert people to the ways in which their information is vulnerable online,
10028 thus playing a significant role in encouraging safe security behavior. This
10029 paper examines 1) how often people read about security incidents online, 2) of
10030 those people, whether and to what extent they follow up with an action, e.g.,
10031 by trying to read more about the incident, and 3) what influences the
10032 likelihood that they will read about an incident and take some action. We study
10033 this by quantitatively examining real-world internet-browsing data from 303
10034 participants.
10035 </p>
10036 <p>Our findings present a bleak view of awareness of security incidents. Only
10037 17% of participants visited any web pages related to six widely publicized
10038 large-scale security incidents; few read about one even when an incident was
10039 likely to have affected them (e.g., the Equifax breach almost universally
10040 affected people with Equifax credit reports). We further found that more severe
10041 incidents as well as articles that constructively spoke about the incident
10042 inspired more action. We conclude with recommendations for specific future
10043 research and for enabling useful security incident information to reach more
10044 people.
10045 </p>
10046 </description>
10047 <guid isPermaLink="false">oai:arXiv.org:2010.09843</guid>
10048 </item>
10049 <item>
10050 <title>VarGrad: A Low-Variance Gradient Estimator for Variational Inference. (arXiv:2010.10436v2 [stat.ML] UPDATED)</title>
10051 <link>http://fr.arxiv.org/abs/2010.10436</link>
10052 <description><p>Authors: <a href="http://fr.arxiv.org/find/stat/1/au:+Richter_L/0/1/0/all/0/1">Lorenz Richter</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Boustati_A/0/1/0/all/0/1">Ayman Boustati</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Nusken_N/0/1/0/all/0/1">Nikolas N&#xfc;sken</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Ruiz_F/0/1/0/all/0/1">Francisco J. R. Ruiz</a>, <a href="http://fr.arxiv.org/find/stat/1/au:+Akyildiz_O/0/1/0/all/0/1">&#xd6;mer Deniz Akyildiz</a></p>
10053
10054 <p>We analyse the properties of an unbiased gradient estimator of the ELBO for
10055 variational inference, based on the score function method with leave-one-out
10056 control variates. We show that this gradient estimator can be obtained using a
10057 new loss, defined as the variance of the log-ratio between the exact posterior
10058 and the variational approximation, which we call the $\textit{log-variance
10059 loss}$. Under certain conditions, the gradient of the log-variance loss equals
10060 the gradient of the (negative) ELBO. We show theoretically that this gradient
10061 estimator, which we call $\textit{VarGrad}$ due to its connection to the
10062 log-variance loss, exhibits lower variance than the score function method in
10063 certain settings, and that the leave-one-out control variate coefficients are
10064 close to the optimal ones. We empirically demonstrate that VarGrad offers a
10065 favourable variance versus computation trade-off compared to other
10066 state-of-the-art estimators on a discrete VAE.
10067 </p>
10068 </description>
10069 <guid isPermaLink="false">oai:arXiv.org:2010.10436</guid>
10070 </item>
10071 <item>
10072 <title>A Coarse-To-Fine (C2F) Representation for End-To-End 6-DoF Grasp Detection. (arXiv:2010.10695v2 [cs.RO] UPDATED)</title>
10073 <link>http://fr.arxiv.org/abs/2010.10695</link>
10074 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Jeng_K/0/1/0/all/0/1">Kuang-Yu Jeng</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1">Yueh-Cheng Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Liu_Z/0/1/0/all/0/1">Zhe Yu Liu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_J/0/1/0/all/0/1">Jen-Wei Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chang_Y/0/1/0/all/0/1">Ya-Liang Chang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Su_H/0/1/0/all/0/1">Hung-Ting Su</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hsu_W/0/1/0/all/0/1">Winston Hsu</a></p>
10075
10076 <p>We proposed an end-to-end grasp detection network, Grasp Detection Network
10077 (GDN), cooperated with a novel coarse-to-fine (C2F) grasp representation design
10078 to detect diverse and accurate 6-DoF grasps based on point clouds. Compared to
10079 previous two-stage approaches which sample and evaluate multiple grasp
10080 candidates, our architecture is at least 20 times faster. It is also 8% and 40%
10081 more accurate in terms of the success rate in single object scenes and the
10082 complete rate in clutter scenes, respectively. Our method shows superior
10083 results among settings with different number of views and input points.
10084 Moreover, we propose a new AP-based metric which considers both rotation and
10085 transition errors, making it a more comprehensive evaluation tool for grasp
10086 detection models.
10087 </p>
10088 </description>
10089 <guid isPermaLink="false">oai:arXiv.org:2010.10695</guid>
10090 </item>
10091 <item>
10092 <title>Model selection in reconciling hierarchical time series. (arXiv:2010.10742v2 [cs.LG] UPDATED)</title>
10093 <link>http://fr.arxiv.org/abs/2010.10742</link>
10094 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Abolghasemi_M/0/1/0/all/0/1">Mahdi Abolghasemi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hyndman_R/0/1/0/all/0/1">Rob J Hyndman</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Spiliotis_E/0/1/0/all/0/1">Evangelos Spiliotis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bergmeir_C/0/1/0/all/0/1">Christoph Bergmeir</a></p>
10095
10096 <p>Model selection has been proven an effective strategy for improving accuracy
10097 in time series forecasting applications. However, when dealing with
10098 hierarchical time series, apart from selecting the most appropriate forecasting
10099 model, forecasters have also to select a suitable method for reconciling the
10100 base forecasts produced for each series to make sure they are coherent.
10101 Although some hierarchical forecasting methods like minimum trace are strongly
10102 supported both theoretically and empirically for reconciling the base
10103 forecasts, there are still circumstances under which they might not produce the
10104 most accurate results, being outperformed by other methods. In this paper we
10105 propose an approach for dynamically selecting the most appropriate hierarchical
10106 forecasting method and succeeding better forecasting accuracy along with
10107 coherence. The approach, to be called conditional hierarchical forecasting, is
10108 based on Machine Learning classification methods and uses time series features
10109 as leading indicators for performing the selection for each hierarchy examined
10110 considering a variety of alternatives. Our results suggest that conditional
10111 hierarchical forecasting leads to significantly more accurate forecasts than
10112 standard approaches, especially at lower hierarchical levels.
10113 </p>
10114 </description>
10115 <guid isPermaLink="false">oai:arXiv.org:2010.10742</guid>
10116 </item>
10117 <item>
10118 <title>Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition. (arXiv:2010.10759v3 [cs.SD] UPDATED)</title>
10119 <link>http://fr.arxiv.org/abs/2010.10759</link>
10120 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shi_Y/0/1/0/all/0/1">Yangyang Shi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1">Yongqiang Wang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_C/0/1/0/all/0/1">Chunyang Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Yeh_C/0/1/0/all/0/1">Ching-Feng Yeh</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chan_J/0/1/0/all/0/1">Julian Chan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_F/0/1/0/all/0/1">Frank Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Le_D/0/1/0/all/0/1">Duc Le</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Seltzer_M/0/1/0/all/0/1">Mike Seltzer</a></p>
10121
10122 <p>This paper proposes an efficient memory transformer Emformer for low latency
10123 streaming speech recognition. In Emformer, the long-range history context is
10124 distilled into an augmented memory bank to reduce self-attention's computation
10125 complexity. A cache mechanism saves the computation for the key and value in
10126 self-attention for the left context. Emformer applies a parallelized block
10127 processing in training to support low latency models. We carry out experiments
10128 on benchmark LibriSpeech data. Under average latency of 960 ms, Emformer gets
10129 WER $2.50\%$ on test-clean and $5.62\%$ on test-other. Comparing with a strong
10130 baseline augmented memory transformer (AM-TRF), Emformer gets $4.6$ folds
10131 training speedup and $18\%$ relative real-time factor (RTF) reduction in
10132 decoding with relative WER reduction $17\%$ on test-clean and $9\%$ on
10133 test-other. For a low latency scenario with an average latency of 80 ms,
10134 Emformer achieves WER $3.01\%$ on test-clean and $7.09\%$ on test-other.
10135 Comparing with the LSTM baseline with the same latency and model size, Emformer
10136 gets relative WER reduction $9\%$ and $16\%$ on test-clean and test-other,
10137 respectively.
10138 </p>
10139 </description>
10140 <guid isPermaLink="false">oai:arXiv.org:2010.10759</guid>
10141 </item>
10142 <item>
10143 <title>Large-Scale High PV Power Grid Dynamic Model Development -- A Case Study on the U.S. Eastern Interconnection. (arXiv:2010.11150v2 [eess.SY] UPDATED)</title>
10144 <link>http://fr.arxiv.org/abs/2010.11150</link>
10145 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+You_S/0/1/0/all/0/1">Shutang You</a></p>
10146
10147 <p>Power systems are undergoing a transformation toward a low-carbon
10148 non-synchronous generation portfolio. A major concern for system planners and
10149 operators is the system dynamics in the high renewable penetration future.
10150 Because of the scale of the system and numerous components involved, it is
10151 extremely difficult to develop high PV dynamic models based upon actual power
10152 system models. The main contribution of this paper is providing an example of
10153 developing high PV penetration models based on the validated dynamic model of
10154 an actual large-scale power grid - the U.S. Eastern Interconnection system. The
10155 displacement of conventional generators by PV is realized by optimization.
10156 Combining the PV distribution optimization and the validated dynamic model
10157 information, this approach avoids the uncertainties brought about by
10158 transmission planning. As the existing dynamic models can be validated by
10159 measurements, this approach improves the credibility of the high PV models in
10160 representing future power grids. This generic approach can be applied to
10161 develop high PV dynamic models for other actual large-scale systems.
10162 </p>
10163 </description>
10164 <guid isPermaLink="false">oai:arXiv.org:2010.11150</guid>
10165 </item>
10166 <item>
10167 <title>Build Smart Grids on Artificial Intelligence -- A Real-world Example. (arXiv:2010.11175v2 [eess.SY] UPDATED)</title>
10168 <link>http://fr.arxiv.org/abs/2010.11175</link>
10169 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+You_S/0/1/0/all/0/1">Shutang You</a></p>
10170
10171 <p>Power grid data are going big with the deployment of various sensors. The big
10172 data in power grids creates huge opportunities for applying artificial
10173 intelligence technologies to improve resilience and reliability. This paper
10174 introduces multiple real-world applications based on artificial intelligence to
10175 improve power grid situational awareness and resilience. These applications
10176 include event identification, inertia estimation, event location and magnitude
10177 estimation, data authentication, control, and stability assessment. These
10178 applications are operating on a real-world system called FNET-GridEye, which is
10179 a wide-area measurement network and arguably the world-largest cyber-physical
10180 system that collects power grid big data. These applications showed much better
10181 performance compared with conventional approaches and accomplished new tasks
10182 that are impossible to realized using conventional technologies. These
10183 encouraging results demonstrate that combining power grid big data and
10184 artificial intelligence can uncover and capture the non-linear correlation
10185 between power grid data and its stabilities indices and will potentially enable
10186 many advanced applications that can significantly improve power grid
10187 resilience.
10188 </p>
10189 </description>
10190 <guid isPermaLink="false">oai:arXiv.org:2010.11175</guid>
10191 </item>
10192 <item>
10193 <title>NightOwl: Robotic Platform for Wheeled Service Robot. (arXiv:2010.11505v2 [cs.RO] UPDATED)</title>
10194 <link>http://fr.arxiv.org/abs/2010.11505</link>
10195 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Al_Fahsi_R/0/1/0/all/0/1">Resha Dwika Hefni Al-Fahsi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Winanta_K/0/1/0/all/0/1">Kevin Aldian Winanta</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pradana_F/0/1/0/all/0/1">Fauzan Pradana</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ardiyanto_I/0/1/0/all/0/1">Igi Ardiyanto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Cahyadi_A/0/1/0/all/0/1">Adha Imam Cahyadi</a></p>
10196
10197 <p>NightOwl is a robotic platform designed exclusively for a wheeled service
10198 robot. The robot navigates autonomously in omnidirectional fashion movement and
10199 equipped with LIDAR to sense the surrounding area. The platform itself was
10200 built using the Robot Operating System (ROS) and written in two different
10201 programming languages (C++ and Python). NightOwl is composed of several modular
10202 programs, namely hardware controller, light detection and ranging (LIDAR),
10203 simultaneous localization and mapping (SLAM), world model, path planning, robot
10204 control, communication, and behaviour. The programs run in parallel and
10205 communicate reciprocally to share various information. This paper explains the
10206 role of modular programs in the term of input, process, and output. In
10207 addition, NightOwl provides simulation visualized in both Gazebo and RViz. The
10208 robot in its environment is visualized by Gazebo. Sensor data from LIDAR and
10209 results from SLAM will be visualized by RViz.
10210 </p>
10211 </description>
10212 <guid isPermaLink="false">oai:arXiv.org:2010.11505</guid>
10213 </item>
10214 <item>
10215 <title>Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity. (arXiv:2010.11775v2 [cs.LG] UPDATED)</title>
10216 <link>http://fr.arxiv.org/abs/2010.11775</link>
10217 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Chen_S/0/1/0/all/0/1">Shuxiao Chen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+He_H/0/1/0/all/0/1">Hangfeng He</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Su_W/0/1/0/all/0/1">Weijie J. Su</a></p>
10218
10219 <p>As a popular approach to modeling the dynamics of training overparametrized
10220 neural networks (NNs), the neural tangent kernels (NTK) are known to fall
10221 behind real-world NNs in generalization ability. This performance gap is in
10222 part due to the \textit{label agnostic} nature of the NTK, which renders the
10223 resulting kernel not as \textit{locally elastic} as NNs~\citep{he2019local}. In
10224 this paper, we introduce a novel approach from the perspective of
10225 \emph{label-awareness} to reduce this gap for the NTK. Specifically, we propose
10226 two label-aware kernels that are each a superimposition of a label-agnostic
10227 part and a hierarchy of label-aware parts with increasing complexity of label
10228 dependence, using the Hoeffding decomposition. Through both theoretical and
10229 empirical evidence, we show that the models trained with the proposed kernels
10230 better simulate NNs in terms of generalization ability and local elasticity.
10231 </p>
10232 </description>
10233 <guid isPermaLink="false">oai:arXiv.org:2010.11775</guid>
10234 </item>
10235 <item>
10236 <title>The Polynomial Method is Universal for Distribution-Free Correlational SQ Learning. (arXiv:2010.11925v2 [cs.DS] UPDATED)</title>
10237 <link>http://fr.arxiv.org/abs/2010.11925</link>
10238 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Gollakota_A/0/1/0/all/0/1">Aravind Gollakota</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Karmalkar_S/0/1/0/all/0/1">Sushrut Karmalkar</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Klivans_A/0/1/0/all/0/1">Adam Klivans</a></p>
10239
10240 <p>We consider the problem of distribution-free learning for Boolean function
10241 classes in the PAC and agnostic models. Generalizing a recent beautiful work of
10242 Malach and Shalev-Shwartz (2020) who gave the first tight correlational SQ
10243 (CSQ) lower bounds for learning DNF formulas, we show that lower bounds on the
10244 threshold or approximate degree of any function class directly imply CSQ lower
10245 bounds for PAC or agnostic learning respectively. These match corresponding
10246 positive results using upper bounds on the threshold or approximate degree in
10247 the SQ model for PAC or agnostic learning. Many of these results were implicit
10248 in earlier works of Feldman and Sherstov.
10249 </p>
10250 </description>
10251 <guid isPermaLink="false">oai:arXiv.org:2010.11925</guid>
10252 </item>
10253 <item>
10254 <title>Escape saddle points faster on manifolds via perturbed Riemannian stochastic recursive gradient. (arXiv:2010.12191v2 [math.OC] UPDATED)</title>
10255 <link>http://fr.arxiv.org/abs/2010.12191</link>
10256 <description><p>Authors: <a href="http://fr.arxiv.org/find/math/1/au:+Han_A/0/1/0/all/0/1">Andi Han</a>, <a href="http://fr.arxiv.org/find/math/1/au:+Gao_J/0/1/0/all/0/1">Junbin Gao</a></p>
10257
10258 <p>In this paper, we propose a variant of Riemannian stochastic recursive
10259 gradient method that can achieve second-order convergence guarantee and escape
10260 saddle points using simple perturbation. The idea is to perturb the iterates
10261 when gradient is small and carry out stochastic recursive gradient updates over
10262 tangent space. This avoids the complication of exploiting Riemannian geometry.
10263 We show that under finite-sum setting, our algorithm requires
10264 $\widetilde{\mathcal{O}}\big( \frac{ \sqrt{n}}{\epsilon^2} + \frac{\sqrt{n}
10265 }{\delta^4} + \frac{n}{\delta^3}\big)$ stochastic gradient queries to find a
10266 $(\epsilon, \delta)$-second-order critical point. This strictly improves the
10267 complexity of perturbed Riemannian gradient descent and is superior to
10268 perturbed Riemannian accelerated gradient descent under large-sample settings.
10269 We also provide a complexity of $\widetilde{\mathcal{O}} \big(
10270 \frac{1}{\epsilon^3} + \frac{1}{\delta^3 \epsilon^2} + \frac{1}{\delta^4
10271 \epsilon} \big)$ for online optimization, which is novel on Riemannian manifold
10272 in terms of second-order convergence using only first-order information.
10273 </p>
10274 </description>
10275 <guid isPermaLink="false">oai:arXiv.org:2010.12191</guid>
10276 </item>
10277 <item>
10278 <title>On the mechanical contribution of head stabilization to passive dynamics of anthropometric walkers. (arXiv:2010.12234v2 [cs.RO] UPDATED)</title>
10279 <link>http://fr.arxiv.org/abs/2010.12234</link>
10280 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Benallegue_M/0/1/0/all/0/1">Mehdi Benallegue</a> (AIST), <a href="http://fr.arxiv.org/find/cs/1/au:+Laumond_J/0/1/0/all/0/1">Jean-Paul Laumond</a> (DI-ENS), <a href="http://fr.arxiv.org/find/cs/1/au:+Berthoz_A/0/1/0/all/0/1">Alain Berthoz</a> (CdF (institution))</p>
10281
10282 <p>During the steady gait, humans stabilize their head around the vertical
10283 orientation. While there are sensori-cognitive explanations for this
10284 phenomenon, its mechanical e fect on the body dynamics remains un-explored. In
10285 this study, we take profit from the similarities that human steady gait share
10286 with the locomotion of passive dynamics robots. We introduce a simplified
10287 anthropometric D model to reproduce a broad walking dynamics. In a previous
10288 study, we showed heuristically that the presence of a stabilized head-neck
10289 system significantly influences the dynamics of walking. This paper gives new
10290 insights that lead to understanding this mechanical e fect. In particular, we
10291 introduce an original cart upper-body model that allows to better understand
10292 the mechanical interest of head stabilization when walking, and we study how
10293 this e fect is sensitive to the choice of control parameters.
10294 </p>
10295 </description>
10296 <guid isPermaLink="false">oai:arXiv.org:2010.12234</guid>
10297 </item>
10298 <item>
10299 <title>Exploring task-based query expansion at the TREC-COVID track. (arXiv:2010.12674v2 [cs.IR] UPDATED)</title>
10300 <link>http://fr.arxiv.org/abs/2010.12674</link>
10301 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Schoegje_T/0/1/0/all/0/1">Thomas Schoegje</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Kamphuis_C/0/1/0/all/0/1">Chris Kamphuis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Dercksen_K/0/1/0/all/0/1">Koen Dercksen</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hiemstra_D/0/1/0/all/0/1">Djoerd Hiemstra</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Pieters_T/0/1/0/all/0/1">Toine Pieters</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Vries_A/0/1/0/all/0/1">Arjen de Vries</a></p>
10302
10303 <p>We explore how to generate effective queries based on search tasks. Our
10304 approach has three main steps: 1) identify search tasks based on research
10305 goals, 2) manually classify search queries according to those tasks, and 3)
10306 compare three methods to improve search rankings based on the task context. The
10307 most promising approach is based on expanding the user's query terms using task
10308 terms, which slightly improved the NDCG@20 scores over a BM25 baseline. Further
10309 improvements might be gained if we can identify more specific search tasks.
10310 </p>
10311 </description>
10312 <guid isPermaLink="false">oai:arXiv.org:2010.12674</guid>
10313 </item>
10314 <item>
10315 <title>Adaptive In-network Collaborative Caching for Enhanced Ensemble Deep Learning at Edge. (arXiv:2010.12899v3 [cs.NI] UPDATED)</title>
10316 <link>http://fr.arxiv.org/abs/2010.12899</link>
10317 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Qin_Y/0/1/0/all/0/1">Yana Qin</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_D/0/1/0/all/0/1">Danye Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xu_Z/0/1/0/all/0/1">Zhiwei Xu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Tian_J/0/1/0/all/0/1">Jie Tian</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_Y/0/1/0/all/0/1">Yujun Zhang</a></p>
10318
10319 <p>To enhance the quality and speed of data processing and protect the privacy
10320 and security of the data, edge computing has been extensively applied to
10321 support data-intensive intelligent processing services at edge. Among these
10322 data-intensive services, ensemble learning-based services can in natural
10323 leverage the distributed computation and storage resources at edge devices to
10324 achieve efficient data collection, processing, analysis.
10325 </p>
10326 <p>Collaborative caching has been applied in edge computing to support services
10327 close to the data source, in order to take the limited resources at edge
10328 devices to support high-performance ensemble learning solutions. To achieve
10329 this goal, we propose an adaptive in-network collaborative caching scheme for
10330 ensemble learning at edge. First, an efficient data representation structure is
10331 proposed to record cached data among different nodes. In addition, we design a
10332 collaboration scheme to facilitate edge nodes to cache valuable data for local
10333 ensemble learning, by scheduling local caching according to a summarization of
10334 data representations from different edge nodes. Our extensive simulations
10335 demonstrate the high performance of the proposed collaborative caching scheme,
10336 which significantly reduces the learning latency and the transmission overhead.
10337 </p>
10338 </description>
10339 <guid isPermaLink="false">oai:arXiv.org:2010.12899</guid>
10340 </item>
10341 <item>
10342 <title>Lightning-Fast Gravitational Wave Parameter Inference through Neural Amortization. (arXiv:2010.12931v2 [astro-ph.IM] UPDATED)</title>
10343 <link>http://fr.arxiv.org/abs/2010.12931</link>
10344 <description><p>Authors: <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Delaunoy_A/0/1/0/all/0/1">Arnaud Delaunoy</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Wehenkel_A/0/1/0/all/0/1">Antoine Wehenkel</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Hinderer_T/0/1/0/all/0/1">Tanja Hinderer</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Nissanke_S/0/1/0/all/0/1">Samaya Nissanke</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Weniger_C/0/1/0/all/0/1">Christoph Weniger</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Williamson_A/0/1/0/all/0/1">Andrew R. Williamson</a>, <a href="http://fr.arxiv.org/find/astro-ph/1/au:+Louppe_G/0/1/0/all/0/1">Gilles Louppe</a></p>
10345
10346 <p>Gravitational waves from compact binaries measured by the LIGO and Virgo
10347 detectors are routinely analyzed using Markov Chain Monte Carlo sampling
10348 algorithms. Because the evaluation of the likelihood function requires
10349 evaluating millions of waveform models that link between signal shapes and the
10350 source parameters, running Markov chains until convergence is typically
10351 expensive and requires days of computation. In this extended abstract, we
10352 provide a proof of concept that demonstrates how the latest advances in neural
10353 simulation-based inference can speed up the inference time by up to three
10354 orders of magnitude -- from days to minutes -- without impairing the
10355 performance. Our approach is based on a convolutional neural network modeling
10356 the likelihood-to-evidence ratio and entirely amortizes the computation of the
10357 posterior. We find that our model correctly estimates credible intervals for
10358 the parameters of simulated gravitational waves.
10359 </p>
10360 </description>
10361 <guid isPermaLink="false">oai:arXiv.org:2010.12931</guid>
10362 </item>
10363 <item>
10364 <title>A Survey on Churn Analysis. (arXiv:2010.13119v2 [cs.LG] UPDATED)</title>
10365 <link>http://fr.arxiv.org/abs/2010.13119</link>
10366 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Ahn_J/0/1/0/all/0/1">Jaehuyn Ahn</a></p>
10367
10368 <p>In this paper, I present churn prediction techniques that have been released
10369 so far. Churn prediction is used in the fields of Internet services, games,
10370 insurance, and management. However, since it has been used intensively to
10371 increase the predictability of various industry/academic fields, there is a big
10372 difference in its definition and utilization. In this paper, I collected the
10373 definitions of churn used in the fields of business administration, marketing,
10374 IT, telecommunications, newspapers, insurance and psychology, and described
10375 their differences. Based on this, I classified and explained churn loss,
10376 feature engineering, and prediction models. Our study can be used to select the
10377 definition of churn and its associated models suitable for the service field
10378 that researchers are most interested in by integrating fragmented churn studies
10379 in industry/academic fields.
10380 </p>
10381 </description>
10382 <guid isPermaLink="false">oai:arXiv.org:2010.13119</guid>
10383 </item>
10384 <item>
10385 <title>Geometric Exploration for Online Control. (arXiv:2010.13178v2 [cs.LG] UPDATED)</title>
10386 <link>http://fr.arxiv.org/abs/2010.13178</link>
10387 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Plevrakis_O/0/1/0/all/0/1">Orestis Plevrakis</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Hazan_E/0/1/0/all/0/1">Elad Hazan</a></p>
10388
10389 <p>We study the control of an \emph{unknown} linear dynamical system under
10390 general convex costs. The objective is minimizing regret vs. the class of
10391 disturbance-feedback-controllers, which encompasses all stabilizing
10392 linear-dynamical-controllers. In this work, we first consider the case of known
10393 cost functions, for which we design the first polynomial-time algorithm with
10394 $n^3\sqrt{T}$-regret, where $n$ is the dimension of the state plus the
10395 dimension of control input. The $\sqrt{T}$-horizon dependence is optimal, and
10396 improves upon the previous best known bound of $T^{2/3}$. The main component of
10397 our algorithm is a novel geometric exploration strategy: we adaptively
10398 construct a sequence of barycentric spanners in the policy space. Second, we
10399 consider the case of bandit feedback, for which we give the first
10400 polynomial-time algorithm with $poly(n)\sqrt{T}$-regret, building on Stochastic
10401 Bandit Convex Optimization.
10402 </p>
10403 </description>
10404 <guid isPermaLink="false">oai:arXiv.org:2010.13178</guid>
10405 </item>
10406 <item>
10407 <title>Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach. (arXiv:2010.13273v2 [cs.IR] UPDATED)</title>
10408 <link>http://fr.arxiv.org/abs/2010.13273</link>
10409 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Dong_Y/0/1/0/all/0/1">Yuyang Dong</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Takeoka_K/0/1/0/all/0/1">Kunihiro Takeoka</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Xiao_C/0/1/0/all/0/1">Chuan Xiao</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Oyamada_M/0/1/0/all/0/1">Masafumi Oyamada</a></p>
10410
10411 <p>Finding joinable tables in data lakes is key procedure in many applications
10412 such as data integration, data augmentation, data analysis, and data market.
10413 Traditional approaches that find equi-joinable tables are unable to deal with
10414 misspellings and different formats, nor do they capture any semantic joins. In
10415 this paper, we propose PEXESO, a framework for joinable table discovery in data
10416 lakes. We embed textual values as high-dimensional vectors and join columns
10417 under similarity predicates on high-dimensional vectors, hence to address the
10418 limitations of equi-join approaches and identify more meaningful results. To
10419 efficiently find joinable tables with similarity, we propose a block-and-verify
10420 method that utilizes pivot-based filtering. A partitioning technique is
10421 developed to cope with the case when the data lake is large and the index
10422 cannot fit in main memory. An experimental evaluation on real datasets shows
10423 that our solution identifies substantially more tables than equi-joins and
10424 outperforms other similarity-based options, and the join results are useful in
10425 data enrichment for machine learning tasks. The experiments also demonstrate
10426 the efficiency of the proposed method.
10427 </p>
10428 </description>
10429 <guid isPermaLink="false">oai:arXiv.org:2010.13273</guid>
10430 </item>
10431 <item>
10432 <title>Malicious Requests Detection with Improved Bidirectional Long Short-term Memory Neural Networks. (arXiv:2010.13285v2 [cs.LG] UPDATED)</title>
10433 <link>http://fr.arxiv.org/abs/2010.13285</link>
10434 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Li_W/0/1/0/all/0/1">Wenhao Li</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_B/0/1/0/all/0/1">Bincheng Zhang</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1">Jiajie Zhang</a></p>
10435
10436 <p>Detecting and intercepting malicious requests are one of the most widely used
10437 ways against attacks in the network security. Most existing detecting
10438 approaches, including matching blacklist characters and machine learning
10439 algorithms have all shown to be vulnerable to sophisticated attacks. To address
10440 the above issues, a more general and rigorous detection method is required. In
10441 this paper, we formulate the problem of detecting malicious requests as a
10442 temporal sequence classification problem, and propose a novel deep learning
10443 model namely Convolutional Neural Network-Bidirectional Long Short-term
10444 Memory-Convolutional Neural Network (CNN-BiLSTM-CNN). By connecting the shadow
10445 and deep feature maps of the convolutional layers, the malicious feature
10446 extracting ability is improved on more detailed functionality. Experimental
10447 results on HTTP dataset CSIC 2010 have demonstrated the effectiveness of the
10448 proposed method when compared with the state-of-the-arts.
10449 </p>
10450 </description>
10451 <guid isPermaLink="false">oai:arXiv.org:2010.13285</guid>
10452 </item>
10453 <item>
10454 <title>Recent Developments on ESPnet Toolkit Boosted by Conformer. (arXiv:2010.13956v2 [eess.AS] UPDATED)</title>
10455 <link>http://fr.arxiv.org/abs/2010.13956</link>
10456 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Guo_P/0/1/0/all/0/1">Pengcheng Guo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Boyer_F/0/1/0/all/0/1">Florian Boyer</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Chang_X/0/1/0/all/0/1">Xuankai Chang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Hayashi_T/0/1/0/all/0/1">Tomoki Hayashi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Higuchi_Y/0/1/0/all/0/1">Yosuke Higuchi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Inaguma_H/0/1/0/all/0/1">Hirofumi Inaguma</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Kamo_N/0/1/0/all/0/1">Naoyuki Kamo</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Li_C/0/1/0/all/0/1">Chenda Li</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Garcia_Romero_D/0/1/0/all/0/1">Daniel Garcia-Romero</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shi_J/0/1/0/all/0/1">Jiatong Shi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Shi_J/0/1/0/all/0/1">Jing Shi</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Watanabe_S/0/1/0/all/0/1">Shinji Watanabe</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Wei_K/0/1/0/all/0/1">Kun Wei</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_W/0/1/0/all/0/1">Wangyou Zhang</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zhang_Y/0/1/0/all/0/1">Yuekai Zhang</a></p>
10457
10458 <p>In this study, we present recent developments on ESPnet: End-to-End Speech
10459 Processing toolkit, which mainly involves a recently proposed architecture
10460 called Conformer, Convolution-augmented Transformer. This paper shows the
10461 results for a wide range of end-to-end speech processing applications, such as
10462 automatic speech recognition (ASR), speech translations (ST), speech separation
10463 (SS) and text-to-speech (TTS). Our experiments reveal various training tips and
10464 significant performance benefits obtained with the Conformer on different
10465 tasks. These results are competitive or even outperform the current
10466 state-of-art Transformer models. We are preparing to release all-in-one recipes
10467 using open source and publicly available corpora for all the above tasks with
10468 pre-trained models. Our aim for this work is to contribute to our research
10469 community by reducing the burden of preparing state-of-the-art research
10470 environments usually requiring high resources.
10471 </p>
10472 </description>
10473 <guid isPermaLink="false">oai:arXiv.org:2010.13956</guid>
10474 </item>
10475 <item>
10476 <title>Simultaenous Sieves: A Deterministic Streaming Algorithm for Non-Monotone Submodular Maximization. (arXiv:2010.14367v2 [cs.DS] UPDATED)</title>
10477 <link>http://fr.arxiv.org/abs/2010.14367</link>
10478 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Kuhnle_A/0/1/0/all/0/1">Alan Kuhnle</a></p>
10479
10480 <p>In this work, we present a combinatorial, deterministic single-pass streaming
10481 algorithm for the problem of maximizing a submodular function, not necessarily
10482 monotone, with respect to a cardinality constraint (SMCC). In the case the
10483 function is monotone, our algorithm reduces to the optimal streaming algorithm
10484 of Badanidiyuru et al. (2014). In general, our algorithm achieves ratio $\alpha
10485 / (1 + \alpha) - \varepsilon$, for any $\varepsilon &gt; 0$, where $\alpha$ is the
10486 ratio of an offline (deterministic) algorithm for SMCC used for
10487 post-processing. Thus, if exponential computation time is allowed, our
10488 algorithm deterministically achieves nearly the optimal $1/2$ ratio. These
10489 results nearly match those of a recently proposed, randomized streaming
10490 algorithm that achieves the same ratios in expectation. For a deterministic,
10491 single-pass streaming algorithm, our algorithm achieves in polynomial time an
10492 improvement of the best approximation factor from $1/9$ of previous literature
10493 to $\approx 0.2689$.
10494 </p>
10495 </description>
10496 <guid isPermaLink="false">oai:arXiv.org:2010.14367</guid>
10497 </item>
10498 <item>
10499 <title>Memory Optimization for Deep Networks. (arXiv:2010.14501v2 [cs.LG] UPDATED)</title>
10500 <link>http://fr.arxiv.org/abs/2010.14501</link>
10501 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Shah_A/0/1/0/all/0/1">Aashaka Shah</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Wu_C/0/1/0/all/0/1">Chao-Yuan Wu</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mohan_J/0/1/0/all/0/1">Jayashree Mohan</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Chidambaram_V/0/1/0/all/0/1">Vijay Chidambaram</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Krahenbuhl_P/0/1/0/all/0/1">Philipp Kr&#xe4;henb&#xfc;hl</a></p>
10502
10503 <p>Deep learning is slowly, but steadily, hitting a memory bottleneck. While the
10504 tensor computation in top-of-the-line GPUs increased by 32x over the last five
10505 years, the total available memory only grew by 2.5x. This prevents researchers
10506 from exploring larger architectures, as training large networks requires more
10507 memory for storing intermediate outputs. In this paper, we present MONeT, an
10508 automatic framework that minimizes both the memory footprint and computational
10509 overhead of deep networks. MONeT jointly optimizes the checkpointing schedule
10510 and the implementation of various operators. MONeT is able to outperform all
10511 prior hand-tuned operations as well as automated checkpointing. MONeT reduces
10512 the overall memory requirement by 3x for various PyTorch models, with a 9-16%
10513 overhead in computation. For the same computation cost, MONeT requires 1.2-1.8x
10514 less memory than current state-of-the-art automated checkpointing frameworks.
10515 Our code is available at https://github.com/utsaslab/MONeT.
10516 </p>
10517 </description>
10518 <guid isPermaLink="false">oai:arXiv.org:2010.14501</guid>
10519 </item>
10520 <item>
10521 <title>Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus. (arXiv:2010.14571v2 [cs.CL] UPDATED)</title>
10522 <link>http://fr.arxiv.org/abs/2010.14571</link>
10523 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Caswell_I/0/1/0/all/0/1">Isaac Caswell</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Breiner_T/0/1/0/all/0/1">Theresa Breiner</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Esch_D/0/1/0/all/0/1">Daan van Esch</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bapna_A/0/1/0/all/0/1">Ankur Bapna</a></p>
10524
10525 <p>Large text corpora are increasingly important for a wide variety of Natural
10526 Language Processing (NLP) tasks, and automatic language identification (LangID)
10527 is a core technology needed to collect such datasets in a multilingual context.
10528 LangID is largely treated as solved in the literature, with models reported
10529 that achieve over 90% average F1 on as many as 1,366 languages. We train LangID
10530 models on up to 1,629 languages with comparable quality on held-out test sets,
10531 but find that human-judged LangID accuracy for web-crawl text corpora created
10532 using these models is only around 5% for many lower-resource languages,
10533 suggesting a need for more robust evaluation. Further analysis revealed a
10534 variety of error modes, arising from domain mismatch, class imbalance, language
10535 similarity, and insufficiently expressive models. We propose two classes of
10536 techniques to mitigate these errors: wordlist-based tunable-precision filters
10537 (for which we release curated lists in about 500 languages) and
10538 transformer-based semi-supervised LangID models, which increase median dataset
10539 precision from 5.5% to 71.2%. These techniques enable us to create an initial
10540 data set covering 100K or more relatively clean sentences in each of 500+
10541 languages, paving the way towards a 1,000-language web text corpus.
10542 </p>
10543 </description>
10544 <guid isPermaLink="false">oai:arXiv.org:2010.14571</guid>
10545 </item>
10546 <item>
10547 <title>Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports. (arXiv:2010.14584v2 [cs.CL] UPDATED)</title>
10548 <link>http://fr.arxiv.org/abs/2010.14584</link>
10549 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Edwards_A/0/1/0/all/0/1">Aleksandra Edwards</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Rogers_D/0/1/0/all/0/1">David Rogers</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Camacho_Collados_J/0/1/0/all/0/1">Jose Camacho-Collados</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Ribaupierre_H/0/1/0/all/0/1">H&#xe9;l&#xe8;ne de Ribaupierre</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Preece_A/0/1/0/all/0/1">Alun Preece</a></p>
10550
10551 <p>The task of text and sentence classification is associated with the need for
10552 large amounts of labelled training data. The acquisition of high volumes of
10553 labelled datasets can be expensive or unfeasible, especially for
10554 highly-specialised domains for which documents are hard to obtain. Research on
10555 the application of supervised classification based on small amounts of training
10556 data is limited. In this paper, we address the combination of state-of-the-art
10557 deep learning and classification methods and provide an insight into what
10558 combination of methods fit the needs of small, domain-specific, and
10559 terminologically-rich corpora. We focus on a real-world scenario related to a
10560 collection of safeguarding reports comprising learning experiences and
10561 reflections on tackling serious incidents involving children and vulnerable
10562 adults. The relatively small volume of available reports and their use of
10563 highly domain-specific terminology makes the application of automated
10564 approaches difficult. We focus on the problem of automatically identifying the
10565 main themes in a safeguarding report using supervised classification
10566 approaches. Our results show the potential of deep learning models to simulate
10567 subject-expert behaviour even for complex tasks with limited labelled data.
10568 </p>
10569 </description>
10570 <guid isPermaLink="false">oai:arXiv.org:2010.14584</guid>
10571 </item>
10572 <item>
10573 <title>Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient. (arXiv:2010.14771v2 [cs.LG] UPDATED)</title>
10574 <link>http://fr.arxiv.org/abs/2010.14771</link>
10575 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Tosatto_S/0/1/0/all/0/1">Samuele Tosatto</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Carvalho_J/0/1/0/all/0/1">Jo&#xe3;o Carvalho</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Peters_J/0/1/0/all/0/1">Jan Peters</a></p>
10576
10577 <p>Off-policy Reinforcement Learning (RL) holds the promise of better data
10578 efficiency as it allows sample reuse and potentially enables safe interaction
10579 with the environment. Current off-policy policy gradient methods either suffer
10580 from high bias or high variance, delivering often unreliable estimates. The
10581 price of inefficiency becomes evident in real-world scenarios such as
10582 interaction-driven robot learning, where the success of RL has been rather
10583 limited, and a very high sample cost hinders straightforward application. In
10584 this paper, we propose a nonparametric Bellman equation, which can be solved in
10585 closed form. The solution is differentiable w.r.t the policy parameters and
10586 gives access to an estimation of the policy gradient. In this way, we avoid the
10587 high variance of importance sampling approaches, and the high bias of
10588 semi-gradient methods. We empirically analyze the quality of our gradient
10589 estimate against state-of-the-art methods, and show that it outperforms the
10590 baselines in terms of sample efficiency on classical control tasks.
10591 </p>
10592 </description>
10593 <guid isPermaLink="false">oai:arXiv.org:2010.14771</guid>
10594 </item>
10595 <item>
10596 <title>Transferable Universal Adversarial Perturbations Using Generative Models. (arXiv:2010.14919v2 [cs.CV] UPDATED)</title>
10597 <link>http://fr.arxiv.org/abs/2010.14919</link>
10598 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Hashemi_A/0/1/0/all/0/1">Atiye Sadat Hashemi</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Bar_A/0/1/0/all/0/1">Andreas B&#xe4;r</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Mozaffari_S/0/1/0/all/0/1">Saeed Mozaffari</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Fingscheidt_T/0/1/0/all/0/1">Tim Fingscheidt</a></p>
10599
10600 <p>Deep neural networks tend to be vulnerable to adversarial perturbations,
10601 which by adding to a natural image can fool a respective model with high
10602 confidence. Recently, the existence of image-agnostic perturbations, also known
10603 as universal adversarial perturbations (UAPs), were discovered. However,
10604 existing UAPs still lack a sufficiently high fooling rate, when being applied
10605 to an unknown target model. In this paper, we propose a novel deep learning
10606 technique for generating more transferable UAPs. We utilize a perturbation
10607 generator and some given pretrained networks so-called source models to
10608 generate UAPs using the ImageNet dataset. Due to the similar feature
10609 representation of various model architectures in the first layer, we propose a
10610 loss formulation that focuses on the adversarial energy only in the respective
10611 first layer of the source models. This supports the transferability of our
10612 generated UAPs to any other target model. We further empirically analyze our
10613 generated UAPs and demonstrate that these perturbations generalize very well
10614 towards different target models. Surpassing the current state of the art in
10615 both, fooling rate and model-transferability, we can show the superiority of
10616 our proposed approach. Using our generated non-targeted UAPs, we obtain an
10617 average fooling rate of 93.36% on the source models (state of the art: 82.16%).
10618 Generating our UAPs on the deep ResNet-152, we obtain about a 12% absolute
10619 fooling rate advantage vs. cutting-edge methods on VGG-16 and VGG-19 target
10620 models.
10621 </p>
10622 </description>
10623 <guid isPermaLink="false">oai:arXiv.org:2010.14919</guid>
10624 </item>
10625 <item>
10626 <title>Estimating Multiplicative Relations in Neural Networks. (arXiv:2010.15003v2 [cs.LG] UPDATED)</title>
10627 <link>http://fr.arxiv.org/abs/2010.15003</link>
10628 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Goel_B/0/1/0/all/0/1">Bhaavan Goel</a></p>
10629
10630 <p>Universal approximation theorem suggests that a shallow neural network can
10631 approximate any function. The input to neurons at each layer is a weighted sum
10632 of previous layer neurons and then an activation is applied. These activation
10633 functions perform very well when the output is a linear combination of input
10634 data. When trying to learn a function which involves product of input data, the
10635 neural networks tend to overfit the data to approximate the function. In this
10636 paper we will use properties of logarithmic functions to propose a pair of
10637 activation functions which can translate products into linear expression and
10638 learn using backpropagation. We will try to generalize this approach for some
10639 complex arithmetic functions and test the accuracy on a disjoint distribution
10640 with the training set.
10641 </p>
10642 </description>
10643 <guid isPermaLink="false">oai:arXiv.org:2010.15003</guid>
10644 </item>
10645 <item>
10646 <title>Benchmarking Parallelism in FaaS Platforms. (arXiv:2010.15032v2 [cs.DC] UPDATED)</title>
10647 <link>http://fr.arxiv.org/abs/2010.15032</link>
10648 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Barcelona_Pons_D/0/1/0/all/0/1">Daniel Barcelona-Pons</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Garcia_Lopez_P/0/1/0/all/0/1">Pedro Garc&#xed;a-L&#xf3;pez</a></p>
10649
10650 <p>Serverless computing has seen a myriad of work exploring its potential. Some
10651 systems tackle Function-as-a-Service (FaaS) properties on automatic elasticity
10652 and scale to run highly-parallel computing jobs. However, they focus on
10653 specific platforms and convey that their ideas can be extrapolated to any FaaS
10654 runtime.
10655 </p>
10656 <p>An important question arises: do all FaaS platforms fit parallel
10657 computations? In this paper, we argue that not all of them provide the
10658 necessary means to host highly-parallel applications. To validate our
10659 hypothesis, we create a comparative framework and categorize the architectures
10660 of four cloud FaaS offerings, with emphasis on parallel performance. We attest
10661 and extend this description with an empirical experiment that consists in
10662 plotting in deep detail the evolution of a parallel computing job on each
10663 service.
10664 </p>
10665 <p>The analysis of our results evinces that FaaS is not inherently good for
10666 parallel computations and architectural differences across platforms are
10667 decisive to categorize their performance. A key insight is the importance of
10668 virtualization technologies and the scheduling approach of FaaS platforms.
10669 Parallelism improves with lighter virtualization and proactive scheduling due
10670 to finer resource allocation and faster elasticity. This causes some platforms
10671 like AWS and IBM to perform well for highly-parallel computations, while others
10672 such as Azure present difficulties to achieve the required parallelism degree.
10673 Consequently, the information in this paper becomes of special interest to help
10674 users choose the most adequate infrastructure for their parallel applications.
10675 </p>
10676 </description>
10677 <guid isPermaLink="false">oai:arXiv.org:2010.15032</guid>
10678 </item>
10679 <item>
10680 <title>Measuring non-trivial compositionality in emergent communication. (arXiv:2010.15058v2 [cs.NE] UPDATED)</title>
10681 <link>http://fr.arxiv.org/abs/2010.15058</link>
10682 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Korbak_T/0/1/0/all/0/1">Tomasz Korbak</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Zubek_J/0/1/0/all/0/1">Julian Zubek</a>, <a href="http://fr.arxiv.org/find/cs/1/au:+Raczaszek_Leonardi_J/0/1/0/all/0/1">Joanna R&#x105;czaszek-Leonardi</a></p>
10683
10684 <p>Compositionality is an important explanatory target in emergent communication
10685 and language evolution. The vast majority of computational models of
10686 communication account for the emergence of only a very basic form of
10687 compositionality: trivial compositionality. A compositional protocol is
10688 trivially compositional if the meaning of a complex signal (e.g. blue circle)
10689 boils down to the intersection of meanings of its constituents (e.g. the
10690 intersection of the set of blue objects and the set of circles). A protocol is
10691 non-trivially compositional (NTC) if the meaning of a complex signal (e.g.
10692 biggest apple) is a more complex function of the meanings of their
10693 constituents. In this paper, we review several metrics of compositionality used
10694 in emergent communication and experimentally show that most of them fail to
10695 detect NTC - i.e. they treat non-trivial compositionality as a failure of
10696 compositionality. The one exception is tree reconstruction error, a metric
10697 motivated by formal accounts of compositionality. These results emphasise
10698 important limitations of emergent communication research that could hamper
10699 progress on modelling the emergence of NTC.
10700 </p>
10701 </description>
10702 <guid isPermaLink="false">oai:arXiv.org:2010.15058</guid>
10703 </item>
10704 <item>
10705 <title>The fundamental equations of change in statistical ensembles and biological populations. (arXiv:2010.14544v1 [q-bio.PE] CROSS LISTED)</title>
10706 <link>http://fr.arxiv.org/abs/2010.14544</link>
10707 <description><p>Authors: <a href="http://fr.arxiv.org/find/q-bio/1/au:+Frank_S/0/1/0/all/0/1">Steven A. Frank</a>, <a href="http://fr.arxiv.org/find/q-bio/1/au:+Bruggeman_F/0/1/0/all/0/1">Frank J. Bruggeman</a></p>
10708
10709 <p>A recent article in Nature Physics unified key results from thermodynamics,
10710 statistics, and information theory. The unification arose from a general
10711 equation for the rate of change in the information content of a system. The
10712 general equation describes the change in the moments of an observable quantity
10713 over a probability distribution. One term in the equation describes the change
10714 in the probability distribution. The other term describes the change in the
10715 observable values for a given state. We show the equivalence of this general
10716 equation for moment dynamics with the widely known Price equation from
10717 evolutionary theory, named after George Price. We introduce the Price equation
10718 from its biological roots, review a mathematically abstract form of the
10719 equation, and discuss the potential for this equation to unify diverse
10720 mathematical theories from different disciplines. The new work in Nature
10721 Physics and many applications in biology show that this equation also provides
10722 the basis for deriving many novel theoretical results within each discipline.
10723 </p>
10724 </description>
10725 <guid isPermaLink="false">oai:arXiv.org:2010.14544</guid>
10726 </item>
10727 <item>
10728 <title>Generalized eigen, singular value, and partial least squares decompositions: The GSVD package. (arXiv:2010.14734v2 [cs.MS] CROSS LISTED)</title>
10729 <link>http://fr.arxiv.org/abs/2010.14734</link>
10730 <description><p>Authors: <a href="http://fr.arxiv.org/find/cs/1/au:+Beaton_D/0/1/0/all/0/1">Derek Beaton</a> (1) ((1) Rotman Research Institute, Baycrest Health Sciences)</p>
10731
10732 <p>The generalized singular value decomposition (GSVD, a.k.a. "SVD triplet",
10733 "duality diagram" approach) provides a unified strategy and basis to perform
10734 nearly all of the most common multivariate analyses (e.g., principal
10735 components, correspondence analysis, multidimensional scaling, canonical
10736 correlation, partial least squares). Though the GSVD is ubiquitous, powerful,
10737 and flexible, it has very few implementations. Here I introduce the GSVD
10738 package for R. The general goal of GSVD is to provide a small set of accessible
10739 functions to perform the GSVD and two other related decompositions (generalized
10740 eigenvalue decomposition, generalized partial least squares-singular value
10741 decomposition). Furthermore, GSVD helps provide a more unified conceptual
10742 approach and nomenclature to many techniques. I first introduce the concept of
10743 the GSVD, followed by a formal definition of the generalized decompositions.
10744 Next I provide some key decisions made during development, and then a number of
10745 examples of how to use GSVD to implement various statistical techniques. These
10746 examples also illustrate one of the goals of GSVD: how others can (or should)
10747 build analysis packages that depend on GSVD. Finally, I discuss the possible
10748 future of GSVD.
10749 </p>
10750 </description>
10751 <guid isPermaLink="false">oai:arXiv.org:2010.14734</guid>
10752 </item>
10753 <item>
10754 <title>Continuous Chaotic Nonlinear System and Lyapunov controller Optimization using Deep Learning. (arXiv:2010.14746v1 [eess.SY] CROSS LISTED)</title>
10755 <link>http://fr.arxiv.org/abs/2010.14746</link>
10756 <description><p>Authors: <a href="http://fr.arxiv.org/find/eess/1/au:+Mahmoud_A/0/1/0/all/0/1">Amr Mahmoud</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Ismaeil_Y/0/1/0/all/0/1">Youmna Ismaeil</a>, <a href="http://fr.arxiv.org/find/eess/1/au:+Zohdy_M/0/1/0/all/0/1">Mohamed Zohdy</a></p>
10757
10758 <p>The introduction of unexpected system disturbances and new system dynamics
10759 does not allow initially selected static system and controller parameters to
10760 guarantee continued system stability and performance. In this research we
10761 present a novel approach for detecting early failure indicators of non-linear
10762 highly chaotic system and accordingly predict the best parameter calibrations
10763 to offset such instability using deep machine learning regression model. The
10764 approach proposed continuously monitors the system and controller signals. The
10765 Re-calibration of the system and controller parameters is triggered according
10766 to a set of conditions designed to maintain system stability without compromise
10767 to the system speed, intended outcome or required processing power. The deep
10768 neural model predicts the parameter values that would best counteract the
10769 expected system in-stability. To demonstrate the effectiveness of the proposed
10770 approach, it is applied to the non-linear complex combination of Duffing Van
10771 der pol oscillators. The approach is also tested under different scenarios the
10772 system and controller parameters are initially chosen incorrectly or the system
10773 parameters are changed while running or new system dynamics are introduced
10774 while running to measure effectiveness and reaction time.
10775 </p>
10776 </description>
10777 <guid isPermaLink="false">oai:arXiv.org:2010.14746</guid>
10778 </item>
10779 </channel>
10780