[HN Gopher] Neural Network Architecture Beyond Width and Depth
___________________________________________________________________
Neural Network Architecture Beyond Width and Depth
Author : StrauXX
Score : 37 points
Date : 2023-05-21 16:22 UTC (6 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| moosedev wrote:
| Paper was submitted almost exactly 1 year ago, and last revised
| in Jan 2023.
|
| Not sure if the title needs a (2022); just pointing out the
| above in case anyone else, like me, read "19 May" and mistakenly
| thought it was a 2-day-old paper :)
| godelski wrote:
| Probably not. The paper was accepted into NIPS 2022[0]. In case
| anyone is wondering, I did a diff (add "diff" after "arxiv" and
| before ".com") on V3 (16 Oct 2022) and V4 (latest: 14 Jan 2023)
| and the changes are just a few typos and a sign flip in the
| appendix (page 17: v3 has f - phi, now reversed)
|
| > just pointing out the above in case anyone else like me read
| "19 May" and mistakenly thought it was a 2 day old paper
|
| Is this common? Maybe because the oldest date is shown at the
| top? Reading the dates at the bottom works best (and the year
| obviously helps too).
|
| The "almost exactly 1 year" is probably because NIPS '23
| submissions just closed (supplementary material is still open,
| btw).
|
| [0] https://openreview.net/forum?id=36-xl1wdyu
| neodypsis wrote:
| > We propose the nested network architecture since it shares the
| parameters via repetitions of sub-network activation functions.
| In other words, a NestNet can provide a special parameter-sharing
| scheme. This is the key reason why the NestNet has much better
| approximation power than the standard network.
|
| It would be interesting to see an experiment that compares their
| CNN2 model with other parameter-sharing schemes such as networks
| using hyper-convolutions [0][1][2].
|
| [0] Ma, T., Wang, A. Q., Dalca, A. V., & Sabuncu, M. R. (2022).
| Hyper-Convolutions via Implicit Kernels for Medical Imaging.
| arXiv preprint arXiv:2202.02701.
|
| [1] Chang, O., Flokas, L., & Lipson, H. (2019, September).
| Principled Weight Initialization for Hypernetworks. In
| International Conference on Learning Representations.
|
| [2] Ukai, K., Matsubara, T., & Uehara, K. (2018, November).
| Hypernetwork-Based Implicit Posterior Estimation and Model
| Averaging of CNN. In Asian Conference on Machine Learning (pp.
| 176-191). PMLR.
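|
| For anyone unfamiliar with the idea: a hyper-convolution generates
| its kernel weights with a small MLP over the kernel coordinates
| instead of storing them directly, so the parameter count is
| decoupled from the kernel size. A minimal PyTorch sketch of that
| idea (my own toy code, not from any of the cited papers; all names
| and sizes are made up):
|
|     import torch
|     import torch.nn as nn
|     import torch.nn.functional as F
|
|     class HyperConv2d(nn.Module):
|         """Conv layer whose kernel is produced by an implicit MLP
|         over normalized kernel coordinates (rough sketch)."""
|         def __init__(self, in_ch, out_ch, k=3, hidden=16):
|             super().__init__()
|             self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
|             # MLP: (x, y) coordinate -> one weight per channel pair
|             self.mlp = nn.Sequential(
|                 nn.Linear(2, hidden), nn.ReLU(),
|                 nn.Linear(hidden, out_ch * in_ch))
|             # Kernel coordinates in [-1, 1], shape (k*k, 2)
|             ys, xs = torch.meshgrid(torch.linspace(-1, 1, k),
|                                     torch.linspace(-1, 1, k),
|                                     indexing="ij")
|             self.register_buffer(
|                 "coords", torch.stack([xs, ys], -1).reshape(-1, 2))
|
|         def forward(self, x):
|             # (k*k, out_ch*in_ch) -> (out_ch, in_ch, k, k)
|             w = self.mlp(self.coords)
|             w = w.view(self.k, self.k, self.out_ch, self.in_ch)
|             w = w.permute(2, 3, 0, 1)
|             return F.conv2d(x, w, padding=self.k // 2)
|
|     # Parameter count depends on the MLP, not on the kernel size:
|     layer = HyperConv2d(8, 16, k=7)
|     print(sum(p.numel() for p in layer.parameters()))
|
| Whether that kind of sharing composes with (or competes against)
| the nested activation-function sharing is the sort of question
| such a comparison could probe.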
| revskill wrote:
| So we could have an n-dimensional neural network to process
| training data. In theory it should work.
| fwlr wrote:
| They introduce "height" as another architectural dimension,
| alongside the usual width and depth. If you imagine the usual
| diagram of a neural network, the difference when a neural net is
| of height 2 is that in the middle layers, each individual node
| contains another network inside it, and that inner network has
| the same structure as the top-level network. For height 3, each
| node has an inner network, and each of those inner networks is
| composed of nodes that have their own inner networks as well. And
| so on, recursively, for greater heights. There's a diagram on
| page 3.
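|
| A toy PyTorch sketch of the height-2 case, as I read the figure
| (my own code, not the authors'; layer sizes are made up and the
| paper's construction differs in details):
|
|     import torch
|     import torch.nn as nn
|
|     class InnerNet(nn.Module):
|         """Small sub-network used as the 'activation' of a node."""
|         def __init__(self, hidden=4):
|             super().__init__()
|             self.net = nn.Sequential(
|                 nn.Linear(1, hidden), nn.ReLU(),
|                 nn.Linear(hidden, 1))
|         def forward(self, x):          # x: (..., 1)
|             return self.net(x)
|
|     class NestLayer(nn.Module):
|         """One hidden layer of a height-2 net: an affine map, then
|         a shared inner network applied to each node separately."""
|         def __init__(self, d_in, d_out, inner):
|             super().__init__()
|             self.lin = nn.Linear(d_in, d_out)
|             self.inner = inner          # shared -> parameter sharing
|         def forward(self, x):
|             z = self.lin(x)             # (batch, d_out)
|             return self.inner(z.unsqueeze(-1)).squeeze(-1)
|
|     inner = InnerNet()
|     net = nn.Sequential(NestLayer(3, 3, inner),
|                         NestLayer(3, 3, inner),
|                         nn.Linear(3, 1))
|     print(net(torch.randn(5, 3)).shape)   # torch.Size([5, 1])
|
| Height 3 would replace the plain Linear layers inside InnerNet
| with NestLayers of their own, and so on recursively.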
| loehnsberg wrote:
| Couldn't the fully connected 3x3x3 network in Figure 2c simply
| be reformulated as an equivalent 9x9 network that is not fully
| connected? Instead, connections between the layers of the sub-
| networks would only occur every 3 layers?
| fwlr wrote:
| Yes, it could. I believe the paper does mention this.
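|
| Concretely, the flattened version's hidden weight matrices would
| be block-diagonal (each 3-unit block is one sub-network), with
| dense cross-connections only at the layers where the outer network
| mixes its 3 nodes. A quick illustration of the sparsity pattern
| (mine, not from the paper):
|
|     import torch
|
|     # Inside a "flattened" layer, the 9 units only connect within
|     # their own 3-unit sub-network: a block-diagonal 9x9 mask.
|     mask = torch.block_diag(torch.ones(3, 3),
|                             torch.ones(3, 3),
|                             torch.ones(3, 3))
|     print(mask.int())
|     # The zeros are the missing cross-sub-network connections;
|     # they would only appear at the layers where the outer 3x3
|     # mixing happens.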
___________________________________________________________________
(page generated 2023-05-21 23:00 UTC)