[HN Gopher] Neural Network Architecture Beyond Width and Depth
       ___________________________________________________________________
        
       Neural Network Architecture Beyond Width and Depth
        
       Author : StrauXX
       Score  : 37 points
       Date   : 2023-05-21 16:22 UTC (6 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | moosedev wrote:
       | Paper was submitted almost exactly 1 year ago, and last revised
       | in Jan 2023.
       | 
       | Not sure if title needs a (2022), just pointing out the above in
       | case anyone else like me read "19 May" and mistakenly thought it
       | was a 2 day old paper :)
        
         | godelski wrote:
          | Probably not. The paper was accepted into NIPS 2022 [0]. In
          | case anyone is wondering, I did a diff (add "diff" after
          | "arxiv" and before ".com") on V3 (16 Oct 2022) and V4
          | (latest: 14 Jan 2023), and the changes are just a few typos
          | and a sign flip in the appendix (page 17: v3 has f - phi, now
          | reversed).
         | 
         | > just pointing out the above in case anyone else like me read
         | "19 May" and mistakenly thought it was a 2 day old paper
         | 
          | Is this common? Maybe because the oldest version is listed on
          | top? Reading the dates at the bottom works best (the year
          | obviously helps too).
         | 
          | The "almost exactly 1 year" is probably because NIPS '23
          | submissions just closed (supplementary materials are still
          | open, btw).
         | 
         | [0] https://openreview.net/forum?id=36-xl1wdyu
        
       | neodypsis wrote:
       | > We propose the nested network architecture since it shares the
       | parameters via repetitions of sub-network activation functions.
       | In other words, a NestNet can provide a special parameter-sharing
       | scheme. This is the key reason why the NestNet has much better
       | approximation power than the standard network.
       | 
       | It would be interesting to see an experiment that compares their
       | CNN2 model with other parameter-sharing schemes such as networks
       | using hyper-convolutions [0][1][2].
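        | 
        | If I understand the hyper-convolution papers correctly, the
        | kernel weights there are not free parameters but are generated
        | by a small shared MLP from the kernel's spatial coordinates,
        | which is itself a parameter-sharing scheme. A rough PyTorch
        | sketch of that idea (my own simplification with made-up names
        | and sizes, not the cited authors' code):
        | 
        |   import torch
        |   import torch.nn as nn
        |   import torch.nn.functional as F
        | 
        |   class HyperConv2d(nn.Module):
        |       # conv layer whose k x k kernel is produced by a tiny
        |       # MLP taking the (x, y) offset inside the kernel as input
        |       def __init__(self, cin, cout, k=3, hidden=16):
        |           super().__init__()
        |           self.cin, self.cout, self.k = cin, cout, k
        |           self.mlp = nn.Sequential(
        |               nn.Linear(2, hidden), nn.ReLU(),
        |               nn.Linear(hidden, cin * cout))
        |           ys, xs = torch.meshgrid(
        |               torch.linspace(-1, 1, k),
        |               torch.linspace(-1, 1, k), indexing="ij")
        |           coords = torch.stack([xs, ys], -1).reshape(-1, 2)
        |           self.register_buffer("coords", coords)
        | 
        |       def forward(self, x):
        |           w = self.mlp(self.coords)       # (k*k, cin*cout)
        |           w = w.reshape(self.k, self.k, self.cin, self.cout)
        |           w = w.permute(3, 2, 0, 1)       # (cout, cin, k, k)
        |           return F.conv2d(x, w, padding=self.k // 2)
        | 
        |   y = HyperConv2d(3, 8)(torch.randn(1, 3, 32, 32))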
       | 
       | [0] Ma, T., Wang, A. Q., Dalca, A. V., & Sabuncu, M. R. (2022).
       | Hyper-Convolutions via Implicit Kernels for Medical Imaging.
       | arXiv preprint arXiv:2202.02701.
       | 
       | [1] Chang, O., Flokas, L., & Lipson, H. (2019, September).
       | Principled weight initialization for hypernetworks. In
       | International Conference on Learning Representations.
       | 
       | [2] Ukai, K., Matsubara, T., & Uehara, K. (2018, November).
       | Hypernetwork-based implicit posterior estimation and model
        | averaging of CNN. In Asian Conference on Machine Learning (pp.
       | 176-191). PMLR.
        
       | revskill wrote:
        | So we could have an n-dimensional neural network to process
        | training data. In theory it should work.
        
       | fwlr wrote:
       | They introduce "height" as another architectural dimension,
       | alongside the usual width and depth. If you imagine the usual
       | diagram of a neural network, the difference when a neural net is
       | of height 2 is that in the middle layers, each individual node
       | contains another network inside it, and that inner network has
       | the same structure as the top-level network. For height 3, each
       | node has an inner network, and each of those inner networks is
       | composed of nodes that have their own inner networks as well. And
       | so on, recursively, for greater heights. There's a diagram on
       | page 3.
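        | 
        | If it helps to see the recursion in code, here is a toy sketch
        | of that nesting (my own reading of the construction, not the
        | authors' code; the widths, depths and names are made up). The
        | "activation" of every hidden unit in a height-h network is a
        | shared smaller network of height h - 1, and reusing that inner
        | network everywhere is the repetition / parameter sharing the
        | paper refers to:
        | 
        |   import torch
        |   import torch.nn as nn
        | 
        |   class NestNet(nn.Module):
        |       # height 1: a plain ReLU net; height >= 2: each hidden
        |       # unit's activation is a shared, smaller NestNet
        |       def __init__(self, height, width=3, depth=3):
        |           super().__init__()
        |           dims = [1] + [width] * depth
        |           self.layers = nn.ModuleList(
        |               [nn.Linear(a, b) for a, b in zip(dims, dims[1:])])
        |           self.readout = nn.Linear(width, 1)
        |           self.inner = (NestNet(height - 1, width, depth)
        |                         if height > 1 else None)
        | 
        |       def act(self, x):
        |           if self.inner is None:
        |               return torch.relu(x)
        |           # run the inner net on every scalar pre-activation
        |           return self.inner(x.reshape(-1, 1)).reshape(x.shape)
        | 
        |       def forward(self, x):
        |           for layer in self.layers:
        |               x = self.act(layer(x))
        |           return self.readout(x)
        | 
        |   net = NestNet(height=2)       # a network inside each node
        |   y = net(torch.randn(8, 1))    # 8 scalar inputs -> 8 outputs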
        
         | loehnsberg wrote:
         | Couldn't the fully connected 3x3x3 network in Figure 2c simply
         | be reformulated as an equivalent 9x9 network that is not fully
          | connected? Instead, connections between the layers of the
          | sub-networks would only occur every 3 layers?
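          | 
          | Roughly what I have in mind (just my own illustration of the
          | connectivity, nothing from the paper): unrolled, each hidden
          | layer is 9 units wide, the weight matrices inside the sub-
          | networks are block-diagonal, and mixing across sub-networks
          | happens only at the outer layers, i.e. every 3 layers:
          | 
          |   import numpy as np
          | 
          |   width, blocks = 3, 3      # 3-unit sub-nets, 3 of them
          |   n = width * blocks        # 9 units per unrolled layer
          | 
          |   # between the inner layers: block-diagonal connectivity,
          |   # i.e. each sub-network only talks to itself
          |   inner = np.zeros((n, n), dtype=int)
          |   for b in range(blocks):
          |       s = slice(b * width, (b + 1) * width)
          |       inner[s, s] = 1
          | 
          |   # at the outer layers (every 3rd layer): mixing across
          |   # the sub-networks
          |   outer = np.ones((n, n), dtype=int)
          | 
          |   print(inner)
          |   print(outer)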
        
           | fwlr wrote:
           | Yes, it could. I believe the paper does mention this.
        
       ___________________________________________________________________
       (page generated 2023-05-21 23:00 UTC)