[HN Gopher] Optimization Without Derivatives: Prima Fortran Vers...
       ___________________________________________________________________
        
       Optimization Without Derivatives: Prima Fortran Version and
       Inclusion in SciPy
        
       Author : zaikunzhang
       Score  : 145 points
       Date   : 2023-05-16 10:49 UTC (12 hours ago)
        
 (HTM) web link (fortran-lang.discourse.group)
 (TXT) w3m dump (fortran-lang.discourse.group)
        
       | zaikunzhang wrote:
       | Who was Powell?
       | 
       | Michael James David Powell FRS was "a British numerical analyst
       | who was among the pioneers of computational mathematics" [1]. He
        | was an inventor of, or an early contributor to, the quasi-Newton,
        | trust-region, augmented Lagrangian, and SQP methods [1,2].
       | Each of them is a pillar of modern numerical optimization. He
       | also made significant contributions to approximation theory and
       | methods [1,2,3].
       | 
       | Among numerous honors, Powell was one of the first two recipients
        | of the Dantzig Prize [4], awarded jointly by the Mathematical
        | Programming Society and the Society for Industrial and Applied
        | Mathematics (SIAM).
       | This is considered the highest award in optimization.
       | 
       | See also his Google Scholar profile [5] and the memorial page at
       | [6].
       | 
       | [1]
       | https://royalsocietypublishing.org/doi/full/10.1098/rsbm.201...
       | 
       | [2] https://en.wikipedia.org/wiki/Michael_J._D._Powell
       | 
       | [3]
       | https://www.cambridge.org/highereducation/books/approximatio...
       | 
       | [4] https://en.wikipedia.org/wiki/Dantzig_Prize
       | 
       | [5] https://scholar.google.com/citations?user=p3kDmy0AAAAJ&hl
       | 
       | [6] https://www.zhangzk.net/powell.html
        
         | onos wrote:
         | Cool, thanks so much for this info. I only knew the name from
         | scipy, where one of the optimization method options carries his
         | name. Good to learn more ... and wow what a contribution.
        
       | genphy1976 wrote:
       | [dead]
        
       | djoldman wrote:
       | Papers for the algos:
       | 
       | [50] M. J. D. Powell. A direct search optimization method that
       | models the objective and constraint functions by linear
       | interpolation. In S. Gomez and J. P. Hennart, editors, Advances
       | in Optimization and Numerical Analysis, pages 51-67, Dordrecht,
       | NL, 1994.
       | 
       | [53] M. J. D. Powell. UOBYQA: unconstrained optimization by
       | quadratic approximation. Math. Program., 92:555-582, 2002.
       | 
       | [56] M. J. D. Powell. The NEWUOA software for unconstrained
       | optimization without derivatives. In G. Di Pillo and M. Roma,
       | editors, Large-Scale Nonlinear Optimization, volume 83 of
       | Nonconvex Optimization and Its Applications, pages 255-297,
       | Boston, MA, USA, 2006.
       | 
       | [58] M. J. D. Powell. The BOBYQA algorithm for bound constrained
       | optimization without derivatives. Technical Report DAMTP
       | 2009/NA06, Department of Applied Mathematics and Theoretical
       | Physics, University of Cambridge, Cambridge, UK, 2009.
        
         | domcoltd wrote:
         | Good coffee is needed to understand Powell's papers on these
          | algorithms. If you don't have any, section 3 of the following
          | paper is a good place to start:
         | 
         | https://arxiv.org/pdf/2302.13246.pdf
        
       | enriquto wrote:
       | I'm gradually moving to (modern) Fortran, and it's just lovely to
       | see all this activity. Fortran really reaches a sweet spot if all
       | you want to do is write straightforward numerical algorithms that
       | run reasonably fast.
        
         | archermarks wrote:
          | I use Fortran in my day-to-day work, and most of my work has
          | been modernizing and refactoring an oldish code. It's amazing
         | how nice it can be to work in once you've removed and
         | reorganized the legacy cruft.
        
         | tpoacher wrote:
         | How have you found the experience so far?
         | 
         | I kinda tried this a few years back, but found the
         | documentation/lack of modern tutorials appalling ... but
         | there's been a nice explosion of activity over the last few
          | years (e.g. fpm, LFortran), which makes me think this may have
         | changed considerably.
         | 
         | Also, any tips on nice ways/resources that helped you get
         | (back) into it?
        
           | sampo wrote:
           | > found the documentation/lack of modern tutorials appalling
           | 
            | There aren't good modern Fortran textbooks/courses on the web.
           | 
           | But there are some textbooks that cover the language well.
           | Especially: Metcalf, Reid, Cohen: Modern Fortran Explained,
           | 2018.
        
         | m_mueller wrote:
          | _reasonably fast_ is a bit of an understatement IMO. I don't
         | know another language that is as machine friendly AND at the
         | same time allows for reasonably high productivity and can be
         | understood by (non-Computer-) scientists.
        
           | wiz21c wrote:
            | How fast? I gather, for example, that Intel's Fortran is a
            | frontend to LLVM. So the speed of that Fortran can't be much
            | different from anything compiled by LLVM, such as Rust or C.
            | Correct?
        
             | m_mueller wrote:
              | the main question is: how fast, after how much learning and
              | optimization went into the 'naive' version of the
              | application.
              | 
              | Just an example of how to declare a couple of input float
              | pointers in a fully optimized way, avoiding aliasing and
              | declaring them read-only (if I remember correctly; it's
              | been a few years):
              | 
              | Fortran:
              | 
              |     real(32), intent(in) :: foo, bar, baz
             | 
              | C:
              | 
              |     const float *const restrict foo
              |     const float *const restrict bar
              |     const float *const restrict baz
             | 
              | Of course, what happens then is that people will do a
              | typedef and hide it away... and then every application
              | invents its own conventions and becomes harder to
              | interoperate with on a common basis. In Fortran it's just
              | baked in.
             | 
              | Another example: multidimensional arrays. In C/C++ it
              | either doesn't exist, is not flat memory space (and thus
              | for grid applications very slow), or is an external library
              | (again restricting your interop with other scientific
              | applications). In Fortran:
              | 
              |     integer, parameter :: n = 10, m = 5
              |     real(32), dimension(n, m) :: foo, bar, baz
             | 
              | Again, reasonably simple to understand, and it's already
              | reasonably close to what you want - there are still ways to
              | make it better, like memory alignment, but I'd claim you're
              | already at 80% of optimal as long as you understand memory
              | layout and its impact on caching when accessing it (which
              | is usually very low-hanging fruit; you just have to know
              | which order to loop over it).
        
               | messe wrote:
               | > Another example: multidimensional arrays. In C/C++ it
               | [...] is not flat memory space (and thus for grid
               | applications very slow)
               | 
                | Can you clarify what you mean by this? A naively defined
                | multidimensional array,
                | 
                |     float foo[32][32] = ...
                | 
                | is stored in sizeof(float) * 32 * 32 = 4096 consecutive
               | bytes of memory. There's no in-built support for array
               | copies or broadcasting, I'll give you that, but I'm still
               | trying to understand what you mean by "not flat memory
               | space".
        
               | simiones wrote:
                | C++ doesn't support something like:
                | 
                |     int a, b;
                |     int arr[a][b];  // ERROR: all array dimensions
                |                     // except for the first must be constant
                |     // or even:
                |     auto arr = new int[a][b];  // ERROR: array size in
                |                                // new-expression must be constant
                | 
                | All you can do if you want a dynamic multidimensional
                | array is:
                | 
                |     int a, b;
                |     auto arr = new int*[a];
                |     for (int i = 0; i < a; i++) {
                |         arr[i] = new int[b];
                |     }
               | 
               | But now it's a jagged array, it's no longer flat in
               | memory.
        
               | int_19h wrote:
               | I think they meant dynamically sized multidimensional
               | arrays (and arrays of pointers often used to emulate
               | those).
               | 
               | However, even then it's not quite true given C99 VLA.
        
               | simiones wrote:
               | I should note that VLAs are no longer a guaranteed
               | feature of even a standards-compliant C compiler (they
                | were made an optional feature in C11).
        
               | int_19h wrote:
               | Indeed, but they're still in the Standard, so if an
               | implementation does them, it does them in a portable way
               | - and gcc and Clang both support them, so there's no
               | practical issue with access to the feature.
        
               | m_mueller wrote:
                | OK, I was wrong about it not being flat, but in my
                | experience, due to the lack of language support, it is
                | usually not used to represent grids in a simulation, as
                | dealing with it tends to involve loops, which may or may
                | not be optimised away. In Fortran multidim arrays are so
                | well supported that they are used everywhere. Even
                | debuggers allow you to display arbitrary slices of
                | multidim arrays in local scope.
        
             | cscheid wrote:
             | The language semantics matter greatly.
             | 
              | GHC with `-fllvm` is not going to make Haskell any easier
              | to compile just because it's targeting LLVM. Fortran is
             | (relatively!) easy to make fast because the language
             | semantics allow it to. Lack of pointer aliasing is one of
             | the canonical examples; C's pointer aliasing makes systems
             | programming easier, but high-performance code harder.
        
               | pklausler wrote:
               | Pointers in Fortran can alias with each other and with
               | valid pointer targets. What makes Fortran potentially
               | easier to optimize are its rules against dummy argument
               | aliasing in most (but not all) cases.
               | 
               | (See https://github.com/llvm/llvm-
               | project/blob/main/flang/docs/Al... for the full story.)
        
               | cscheid wrote:
               | > dummy argument aliasing
               | 
               | Thanks for the precise Fortran terminology; that's what I
               | meant to say in my head but you're correct. From the
               | linked website for everyone else:
               | 
               | "Fortran famously passes actual arguments by reference,
               | and forbids callers from associating multiple arguments
               | on a call to conflicting storage when doing so would
               | cause the called subprogram to write to a bit of that
               | storage by means of one dummy argument and read or write
               | that same bit by means of another."
        
             | GeompMankle wrote:
              | Negative. Fortran is in a better position to guarantee
              | certain facts about sequential access in arrays and solid
              | information about pointer aliasing that is generally not
              | available in C or C++ unless the author of the C/C++ is
              | extremely aware of compiler quirks and pragmas. Fortran has
              | a "just works" attitude to high-speed array processing,
              | whereas other languages are focused on edge cases that are
              | good for a PhD thesis on general computing optimization but
              | rarely work in the easiest case without extensive pragmas.
              | 
              | See also CUDA. Sure, you can write a C-to-CUDA converter
              | auto-vectorizer, but it's likely to have all sorts of bugs
              | and usually never work right except in rare hand-tuned
              | cases. May as well just write CUDA from scratch if it is to
              | be performant. Same for array processing: wanna array
              | process? Use a compiler built for array processing, like
              | Fortran or ISPC.
        
           | enriquto wrote:
           | > reasonably fast is a bit of an understatement IMO.
           | 
           | I mean that a naive loop that traverses your data does not
           | take several minutes like in Python. But there's a guy in my
           | lab who takes my stupid loops, avxs the fuck out of them, and
            | they become 8 times faster. Now, _that's_ what you would
           | call unreasonably fast (short of rewriting the whole thing
           | for the GPU).
        
       | ZeroCool2u wrote:
       | Wow, just a few weeks ago I wanted an optimization function in
       | Python and was already using SciPy, so I reached for
       | scipy.optimize and ran into quite a few of the bugs described
       | first hand, which had me kind of rattled. I just assumed whatever
       | optimizer SciPy uses would be reliable even if it's not
       | completely optimal or cutting edge.
       | 
       | I love the dedication these folks have to their old colleague
       | too. They took a great deal of care in relicensing, so that SciPy
       | could use their derived code even though the original author had
       | passed. Nothing but admiration for them!
       | 
       | Anyways, I was lucky enough that I could restrict my search space
       | on some assumptions and just brute force the problem in very
       | reasonable time, but this is quite timely for me and great to
       | see!
        
         | whimsicalism wrote:
          | From my experience with this (in 2018), SciPy's non-gradient
          | optimizers are not very good.
        
           | domcoltd wrote:
            | Indeed, the author mentions many bugs of the SciPy gradient-
            | free solvers in the readme of the PRIMA repo.
        
         | NwtnsMthd wrote:
         | The optimization libraries in SciPy are surprisingly bad. I've
         | found Mystic to be a great alternative.
         | 
         | https://github.com/uqfoundation/mystic
        
       | domcoltd wrote:
       | GitHub repo of the project: https://github.com/libprima/prima
        
       | whinvik wrote:
        | Going to the GitHub issue page and reading the exchange is
       | fascinating. Most of the discussion is about licensing and
       | through the thread you see it getting resolved!
       | 
       | Also the author's dedication to Prof. Powell seems amazing. I
       | know nothing about Prof. Powell but I am sure he would have much
       | appreciated it.
        
       | quickthrower2 wrote:
        | Why might it be good to not use derivatives? Is it faster,
       | easier? Are there some functions whose derivatives are not known
       | or are too complicated?
        
         | ok123456 wrote:
          | If you are using derivative-based optimization, smooth
          | derivatives of your function need to exist.
          | 
          | Trying to use a derivative-based optimizer where a smooth
          | derivative doesn't exist will likely result in getting stuck
          | at a cliff where there's no change to either the function
          | value or to the gradient.
        
         | 6gvONxR4sf7o wrote:
         | I recently wrote some straightforward typescript code, and I
         | want to find the inputs that make one of the outputs exactly
         | zero. My options are to a) rewrite it in an autograd tool b)
         | manually calculate the derivatives and write a new function for
         | them, or c) just use a root-finder that doesn't need
         | derivatives. I have the cpu cycles to spare, so it will save a
         | huge amount of time to just do c.
         | 
         | And as a tangential gripe, I wish js had operator overloading.
         | In python, I could just call my existing function with a
         | symbolic argument and get a symbolic output to differentiate,
         | because my `x+y*z` code would work on both kinds of arguments.
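          | 
          | A hedged sketch of option (c), in Python/SciPy rather than
          | TypeScript, with a made-up stand-in for the real code: a
          | bracketing root-finder only ever asks for function values,
          | never derivatives.
          | 
          |     from scipy.optimize import brentq
          | 
          |     def output(x):
          |         # stand-in for code whose derivative was never written
          |         return x**3 - 2*x - 5
          | 
          |     # only needs a sign-changing bracket [0, 3], never f'(x)
          |     root = brentq(output, 0.0, 3.0)
          |     print(root)  # ~2.0946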
        
         | Nevermark wrote:
         | I am not familiar with these algorithms, but:
         | 
          | Many common functions have derivative-unfriendly
          | discontinuities: logic functions, rounding, conditional
          | clauses, sawtooth functions, etc.
          | 
          | Other derivative-unfriendly functions have flat, constant,
          | zero-derivative areas, like logic functions and rounding
          | (again), etc.
         | 
         | Some functions have derivatives but have so many minima (or
         | maxima) that non-optimal or non-acceptable minima are likely to
         | trap learning. Sine and cosine waves, for instance.
         | 
         | Some functions even have repeating discontinuities going
         | through +/-infinity. Tangent and secant.
         | 
         | Or, in some cases a model may exist of the system (or part of
         | the system) to be optimized, which can be used to calculate the
         | system response, but it contains relationships that are not
         | easily accessible, so accurate derivatives are difficult or
          | time-costly to estimate, since there is no symbolic expression
          | to allow for efficient calculation of the derivatives.
        
         | kxyvr wrote:
         | I work professionally in computational optimization, so here
         | are some of my observations coming from the world of mechanics.
         | 
         | In my experience, it's because the client is trying to reuse an
         | existing simulation code for something that requires
         | optimization. This could be something like imaging (parameter
         | estimation), calibration, or control. For example, say someone
         | has something like a PDE solver that can simulate sound waves
          | going through a medium. If you can match the output of this
         | simulation to some collected data, the parameters for the
         | simulation will tell you what material is where.
         | 
         | Anyway, the issue is that these existing codes are rarely
         | instrumented or derived in a way to make it easy to get
          | derivatives. Management doesn't want to pay for a rewrite, so they
         | demand that existing equity (the simulator) be used. Then, a
         | derivative free optimizer (DFO) gets slapped on as a first
         | attempt. To be clear, this doesn't work particularly well, but
         | it tends to be the place where I most often see these
         | algorithms used.
         | 
         | As to why it doesn't work very well, derivative based
         | optimizers can scale to the order of hundreds of millions of
         | variables without too many issues (modulo the kind of problem,
         | but it works very well for a lot of problems of value). DFOs
         | tend to have issues past a few dozen variables.
         | 
         | Now, for anyone not wanting to end up in this conundrum, when
          | you write your original simulation code, parameterize on the
         | floating point type. Basically, don't write the code with
         | double, use some kind of generic, template parameter, or
         | whatever on something we'll call MyFloat. The purpose here is
         | to allow the code to be more easily compatible with automatic
         | differentiation (AD) tools, which will give a real and machine
         | precision exact derivative. Don't do finite differences. Don't
         | think reinstrumenting later is easy. It's not. It's a pain.
         | It's expensive. And, to be clear, AD can be done poorly and it
         | can be slow. However, I've never seen a code instrumented with
         | AD and a good gradient based optimizer do more poorly than a
         | derivative free optimization code. They've always done
         | dramatically better. Beyond just giving the derivative, I'll
         | also mention that even if the client wants to recode the
         | simulator to give an analytic derivative for speed, it is
         | _vastly_ easier to do so if the results can be checked with a
          | simulator instrumented with AD. There are also other tricks of
         | value that can be used in a properly instrumented code such as
         | interval arithmetic for some kinds of sensitivity analysis.
         | 
         | Anyway, there are other reasons to use derivative free
         | optimization. Some functions just flat out don't have
          | derivatives. However, in the world of mechanics, most do and
          | DFOs get abused due to poor prior planning. In either case,
         | please, for the love of math, just parameterize on your
         | floating point type. It would make my job much easier.
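          | 
          | A hypothetical Python sketch of the "parameterize on the
          | floating point type" advice (the simulate function here is
          | invented for illustration): route every operation through a
          | generic array module so an AD tool such as JAX can push its
          | own number type through the model and return an exact
          | gradient.
          | 
          |     import jax
          |     import jax.numpy as jnp
          | 
          |     def simulate(params, xp=jnp):
          |         # stand-in forward model; every op goes through `xp`,
          |         # so plain NumPy and JAX tracer types both work
          |         a, b = params
          |         grid = xp.linspace(0.0, 1.0, 64)
          |         return xp.sum((a * xp.sin(b * grid)) ** 2)
          | 
          |     # machine-precision gradient, no finite differences
          |     grad_sim = jax.grad(simulate)
          |     print(grad_sim(jnp.array([1.5, 3.0])))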
        
           | domcoltd wrote:
           | Agreed. If (approximate) 1st-order information can be
            | obtained in any way (even by *carefully* deployed finite
            | differences), then gradient-based methods should be used. It
           | is wrong to use DFO in such cases.
           | 
            | > DFOs tend to have issues past a few dozen variables.
           | 
           | It seems that the PRIMA git repo
           | https://github.com/libprima/prima shows results on problems
            | with hundreds of variables. Not sure how to interpret the
           | results.
        
             | domcoltd wrote:
              | > DFOs tend to have issues past a few dozen variables.
             | 
             | It is interesting to check the results in section 5.1 of
             | the following paper:
             | 
             | https://arxiv.org/pdf/2302.13246.pdf
             | 
             | where these algorithms are compared with finite-difference
             | CG and BFGS included in SciPy.
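              | 
              | For context, a minimal illustration (not from the paper) of
              | how that finite-difference baseline is invoked in SciPy:
              | when no jac is supplied, BFGS approximates the gradient by
              | finite differences internally.
              | 
              |     import numpy as np
              |     from scipy.optimize import minimize, rosen
              | 
              |     # jac omitted, so the gradient is finite-differenced
              |     res = minimize(rosen, np.zeros(10), method="BFGS")
              |     print(res.fun, res.nfev)  # nfev counts FD calls too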
        
         | bloaf wrote:
         | Imagine if you were trying to optimize a real physical system
         | you didn't have a model of. These methods will still work; you
         | could implement them "by hand" if necessary.
        
         | chpatrick wrote:
         | Quite often these optimizers "just work", and all you have to
         | do is pass a numeric function you want to optimize.
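          | 
          | For instance, a minimal sketch using SciPy's existing COBYLA
          | binding (one of the solvers PRIMA reimplements); the objective
          | here is just a placeholder:
          | 
          |     import numpy as np
          |     from scipy.optimize import minimize
          | 
          |     # any black-box objective; no gradient is supplied
          |     def f(x):
          |         return (x[0] - 1.0) ** 2 + (x[1] + 2.5) ** 2
          | 
          |     res = minimize(f, np.zeros(2), method="COBYLA")
          |     print(res.x)  # ~[1.0, -2.5]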
        
         | adornedCupcake wrote:
         | Here people discuss that for a chaotic system automatic
         | differentiation will diverge.
         | https://discourse.julialang.org/t/when-forwarddiff-ing-throu...
         | 
          | I'm trying to come up with a simple example. I think that for
          | a bunch of flying hard spheres in a box, any measurement that
          | requires sufficiently time-spaced samples of their positions
          | will have trouble being differentiated.
        
         | efavdb wrote:
         | Yes sometimes it's hard to measure a derivative. Eg when doing
         | hyperparameter tuning in ML, you can read out a metric at a
         | given choice of parameters, but it's generally not easy to get
         | a gradient.
         | 
         | Shameless plug: I happen to have recently written a package for
         | the opposite limit! It finds roots when you can only measure
         | the derivative.
         | 
         | https://github.com/EFavDB/inchwormrf
        
           | enriquto wrote:
           | > It finds roots when you can only measure the derivative.
           | 
           | How is that even possible? The function f(x)=x-7 has a single
           | root at x=7. Its derivative is f'(x)=1. How do you recover 7
            | from 1?
        
             | CorrectHorseBat wrote:
             | You need one initial value to start from
        
               | [deleted]
        
             | [deleted]
        
           | imurray wrote:
           | You might be interested in David MacKay's old conjugate
           | gradient implementation in C that also only uses derivatives:
           | https://www.inference.org.uk/mackay/c/macopt.html
           | 
           | (It was written in another time, don't expect it to be nice.
           | My 20-year-old Octave/Matlab wrappers on that page have
           | almost certainly bit-rotted, don't expect them to work.)
        
             | efavdb wrote:
             | Thank you! I'll definitely check this out.
        
           | quickthrower2 wrote:
           | I assume backprop is still most efficient for the neural net
           | itself, rather than one of these algorithms? I am doing a
           | course and I am aware that there are other algorithms than
           | gradient descent but haven't seen the details yet on that.
           | 
           | I know things like cross entropy loss are designed to be easy
           | to differentiate and probably more efficient to do that than
           | something that approximates it.
        
             | efavdb wrote:
             | Yeah, backprop gives efficient gradient calcs to train at a
             | given net architecture. To find the best architecture
             | though people try various other methods, eg random search
             | or Gaussian processes, which don't evaluate derivatives wrt
             | architecture Params.
        
         | wenc wrote:
         | I work in this field. Derivatives (and in fact second
         | derivatives or a Hessian matrix), when they are reliable and
          | when you have them, make your search faster.
         | 
         | Unfortunately for many problems, they are either unavailable or
         | expensive to compute. (Imagine a case when your "function" is a
         | full CFD simulation or a 2 hr job -- every time an optimizer
         | iterates it has to run an entire simulation. You can't see the
         | function and you can't really calculate the derivatives --
         | except crudely through finite differences).
         | 
         | Derivative free optimizers choose to forgo derivative
         | information and instead try to efficiently search the space to
         | optimize the objective. Their convergence is often much slower
         | than derivative based optimizers but they are also much simpler
          | to implement and tend to not get stuck in valleys. All they
          | really are is a way to cleverly evaluate f(X) to find the
         | optimum, without needing f'(X) or f''(X).
         | 
         | Caveat: derivative free optimizers are terrible for constrained
         | optimization however (especially if your constraints are
         | anything but simple bounds). You can use barrier methods but
         | still they're not great.
         | 
         | Also there's an element of luck involved. It's hard to
          | benchmark derivative-free methods because the methods
          | themselves don't matter too much (the Nelder-Mead simplex
          | method outperforms Powell's method in some cases but not
          | others; Powell's method is well known but isn't categorically
          | better than any other algorithm): when you don't use
          | derivatives, other higher-order effects like the initial point
          | and the shape of the terrain dominate over the choice of
          | algorithm. I
         | wouldn't be too hung up on picking the right algorithm --
         | instead I would focus my efforts on heuristics to get to a good
         | initial point cheaply.
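          | 
          | A small sketch of that last point (not a benchmark, just how
          | one might try both classic methods from the same start in
          | SciPy); which one "wins" usually depends more on the problem
          | and the starting point than on the algorithm:
          | 
          |     import numpy as np
          |     from scipy.optimize import minimize, rosen
          | 
          |     x0 = np.array([-1.2, 1.0, -0.5, 0.8])
          |     for method in ("Nelder-Mead", "Powell"):
          |         res = minimize(rosen, x0, method=method)
          |         # compare solution quality vs. evaluation count
          |         print(method, res.fun, res.nfev)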
        
           | ggraphilia wrote:
           | `they are also much simpler to implement ...'
           | 
            | According to the papers mentioned by others, the algos under
            | discussion do not seem to support this point. The author of
            | the project mentions that the Fortran 77 implementation is
            | `nontrivial to understand', which motivates this work.
           | 
           | `Caveat: derivative free optimizers are terrible for
           | constrained optimization however (especially if your
           | constraints are anything but simple bounds). You can use
           | barrier methods but still they're not great.'
           | 
            | LINCOA and COBYLA in the package can handle problems with
            | nontrivial constraints according to the documentation of the
            | git repository, though I do not know how well they work.
        
             | [deleted]
        
         | kergonath wrote:
         | All of the above. Some functions might not have derivatives in
          | some places (where the function is discontinuous), or the
         | derivatives can be very difficult or impossible to calculate
         | even if they exist.
        
         | sampo wrote:
          | > Why might it be good to not use derivatives?
         | 
         | Sometimes you really don't know how to calculate the
         | derivatives.
         | 
         | But perhaps more often, even if in principle you could do the
         | additional work, formulate how to calculate derivatives, and
         | add the required extra code, you just don't want to invest the
         | additional resources, when you can just call a derivative-free
         | optimization library. The optimization will run slower, but
          | perhaps you have some extra seconds to spare.
        
       | certik wrote:
       | This is a very important effort: if SciPy accepts this modern
       | Fortran codebase, then I think the tide for Fortran will change.
       | 
       | My own focus is to compile all of SciPy with LFortran, we can
       | already fully compile the Minpack package, here is our recent
       | status update: https://lfortran.org/blog/2023/05/lfortran-
       | breakthrough-now-...
       | 
       | And we have funding to compile all of SciPy. If anyone here is
       | interested to help out, please get in touch with me. Once
       | LFortran can compile SciPy, it will greatly help with
       | maintenance, such as on Windows, WASM, Python wrappers, even
       | modernization later on.
        
         | Loic wrote:
         | I contacted you on your email address from your personal
         | website.
        
         | zaikunzhang wrote:
         | Thank you @certik for your encouragement!
         | 
         | I cherish the following comment from you:
         | 
         | > if SciPy accepts this modern Fortran codebase, then I think
         | the tide for Fortran will change.
         | 
         | I believe that you and the LFortran project are truly turning
         | the tide on Fortran. I wish to see the production release of
         | LFortran as soon as possible.
         | 
         | If PRIMA can also contribute to "turning the tide on Fortran",
         | I will feel honored. All the pain, depression, and frustration
         | that I have experienced in the past three years will then be
         | partially compensated.
         | 
          | Let us, the community, work together to get PRIMA included in
         | SciPy and see whether the tide on Fortran will improve.
         | 
         | Many thanks.
        
         | ggraphilia wrote:
          | LFortran sounds like a genius project. The interactivity is a
          | critical building block that is missing from the Fortran
          | ecosystem. Hope things will get better with LFortran!
        
         | kergonath wrote:
         | I can't really contribute, but it's great to see developments
         | around LFortran!
        
       | RockyMcNuts wrote:
       | I think there are a few derivative-free global optimizers,
       | curious what sets this one apart? see e.g.
       | https://www.microprediction.com/blog/optimize . If the current
       | scipy-native one is unmaintained and this one is superior it
       | certainly makes sense to update.
        
         | icegreentea2 wrote:
          | The proposal isn't a "new" algorithm, but rather a rewrite
          | (plus bug fixes) of the existing SciPy implementation of
          | "COBYLA" (which I don't think was tested in your linked
          | comparison).
        
         | ckmiller wrote:
         | These algorithms work best on smooth functions with mid-to-low
         | 100's of variables. They are very sample efficient, often
         | obtaining superlinear convergence using just a linear number of
         | evaluations of the objective function.
        
       | ckmiller wrote:
       | I'm very glad to see these algorithms get the maintenance they
       | deserve. I pulled them apart and used lots of their pieces and
       | theory for my PhD; MJD Powell's original code was subtle and
        | difficult to understand. Hats off to those doing the hard work
       | of keeping them alive. These algorithms are the absolute best in
       | their domain, and should be more widely known.
        
       | domcoltd wrote:
        | The extensive tests run via GitHub Actions are simply amazing.
        
       | V1ndaar wrote:
       | I'm surprised not to see a single mention of NLopt [0] in the
       | discussion. I was under the impression most algorithms
        | implemented in NLopt are in C and don't use any Fortran 77. But
       | I'm sure the author knows something I don't (I've never really
       | looked at NLopt's sources, but I've been using it for years).
       | 
       | Anyhow, the more alternatives the merrier I suppose!
       | 
       | [0]: https://nlopt.readthedocs.io/en/latest/
        
         | zaikunzhang wrote:
          | Excerpt from https://github.com/libprima/prima :
         | 
         | "There do exist "translations" of Powell's Fortran 77 code into
         | other languages. For example, NLopt contains a C version of
         | COBYLA, NEWUOA, and BOBYQA, but the C code in NLopt is
         | translated from the Fortran 77 code straightforwardly, if not
         | automatically by f2c, and hence inherits the style, structure,
         | and probably bugs of the original Fortran 77 implementation."
         | 
          | See also https://github.com/stevengj/nlopt/issues/501 , where
          | the author of NLopt comments.
        
           | V1ndaar wrote:
           | Ah, thanks!
        
       ___________________________________________________________________
       (page generated 2023-05-16 23:01 UTC)