[HN Gopher] Optuna - A Hyperparameter Optimization Framework
       ___________________________________________________________________
        
       Optuna - A Hyperparameter Optimization Framework
        
       Author : tosh
       Score  : 62 points
       Date   : 2024-04-06 08:26 UTC (14 hours ago)
        
 (HTM) web link (optuna.org)
 (TXT) w3m dump (optuna.org)
        
       | jgalt212 wrote:
        | I'd be curious to check this out. Our shop has a strong bias for
        | models with the fewest hyperparameters.
       | 
       | I will pass along to our ML people.
        
       | gillesjacobs wrote:
       | I have extensively tested Optuna and Weights and Biases (WandB)
       | for hyperparameter tuning on multiple task-specific transformer
       | models back in 2020.
       | 
        | Optuna lost out by a long mile back then in feature parity and
        | dashboarding. Optuna did not have hyperband optimization, which
        | was and still is one of the best search algos for hyperopt. It
        | looks like it is possible to implement hyperband yourself now,
        | but with the loosely coupled architecture between Sampler and
        | Pruner it's a bit baroque [1].
       | 
       | Anyway back then it was clear WandB was the far superior choice
       | for features, ease of use, experiment tracking and dashboarding.
       | We went with WandB for our lab.
       | 
        | Could be Optuna has caught up, but WandB has seen significant
        | development too. Looking at Optuna's dashboard docs, they look
        | meagre compared to what you can do with WandB.
       | 
       | 1. https://tech.preferred.jp/en/blog/how-we-implement-
       | hyperband...
        
         | zwaps wrote:
         | sadly, w&B means you have to upload it to the cloud, which is
         | not possible in every case :(
        
           | michaelmior wrote:
           | What is the "it" you're referring to? You don't need to
           | upload your model or the weights. You do need to upload the
           | hyperparameters you're optimizing and your target values, but
           | those seem unlikely to be sensitive. (Although I'm sure there
           | are still some legitimate reasons why someone might not want
           | to do so.)
        
           | gillesjacobs wrote:
           | You need to upload your time series (loss, performance
           | metrics) to the cloud. Not your weights or models.
           | 
           | Reliance on cloud services is a legitimate worry though for
           | privacy, IP, process control, reliability, etc.
           | 
            | The comparison between Optuna and WandB was not apples to
            | apples. Optuna is completely self-hosted and local. It also
            | focuses narrowly on hyperparameter optimization with a
            | flexible design, unlike WandB, which now aims to capture a
            | large part of the cloud-based MLOps workflow.
            | 
            | It would be fairer to compare Optuna to Hyperopt, and I
            | think Optuna was the better choice there, but I only did
            | simple PoCing and have no strong opinions.
        
         | tkellogg wrote:
         | looks like they have it now
         | 
         | https://optuna.readthedocs.io/en/stable/reference/generated/...
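
          [Editor's note: the linked docs cover Optuna's HyperbandPruner,
          which exposes Hyperband as a pruner rather than a standalone
          search algorithm. A minimal sketch of how it plugs into a study
          is below; the quadratic objective is a toy stand-in for a real
          training loop, not part of Optuna itself.]

          ```python
          import optuna

          # Toy objective: minimize (x - 2)^2. Intermediate values are
          # reported each "step" so the Hyperband pruner can stop
          # unpromising trials early, as it would for training epochs.
          def objective(trial):
              x = trial.suggest_float("x", -10.0, 10.0)
              value = (x - 2.0) ** 2
              for step in range(10):
                  trial.report(value, step)
                  if trial.should_prune():
                      raise optuna.TrialPruned()
              return value

          study = optuna.create_study(
              direction="minimize",
              pruner=optuna.pruners.HyperbandPruner(
                  min_resource=1, max_resource=10
              ),
          )
          study.optimize(objective, n_trials=30)
          ```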
        
         | 3abiton wrote:
         | When did you do this test? I am curious about the current
         | performance gap.
        
       | ansgri wrote:
       | Used it for general blackbox optimization some 3 years ago,
       | switched to it from comparatively ancient NOMAD [1]. Worked well
       | and easy enough to suggest it as a default choice for similar
       | problems at the time.
       | 
       | [1] https://www.gerad.ca/en/software/nomad/
        
       | nickpsecurity wrote:
       | Another thing you can do is try to optimize hyper parameters with
       | techniques that have fewer or easier hyper parameters. I found
       | several papers where people were using either simulated annealing
       | or differential evolution to optimize the NN's themselves or the
       | hyperparameters. Some claimed to get good results but with
       | higher, computational cost for that component.
       | 
        | I think even a simple NN with a few layers could probably pull
        | it off if you had already categorized the types of data you
        | were training the main model with.
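
        [Editor's note: as a sketch of the idea above, SciPy's
        differential_evolution can drive a hyperparameter search while
        itself needing only a handful of easy knobs (population size,
        mutation, recombination). The objective below is a hypothetical
        quadratic stand-in for a real train-and-validate call, with two
        made-up hyperparameters (log learning rate and dropout).]

        ```python
        from scipy.optimize import differential_evolution

        # Hypothetical proxy for validation loss as a function of two
        # hyperparameters; in practice this would train a model and
        # return its validation metric.
        def val_loss(params):
            log_lr, dropout = params
            return (log_lr + 3.0) ** 2 + (dropout - 0.2) ** 2

        result = differential_evolution(
            val_loss,
            bounds=[(-6.0, 0.0), (0.0, 0.5)],  # log_lr, dropout ranges
            seed=0,
            maxiter=50,
        )
        ```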
        
       | bbstats wrote:
       | HEBO >>>> everything else
        
       ___________________________________________________________________
       (page generated 2024-04-06 23:01 UTC)