[HN Gopher] What Happens to Relicensed Open Source Projects and ...
       ___________________________________________________________________
        
       What Happens to Relicensed Open Source Projects and Their Forks?
        
       Author : zdw
       Score  : 94 points
       Date   : 2024-12-28 22:25 UTC (3 days ago)
        
 (HTM) web link (thenewstack.io)
 (TXT) w3m dump (thenewstack.io)
        
       | msla wrote:
       | Related: "Fear of Forking" by Rick Moen
       | 
       | http://linuxmafia.com/faq/Licensing_and_Law/forking.html
        
         | mschuster91 wrote:
         | > That's why forking is uncommon in open-source code, and even
         | more so in (specifically) GPLed code: The improvements one
         | group makes in its would-be "fork" are freely available to the
         | main community.
         | 
         | Unfortunately, in the smartphone world this just isn't reality.
         | Trying to obtain code dumps is hard enough for major brands,
         | outright impossible for the myriad of cheap clones. And
         | embedded is even worse, almost _no one_ cares about
         | distributing the GPL code of the BSP, mainly due to fear of
         | violating chipset vendor NDAs.
        
           | graemep wrote:
           | > Trying to obtain code dumps is hard enough for major
           | brands, outright impossible for the myriad of cheap clones.
           | And embedded is even worse, almost no one cares about
           | distributing the GPL code of the BSP, mainly due to fear of
           | violating chipset vendor NDAs.
           | 
           | Sounds like an opportunity for the copyright holders to make
           | some money by suing and dual licensing.
        
           | bluGill wrote:
           | For a short time only, improvements in one are available to
           | the other. Then the two diverge and changes cannot be merged.
           | KHTML couldn't bring in any changes from Apple's fork.
        
           | cube2222 wrote:
           | > The improvements one group makes in its would-be "fork" are
           | freely available to the main community.
           | 
           | IANAL, but there's a caveat here, which is that a lot of
           | these forks are due to companies relicensing to source-
           | available licenses, which generally means they require a CLA
           | (and full copyright license) from each of their contributors,
           | so that they can relicense the codebase at will.
           | 
           | The code committed to the fork can't be pulled by the
           | relicensed project in this case, unless it's the original
           | contributor making a contribution to both, because such code
           | would only be covered by the fork's license, not by the new
           | license nor CLA.
        
             | mschuster91 wrote:
             | My biggest gripes are u-boot and the Linux kernel. Both are
             | _clearly_ GPL only, you _must_ provide sourcecode for your
             | modifications including drivers as a vendor, and yet so
             | many make one jump through hoops it's not even close to
             | funny any more. Or they don't fulfill their obligations at
             | all.
        
           | simne wrote:
           | > almost no one cares about distributing the GPL code of the
           | BSP, mainly due to fear of violating chipset vendor NDAs
           | 
           | The main problem with BSP programming is that it is really
           | complicated, because the code needs to fit within the limits
           | of the hardware and you need deep knowledge of the DSP
           | environment.
           | 
           | So it is a very interesting question: who will do complicated
           | things for free, or who will dive deep for free?
           | 
           | Unfortunately, too many people compare apples with carrots,
           | in this case comparing definitively shallow frontend/full-
           | stack programming with hardcore embedded.
           | 
           | And returning to the question: in real life, nobody wants to
           | rewrite all the core code for a BSP. They use huge chunks of
           | ready-made code provided by the vendor, so sure, they are
           | very tightly coupled to vendor copyrights.
           | 
           | Sometimes things are even worse because of hardware
           | limitations, which make it impossible to write the code any
           | other way than the vendor does.
           | 
           | Another problem is regulations: using vendor code you
           | automatically obey the law, or, to be honest, you shift
           | responsibility to the vendor. But if you write your own code
           | from scratch, you will need to somehow create proof that
           | people can trust your code.
        
       | alhirzel wrote:
       | The diversity measures used in this study are a fascinating
       | window into some unusually measurable communities. The count of
       | contributors and volume of their contributions are proxies to
       | many things, including project popularity, ease of contribution,
       | number of approachable fixes (tantamount to how many simple/low-
       | hanging bugs there are), and diversity of use cases exposing the
       | product to new situations (i.e. potential growth of project
       | scope). These things need to align for diversity of contributors'
       | motivations to arise and contributors to approach the project
       | initially, but different things need to arise to sustain
       | involvement: introduction of new bugs, need for completely new
       | features (i.e. a growing project scope), continuing need for
       | refinement of otherwise battle-tested code (i.e. performance
       | gains remaining on the table), and continued relevance as other
       | alternative packages and paradigms rise and fall. I can't wait to
       | read their future work, and I hope it includes measures of
       | project maturity (in the senses of feature-completeness/code
       | quality as well as whether functional scope is growing or not).
       | Surely there are projects that lack contributors for the simple
       | reason that the projects are "done", and surely there are
       | projects where engagement looks like disproportionately many
       | shallow contributions due to immaturity of the product, and
       | surely there are projects that have wider or shallower pools of
       | scope to draw from (as well as management ethoses that readily
       | take on new scope or are avoidant of the same).
       | 
       | Some opposing examples are the Linux kernel (eternally growing
       | scope, with huge motivation by many user communities) and libpng
       | (which is relatively fixed in scope, with desirements like
       | security increasing the bar for contributions to an already
       | mature and popular product).
        
       | tym83 wrote:
       | Hm, I was expecting more of a business point of view in this
       | article. Right now we are looking for information about the
       | financial results of relicensing open source. Unfortunately, it
       | is about repository health. But the article is still
       | interesting.
        
         | jamietanna wrote:
         | There's some more discussion https://thenewstack.io/why-open-
         | source-forking-is-a-hot-butt... that may be interesting.
         | 
         | Are you in the UK by any chance? I'm sure OpenUK would be
         | interested to chat more (given they've been working on research
         | and impact analysis in this area)
        
       | gtirloni wrote:
       | An equally important question is how much of the lack of
       | organizational diversity in forked projects was due to
       | constraints/roadblocks to contributions imposed by the
       | controlling companies.
        
       | noodletheworld wrote:
       | > It is still too early to understand the ultimate success or
       | failure of these projects -- both the original and the fork.
       | 
       | Mm. This is more "here are some projects and their forks" than
       | "what happens to..."
       | 
       | Ie. TLDR; they're both going fine in all cases, so far.
       | 
       | Guess we wait and see eh?
        
       | alexey-salmin wrote:
       | The topic is indeed very interesting but before studying commit
       | author diversity it would be useful to understand the volume and
       | traction. Statistically most of the forks are dead ends, even if
       | maintained by a few enthusiasts for some time.
       | 
       | I'm sure OpenSearch won't die as long as it's a commercial
       | offering of AWS, but how is it going? Are any new features
       | coming, does a product roadmap exist? Or is it mainly bugfixes
       | and maintenance? What about OpenTofu?
       | 
       | Even something basic like a graph of LOC changed over time, with
       | a fork in the middle would help to put the article into
       | perspective.
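
A rough sketch of the "LOC changed over time, with a fork in the middle" view suggested above, assuming commit records are available as (date, deletions, additions) tuples. The function name, the sample rows and the fork date are all hypothetical; they only illustrate the bucketing and are not taken from the article's dataset.

```python
from collections import defaultdict
from datetime import date


def monthly_churn(commits, fork_date):
    """Sum lines added + deleted per calendar month, tagged pre-/post-fork."""
    buckets = defaultdict(int)
    for commit_date, dels, adds in commits:
        phase = "pre-fork" if commit_date < fork_date else "post-fork"
        buckets[(commit_date.strftime("%Y-%m"), phase)] += adds + dels
    # Sort by (month, phase) so the result reads chronologically.
    return dict(sorted(buckets.items()))


# Hypothetical sample data, not real project history.
sample = [
    (date(2023, 7, 3), 10, 40),
    (date(2023, 7, 20), 5, 5),
    (date(2023, 8, 14), 0, 100),
    (date(2023, 9, 2), 30, 20),
]

churn = monthly_churn(sample, fork_date=date(2023, 8, 10))
for (month, phase), lines in churn.items():
    # One '#' per 10 changed lines gives a crude text histogram.
    print(f"{month} {phase:9} {'#' * (lines // 10)} {lines}")
```

Raw churn like this says nothing about the quality of the changes, so it is at best a first look at volume and traction.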
        
         | cube2222 wrote:
         | OpenTofu is doing really well I'd say, and only picking up
         | steam as it's going.
         | 
         | Product roadmap-wise, the team has made some big improvements
         | that have been requested by the community for years, with
         | another big release coming very soon (I believe next week or
         | the one after), here's some of the major ones:
         | 
         | - End-to-End State Encryption - lets you encrypt your state-
         | file end-to-end, either with a key management system like AWS
         | KMS, or static keys.
         | 
         | - Early Evaluation - the ability to parameterize
         | initialization-time values, like module versions and sources,
         | backend configuration parameters, etc. and keep them DRY.
         | 
         | - (Coming in 1.9) - provider iteration, which lets you use
         | for_each with providers, e.g. create one provider per region,
         | something that currently requires a bunch of copy-paste, or
         | tools like Terragrunt
         | 
         | - (Coming in 1.9) - -exclude flag, which is the opposite of the
         | -target flag, letting you skip planning/applying certain
         | resources.
         | 
         | Probably the best way to see a summary is check out the release
         | blog posts for 1.7[0], 1.8[1], and 1.9-beta[2]. Many of those
         | required non-trivial changes to existing parts of the codebase.
         | 
         | One of the biggest Terraform contributors also joined
         | Spacelift a couple months ago to work on OpenTofu. All things
         | considered, I'm very confident that the team will be able to
         | handle any feature it sets its mind to, and that those
         | improvements will keep coming. There's a ranking of top-voted
         | issues which is probably the best way to loosely see what will
         | be tackled next[3].
         | 
         | [0]: https://opentofu.org/blog/opentofu-1-7-0/
         | 
         | [1]: https://opentofu.org/blog/opentofu-1-8-0/
         | 
         | [2]: https://opentofu.org/blog/opentofu-1-9-0-beta1/
         | 
         | [3]: https://github.com/opentofu/opentofu/issues/1496
         | 
         | Disclaimer: I am involved in the OpenTofu project and was
         | previously its tech lead.
        
           | gioazzi wrote:
           | Provider iteration is a really nice one - I had a big
           | monorepo that would deploy some baseline services in many AWS
           | accounts, across multiple regions, generating tf.json files
           | for each provider to match all accounts that were created.
           | 
           | However, what really broke this model at some point was the
           | fact that we were running so many provider instances that
           | our Terraform Cloud would run out of memory! Since each
           | provider instance in tf is really launching a new process it
           | really adds up... At some point I was thinking since the
           | engine and the providers use gRPC to communicate, it MAY be
           | possible to distribute providers across machines, but I never
           | investigated it further... I'm pretty sure there was a notice
           | in the tf plugin SDK stating that it was not possible to
           | connect them over a network... but why not? ¯\_(ツ)_/¯
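
A minimal sketch of the "generate tf.json files for each provider" approach described above, assuming Terraform's JSON configuration syntax, where several configurations of one provider are written as a list of objects, each with its own `alias`. The helper name, the `deploy` role name and the account IDs are hypothetical placeholders; this is not the commenter's actual tooling.

```python
import json


def provider_config(accounts, regions, role_name="deploy"):
    """Build a JSON-syntax document with one aliased AWS provider
    per (account, region) pair."""
    providers = []
    for account_id in accounts:
        for region in regions:
            providers.append({
                # Aliases must be valid identifiers, so dashes become underscores.
                "alias": f"acct_{account_id}_{region.replace('-', '_')}",
                "region": region,
                # Hypothetical role; assumes cross-account access via assume_role.
                "assume_role": {
                    "role_arn": f"arn:aws:iam::{account_id}:role/{role_name}"
                },
            })
    return {"provider": {"aws": providers}}


# Placeholder account IDs and regions for illustration only.
doc = provider_config(["111111111111", "222222222222"],
                      ["us-east-1", "eu-west-1"])
# In practice this would be written to a file such as providers.tf.json.
print(json.dumps(doc, indent=2))
```

Each entry in that list becomes a separate provider process at run time, which is why the memory cost grows with accounts times regions.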
        
             | cube2222 wrote:
             | Yeah, esp. the AWS provider is pretty memory-intensive.
             | 
             | I believe someone on the team did some investigation into
             | this (running providers remotely) but it's not really a
             | priority (if it is for you, feel free to voice that on the
             | issue tracker!).
             | 
             | Frankly though, with pricing for cloud instances being
             | generally linear wrt the CPU/memory size of the
             | instance, I don't think there's much reason to prefer many
             | smaller machines over just using a larger single one and
             | avoiding all this added complexity.
        
         | bhaak wrote:
           | We're just migrating to OpenSearch from Xapian and it was one
           | of the questions we had. We didn't want to go from a solution in
         | maintenance mode to one which could be a dead end soon.
         | 
         | From what we've seen there is lots of active development going
         | on with many features being added.
         | 
         | But we don't yet use any fancy features and could easily switch
         | to ElasticSearch if need be. So we got a backup plan.
        
           | chippiewill wrote:
           | > But we don't yet use any fancy features and could easily
           | switch to ElasticSearch if need be.
           | 
           | I think the decision paralysis is the big deal here. I have
           | the exact same situation with OpenTofu and Terraform, they're
           | diverging rapidly yet it's not entirely clear which way the
           | wind is blowing. They both now have compelling and
           | interesting features that the other doesn't have.
           | 
           | So the outcome is that I'm now not using any new features.
        
             | cube2222 wrote:
             | I think many people have been in this situation.
             | 
             | In practice, and I'm extremely biased here, I'd consider
             | the most risk-averse option to be going with OpenTofu but
             | not using any of its exclusive new features. With this you
             | get dependency updates and the widest competitive range of
             | vendors in case you ever want to use a commercial
             | orchestrator service for it.
             | 
             | However, it seems to me folks at companies of all sizes are
             | increasingly deciding to bite the bullet and migrate, esp.
             | since the last release a couple months ago. E.g. see the
             | talk by Fidelity[0] on OpenTofu Day at Kubecon.
             | 
             | [0]: https://youtu.be/7Ypulc2GyoE
             | 
             | Disclaimer: I am involved in the OpenTofu project and was
             | previously its tech lead.
        
       | antirez wrote:
       | Every time I see some post where commits are taken as
       | contribution metrics, I remember when, after working for months
       | at it, I merged Redis Cluster into Redis as a single commit, and
       | saw the pale green square appearing for that day in my GitHub
       | contributions chart. I have now been working 10h/day on Redis
       | Vector Sets for 1.5 months, and they will also be a single
       | commit. It's very simple to do better than that as a metric:
       | different developers have different habits. For me, a stream of
       | changes from early-day design just pollutes the contribution
       | history. Starting from a solid and advanced beta, then yes,
       | history is great to have.
        
         | sgc wrote:
         | I think the authors agree with you. They tried to look at lines
         | of code added / deleted (eg "they consistently made over 95% of
         | the lines added to and deleted from Elasticsearch") - although
         | the language in the article flops between that and just saying
         | 'commits', so it's not clear what they were actually looking at
         | for the write up. In their scraping code / dataset linked at
         | the start of the article, they are logging `commits_list =
         | [commit_date, dels, adds, oid, author]`.
         | 
         | This is also just a blog summary of a preliminary study:
         | 
         | > "This is the first step in a much larger research project
         | underway [...] we're working toward including more repositories
         | and additional metrics to better understand the project health
         | dynamics within these projects."
         | 
         | Project activity will remain inherently fuzzy. Just about
         | everybody who programs extensively has spent a couple days to
         | change a line or two of code at some point in their life. No
         | metric can capture that unless we are all journaling and
         | publishing our life activities.
         | 
         | Nonetheless we can do better than commits, as you said. If you
         | review most anything online, there is a global score and then
         | 3-5 categories with subscores. Surely the same should be true
         | here. Freshness of LOC changes, average freshness of the
         | overall codebase as a percent, issues satisfactorily resolved
         | (and not closed because they are blown off, which should be a
         | negative indicator), merged pull requests, to think offhand of
         | a few.
         | 
         | What would be your top 5 categories to evaluate the "health" of
         | a code base, admitting that any evaluation will remain a very
         | fuzzy approximation at best?
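
To make the commits-versus-lines gap concrete, here is a small sketch assuming records shaped like the `[commit_date, dels, adds, oid, author]` rows mentioned above. The function name and the sample rows are invented for illustration and are not drawn from the study's data.

```python
from collections import Counter


def contribution_shares(commits_list):
    """Return {author: (share_of_commits, share_of_lines_changed)}."""
    commit_counts = Counter()
    line_counts = Counter()
    for commit_date, dels, adds, oid, author in commits_list:
        commit_counts[author] += 1
        line_counts[author] += adds + dels
    total_commits = sum(commit_counts.values())
    total_lines = sum(line_counts.values()) or 1  # avoid dividing by zero
    return {
        author: (commit_counts[author] / total_commits,
                 line_counts[author] / total_lines)
        for author in commit_counts
    }


# Invented rows: one author lands months of work as a single large commit,
# another lands nine small commits.
rows = [("2024-01-15", 0, 9000, "c0", "solo")]
rows += [("2024-01-16", 0, 100, f"c{i}", "team") for i in range(1, 10)]

result = contribution_shares(rows)
for author, (by_commits, by_lines) in result.items():
    print(f"{author}: {by_commits:.0%} of commits, {by_lines:.0%} of lines")
```

The two shares can point in opposite directions for the same history, which is the squash-merge effect described earlier in the thread. Neither number captures the days spent on a one-line fix.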
        
           | antirez wrote:
           | The data about Redis can be true only if they mean "commits".
           | This is why I believe they checked Github contributions
           | numbers.
        
             | sgc wrote:
             | In that case they did not evaluate it with enough care,
             | given they gathered more information than that. Hopefully
             | they correct that as they progress.
             | 
             | I am quite curious as to your take on a few metrics that
             | would help evaluate the health of a code base. It's a dirty
             | job, but we all have to do it every time we look for
             | something new.
        
       | drdrek wrote:
       | As with many more modern "Open source" projects, the openness is
       | more of a "You can try for free, and once you need production
       | level SLA come call us". While the source code is available to
       | look at, it's more of a facade, as almost no one changes it but
       | the owning company. Outside contribution is limited by the
       | complexity of the project, maintainer politics and IP around
       | supporting tools and materials.
       | 
       | OSS has basically Ship-of-Theseus'd into something completely
       | different. Not a criticism, just an observation.
        
         | Arrowmaster wrote:
         | One key element often overlooked but mentioned in the recent
         | post from Fusion Auth is the Business Continuity aspect. If
         | your provider suddenly shuts down but the product was open
         | source and self hostable, you can pay someone else to keep it
         | working while you work on a migration plan on your own
         | timeline.
         | 
         | The more open the license, the more options available.
        
           | Galxeagle wrote:
           | Also from a license negotiation perspective, gives the buying
           | company the option to threaten to self-host and/or fork. Even
           | if they never do (and I'm sure the source company is very
           | careful to balance the value story), it can act as a ceiling
           | for rate increases or other annoying business practices.
           | 
           | Why would a company ever open-source their product then?
           | Giving up that complete leverage can be a selling point
           | during the purchasing process, making buyers more comfortable
           | that they won't be (completely) locked in, and be a net
           | positive on revenue through faster sales.
        
       | p0w3n3d wrote:
       | I'm scared of people measuring code in "lines added" and "lines
       | deleted". Tbh sometimes a good fix removes 10 lines and adds
       | one, but a good one. I can also imagine that all of the merge
       | requests are approved by the company "owning" the open source
       | project, hence after a rebase the author will always be from
       | this company...
       | 
       | I understand that the companies probably did the majority of
       | the work. However, I can't put my finger on what is off about
       | this comparison... It sounds strange and inaccurate.
        
       ___________________________________________________________________
       (page generated 2024-12-31 23:01 UTC)