[HN Gopher] OpenFeature - a vendor-agnostic, community-driven AP...
       ___________________________________________________________________
        
       OpenFeature - a vendor-agnostic, community-driven API for feature
       flagging
        
       Author : gjvc
       Score  : 184 points
       Date   : 2024-10-25 01:26 UTC (21 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | cmckn wrote:
       | Looks like an interesting project. Really cute logo. :)
       | 
       | How much does the flagd sidecar cost? Seems like that could be a
       | lot of overhead for this one bit of functionality.
        
       | sgammon wrote:
       | java version embeds lombok symbols lol
        
         | abrahms wrote:
         | Forgive my ignorance, but what should it be doing instead?
        
           | martypitt wrote:
           | Lombok is a very divisive framework in Java, with strong
           | opinions on both sides.
           | 
            | Given that, it's a bold choice to include Lombok in a library
            | that other developers will pull into their stack - it's
            | likely to make this a non-starter for those in the 'no'
            | camp.
           | 
           | As Lombok is just compiler sugar, when building an SDK for
           | other developers, it's probably less alienating to just write
           | the boilerplate that Lombok saves you from.
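            | 
            | For illustration, a sketch of the trade-off (editor's
            | example; the class names are made up). The two versions
            | below are equivalent, and the second is all an SDK author
            | has to write to drop the dependency:
            | 
            |     // With Lombok: one annotation generates the
            |     // constructor, getters, equals/hashCode and toString
            |     // at compile time.
            |     @lombok.Value
            |     public class FlagDetails {
            |         String key;
            |         boolean enabled;
            |     }
            | 
            |     // Without Lombok: the same class written out by hand.
            |     public final class FlagDetailsPlain {
            |         private final String key;
            |         private final boolean enabled;
            | 
            |         public FlagDetailsPlain(String key, boolean enabled) {
            |             this.key = key;
            |             this.enabled = enabled;
            |         }
            | 
            |         public String getKey() { return key; }
            |         public boolean isEnabled() { return enabled; }
            |     }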
        
             | ahtihn wrote:
             | Lombok is a compile-time dependency. Consumers of a library
             | using lombok don't need to depend on lombok, so I don't see
             | why it would matter?
        
       | sabedevops wrote:
       | Where is the tldr? Anyone familiar...what does this do and why do
        | we care about it being standards-based?
        
         | MeteorMarc wrote:
         | https://openfeature.dev/docs/reference/intro/
        
         | taveras wrote:
         | This is a "standard" SDK for feature flags, allowing you to
         | avoid vendor lock-in.
         | 
          | i.e., using feature flag SaaS ABC but want to try out XYZ?
          | if you're using ABC's own SDK, you'd have to refactor your
          | codebase.
         | 
         | I appreciate that you can use the OpenFeature SDK with
         | environment variables, and move into a SaaS (or custom)
         | solution when you're ready.
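          | 
          | A minimal sketch of that portability with the OpenFeature
          | Java SDK (editor's example; the flag key and the commented-
          | out provider are made up). The point is that swapping
          | vendors means swapping one setProvider line, not the call
          | sites:
          | 
          |     import dev.openfeature.sdk.Client;
          |     import dev.openfeature.sdk.OpenFeatureAPI;
          | 
          |     public class CheckoutFlags {
          |         public static void main(String[] args) {
          |             OpenFeatureAPI api = OpenFeatureAPI.getInstance();
          |             // Vendor choice lives on this one line:
          |             // api.setProvider(new SomeVendorProvider());
          |             Client client = api.getClient();
          | 
          |             // The second argument is the default used when
          |             // no provider is set or the flag can't resolve.
          |             boolean on = client.getBooleanValue("new-checkout", false);
          |             System.out.println("new-checkout: " + on);
          |         }
          |     }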
        
         | gjvc wrote:
         | the laziness on this site never ceases to amaze
        
           | gjvc wrote:
           | and the use of "we" to somehow give the impression that this
           | person speaks for everyone
        
       | MeteorMarc wrote:
       | Martin Fowler about feature flags:
       | https://martinfowler.com/articles/feature-toggles.html
        
         | hobofan wrote:
          | No, that is Pete Hodgson on martinfowler.com. Most articles on
         | martinfowler.com haven't been written by Martin Fowler himself
         | in years. It's best thought of as a publishing venue for
         | Thoughtworks.
        
           | jonathannorris wrote:
           | Pete is a great guy, also on the OpenFeature governance board
           | :)
        
       | fire_lake wrote:
       | I don't get it. Why is this needed above and beyond the standard
       | ways of configuring deployed services?
        
         | ygouzerh wrote:
          | Do you mean feature flags? These enable you to change
          | configuration at runtime. Ex: A/B testing, changing a
          | behavior for a subset of users, disabling a feature when you
          | want to (particularly useful when you do Trunk Based
          | Development and don't want to deploy a beta feature to
          | everyone, for example).
        
           | fire_lake wrote:
           | No I mean an entire framework and set of software components
           | for doing feature flags.
        
             | GiorgioG wrote:
              | Clearly you haven't worked at an org that uses something
              | like this extensively (LaunchDarkly, for example).
        
             | skeeter2020 wrote:
              | If you have a single database then maybe you can (and
              | should?) just start with a basic, single-table approach,
              | but as you grow in size and complexity FF management can
              | become a challenge, with reporting gaps and feature
              | release management. I usually see two characteristics
              | with the former approach: growth in the # of FFs over
              | time and a messy Excel report of what they are, what
              | they do, and whether anyone still hits the old code.
              | This might be fine for a while, or forever, but often
              | gets painful.
        
           | echoangle wrote:
           | But why do you need an external service for that? Isn't that
           | basically a single DB table with a name and an on/off value
           | for each flag (or maybe an integer for multiple options)?
        
             | hobofan wrote:
              | In its simplest incarnation, yes, it could be just a
              | single DB table with boolean flags.
              | 
              | However, there are a lot of connected needs that most
              | real-world usages run into:
             | 
             | - Per-user toggles of configuration values
             | 
             | - Per-user dynamic evaluation based on a set of rules
             | 
             | - Change history, to see what the flag value was at time of
             | an incident
             | 
             | - A/B testing of features and associated setting of
             | tracking parameters
             | 
             | - Should be controllable by e.g. a marketing/product
             | manager and not only software engineers
             | 
              | That can quickly grow into something where it's a lot
              | easier to reach for an existing, well-thought-out
              | solution than to try to home-grow one. (The per-user
              | rules alone look something like the sketch below.)
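              | 
              | A rough sketch of just the "per-user dynamic
              | evaluation" point (editor's example; the field names
              | are illustrative, not any vendor's schema). Each flag
              | becomes a record with targeting rules evaluated per
              | request rather than a stored boolean:
              | 
              |     import java.util.Set;
              | 
              |     public class RuleFlags {
              |         // One flag: default value, explicit
              |         // allow-list, and a sticky percentage rollout.
              |         record Flag(String key, boolean defaultOn,
              |                     Set<String> allowUsers,
              |                     int rolloutPercent) {}
              | 
              |         static boolean evaluate(Flag flag, String userId) {
              |             if (flag.allowUsers().contains(userId)) return true;
              |             // Hash the user id into 0..99 so each user
              |             // keeps the same bucket across requests.
              |             int bucket = Math.floorMod(userId.hashCode(), 100);
              |             if (bucket < flag.rolloutPercent()) return true;
              |             return flag.defaultOn();
              |         }
              |     }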
        
               | random_kris wrote:
                | In a microservice world, do you want to track features
                | in each service, or have a single source of truth
                | using flagd? I prefer a central source of truth.
        
               | jacobr1 wrote:
               | In a microservice world obviously you'd have a feature-
               | flag service. But you still have a build/buy
               | consideration.
        
               | j45 wrote:
                | Great summary. The more parties involved, and the more
                | configurations there are, the more management of
                | details is needed.
        
       | taveras wrote:
        | As someone who's been thinking about feature toggles and
        | continuous delivery a lot lately, I've found OpenFeature
        | helpful.
       | 
       | Kudos to the team!
        
       | chromanoid wrote:
        | I can see that this might be very useful, since it is really a
        | kind of application configuration specification that goes far
        | beyond simple flags. In the end, a common provider that works
        | securely across all services and clients is probably the real
        | problem.
        
       | random_kris wrote:
        | Nice! Some time ago I made a small PoC using it on a frontend
        | (Next.js), a backend (JS), and a flag provider (flagd + a flag
        | API that serves JSON flags from a DB)
       | 
       | Cool stuff
       | 
       | https://github.com/grmkris/openfeature-flagd-hono-nextjs
        
       | adontz wrote:
       | A little story from personal experience.
       | 
        | Most people think of feature flags as boolean on/off
        | switches, maybe per-user on/off switches.
        | 
        | If one is testing shades of colors for a "Buy Now!" button,
        | that may be OK. For more complex tests, my experience is that
        | there are not a lot of users who tolerate experiments. Our
        | solution was to represent feature flags as thresholds. We
        | assigned a decimal number in [0.0, 1.0) to each user (we
        | called it courage) and a decimal number in [0.0, 1.0] to each
        | feature flag (we called it threshold). That way we not only
        | enabled more experimental features for the most experiment-
        | tolerant users, but these were the same users, so we could
        | observe interaction between experimental features too. Also,
        | deploying a feature was as simple as raising its threshold to
        | 1.0. User courage was 0.95 initially and could be updated
        | manually. We tried to regenerate it daily based on surveys,
        | but without much success.
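        | 
        | In code the whole scheme is a single comparison (editor's
        | sketch of the description above; per the follow-up below,
        | courage 0.0 is maximum courage, so a user sees a feature when
        | their courage is below the flag's threshold):
        | 
        |     public class ThresholdFlags {
        |         // courage: per-user, in [0.0, 1.0); lower = more
        |         // experiment-tolerant. threshold: per-flag, in
        |         // [0.0, 1.0]; raising it to 1.0 releases the
        |         // feature to every user.
        |         static boolean isEnabled(double courage, double threshold) {
        |             return courage < threshold;
        |         }
        | 
        |         public static void main(String[] args) {
        |             System.out.println(isEnabled(0.95, 0.10)); // default user: off
        |             System.out.println(isEnabled(0.10, 0.50)); // tolerant user: on
        |             System.out.println(isEnabled(0.95, 1.00)); // released: on for all
        |         }
        |     }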
        
         | vault wrote:
         | From your naming, I would have done the opposite :) Start with
         | courage 0.05 and show experiments whenever it is greater than
         | the threshold. To enable a feature for everybody, you lower the
         | threshold to 0.
         | 
         | How did you measure "experiment tolerance"?
        
           | adontz wrote:
           | Yeah, naming was bad. Courage 0 is maximum courage.
           | 
           | >> How did you measure "experiment tolerance"?
           | 
            | Feedback from CS mostly. No formal method. We tried to
            | survey clients to calculate a courage metric, but failed
            | to come up with anything useful.
        
         | withinboredom wrote:
         | Interesting. At one place I worked, employees were excluded
         | from experiments (they had to enable the flag personally to see
         | them) by default. At one point, we had so many experiments that
         | literally nobody except employees were using "that version" of
         | the software. Everyone else was using some slightly different
         | version (if you counted each feature as a version), and there
         | were thousands and thousands of versions in total.
         | 
         | We ended up creating just ~100 versions of our app (~100
         | experiment buckets), and then you could join a bucket. Teams
         | could even reserve sets of buckets for exclusive
         | experimentation purposes. We also ended up reserving a set of
         | buckets that always got the control group.
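          | 
          | A common way to make fixed buckets like these sticky
          | (editor's sketch of the generic technique, not the poster's
          | actual code) is to hash a stable user id into one of N
          | buckets, so no per-user assignment has to be stored:
          | 
          |     import java.nio.charset.StandardCharsets;
          |     import java.util.zip.CRC32;
          | 
          |     public class Buckets {
          |         static final int BUCKET_COUNT = 100;
          | 
          |         static int bucketFor(String userId) {
          |             CRC32 crc = new CRC32();
          |             crc.update(userId.getBytes(StandardCharsets.UTF_8));
          |             // Same id -> same checksum -> same bucket.
          |             return (int) (crc.getValue() % BUCKET_COUNT);
          |         }
          |     }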
         | 
         | You've approached it a different way, and probably a more
         | sustainable way. It's interesting. How do you deal with the
         | bias from your 'more courageous' people?
        
           | tauntz wrote:
           | > At one point, we had so many experiments that literally
           | nobody except employees were using "that version" of the
           | software. Everyone else was using some slightly different
           | version
           | 
           | Was this at Spotify by any chance? :)
        
             | withinboredom wrote:
             | No.
        
           | adontz wrote:
           | >> How do you deal with the bias from your 'more courageous'
           | people?
           | 
           | That's a great question. We had no general solution for that.
           | We tried to survey people, but results were inconclusive, not
           | statistically significant.
        
             | withinboredom wrote:
             | I mean that "courageous" people are more likely to take
             | risks and accept new features and thus probably more likely
             | to be attracted to novelty (see: Novelty Effect) and
             | require longer experiments to understand the actual impact.
        
         | vasco wrote:
         | > User courage was 0.95 initially and could be updated
         | manually. We tried to regenerate it daily based on surveys, but
         | without much success.
         | 
          | Based on this ending, the courage bit sounds clever but
          | misguided. It adds the complexity of a whole other variable,
          | yet you have no way of measuring it or even assessing it
          | well.
          | 
          | I thought you were going to describe how you update courage
          | based on the statistical usage of new features vs. old
          | features when users are exposed to them, meaning people who
          | keep using the product when it changes have more courage, so
          | they see more changes more often. But surveying for courage
          | (or how easily people deal with change) is probably the
          | worst way to assess it.
          | 
          | But even then, I don't know what purpose it would serve,
          | because now you've destroyed your A/B test by selecting a
          | very specific subpopulation, so your experiment / feature
          | results won't be good. I'm assuming here that a product
          | experimentation approach is being used, not just "does it
          | work or not" flags.
        
           | adontz wrote:
            | Mostly functional changes. Like deploying a new parser,
            | which may not support all the old files. There were users
            | who would contact customer support in a panic, stating
            | that their lives were ruined by this change, and there
            | were users who were like, "just have that fixed by next
            | quarter."
        
             | j45 wrote:
             | What's important is if it worked for you and your audience.
             | 
              | There's no standard requiring something to work for
              | everyone, or making it less valuable if it doesn't.
        
         | siva7 wrote:
          | Sounds like an overengineered solution to something that can
          | be solved as simply as with a checkbox "I would like access
          | to experimental features" in the UI.
        
           | jasfi wrote:
           | I'd go with that option too. I don't think users want to be
           | surprised with being experimented on. Some users could take
           | it worse than others.
        
           | adontz wrote:
            | I respectfully disagree. It depends on the number and
            | severity of experiments. Comparing two decimals is really
            | no harder than checking a boolean - still a single "if". I
            | don't see much over-engineering here.
        
           | j45 wrote:
           | Getting one or a few new features is one thing, getting too
           | many might be too much.
           | 
           | Some granularity and agency for the user is valuable. Maybe
           | let them pick everything as a whole or a few features at a
           | time.
        
         | skeeter2020 wrote:
          | This seems really complex, specifically in the area where I
          | find product, CS & marketing least likely to want it:
          | targeting and controlling their audience. Sounds like a cool
          | thought experiment, fun and challenging to implement, but
          | not really practical or useful.
         | 
         | If you have a huge userbase and deploy very frequently FFs are
         | great for experiments, but for the rest of us they're primarily
         | a way to decouple deploys from releases. They help with the
         | disconnect between "Marketing wants to make every release a big
         | event; Engineering wants to make it a non-event". I also find
         | treating FFs as different from client toggles is very important
         | for lifecycle management and proper use.
         | 
          | More than the binary nature, I think the bigger challenge is
          | that FFs are almost always viewed as a one-way path
          | "Off->On->Out", but what if you need to turn them off and
          | then back on again? That can be very hard to do properly if
          | a feature is more than UI: it might cause data to be created
          | or updated that the old code then clobbers, or issues
          | between subsystems, like microservices that aren't as "pure"
          | as you thought.
        
           | adontz wrote:
           | Yes, it's not a good solution. Targeting was missing, good
           | catch. I've just shared an unusual experience to inspire
           | further experimenting.
        
         | lucideer wrote:
         | This seems like it would skew the data significantly for
         | certain use-cases.
         | 
         | Unless you're feature flagging to test infra backing an
         | expensive feature (in which case, in a load-balancer /
          | containerised world, bucketing is going to be a much better
          | approach than anything at application level), then you
         | most likely want to collect data on acceptance of a feature. By
         | skewing it toward a more accepting audience, you're getting
         | less data on the userbase that you're more likely to lose. It's
         | like avoiding polling swing states in an election.
        
         | remram wrote:
         | What does "tolerating experiments" mean? If they can tell it's
         | an experiment, then isn't your change bad?
         | 
         | Do you mean "tolerate change"? But then you still eventually
         | roll out the change to everyone anyway...
         | 
         | Or do you mean that users would see a different color for the
         | "buy now" button every day?
         | 
         | From a purely statistical point of view, if you select users
         | which "tolerate" your change before you measure how many users
         | "like" your change, you can make up any outcome you want.
        
           | j45 wrote:
           | I suspect it's because some users will actually be pioneers
           | and early adopters vs believing they are.
           | 
           | This kind of threshold adds some flexibility into the
           | subjectivity of finding the best cohort to test a feature
           | with.
        
             | remram wrote:
             | Where the best cohort to test with is the one that agrees
             | with you...
             | 
             | You can call this measure "courage" but that is not
             | actually what you are measuring. What you measure is not
             | that different from agreement.
        
               | j45 wrote:
               | I didn't use the word courage, still I understand what
               | you're saying.
        
               | remram wrote:
               | adontz did above, that's what they called this user-
               | tolerance-for-experiments metric. I didn't mean to imply
               | you would too, apologies.
        
               | j45 wrote:
               | Oh, no need to apologize at all.
               | 
               | I could have clarified as well that I was leaning more
               | towards the user-tolerance... or as I like to call it
               | user-guess that this feature might be OK with them :)
               | 
               | Another thing I like about granular and flexible feature
               | flag management is you can really dial in and learn from
               | which features get used by whom, actually.... instead of
               | building things that will collect dust.
        
           | inhumantsar wrote:
           | I think you might be mixing things up a bit.
           | 
           | the tolerance score wouldn't be tied to a specific change.
           | it's an estimate of how tolerant a person is of changes
           | _generally_.
           | 
           | it's not that different from asking people if they want to be
           | part of a beta testers group or if they would be open to
           | being surveyed by market researchers.
           | 
           | targeting like that usually doesn't have a significant impact
           | on the results of individual experiments.
        
             | remram wrote:
             | If only people who like changes like your change, should
             | you really go ahead?
             | 
             | Plus you don't know what that correlates to. Maybe being
             | "tolerant of changes" correlates with being particularly
             | computer-savvy, and you're rolling out changes that are
             | difficult to navigate. Maybe it correlates to people who
             | use your site only for a single task, it would appear they
             | don't mind changes across the platform, but they don't see
             | them. Maybe it correlates with people who hate your site
             | now, and are happy you're changing it (but still hate it).
             | 
             | You can't use a selected subset that is not obviously
              | uncorrelated with your target variable. This is selection
             | bias as a service.
        
       | webda2l wrote:
        | A coding world with more standardization would be a better
        | world.
        | 
        | This week, for example, I came across "Standardized Interface
        | for SQL Database Drivers"
        | https://github.com/halvardssm/stdext/pull/6 and then
        | https://github.com/WICG/proposals/issues too.
        | 
        | It's huge work to get everybody on the same page (my previous
        | example isn't off to a great start, for instance:
        | https://github.com/nodejs/node/issues/55419), but when it's
        | done, and done right, it's a huge win for developers.
        | 
        | PHP PSR, RFC & co are the way.
        
         | oddevan wrote:
         | I was thinking of PSR interfaces when I was reading this!
        
       | bullcitydev wrote:
       | Speaking as an open-source feature flag 'vendor'
       | (https://github.com/flipt-io/flipt), the OpenFeature organization
       | has been a joy to work with. They are very welcoming of new
       | contributors (e.g., implementing a provider SDK in a new
       | language).
       | 
        | If you're interested in this space I'd recommend lurking in
        | their CNCF Slack channel
        | https://cloud-native.slack.com/archives/C0344AANLA1 or joining
        | the bi-weekly community calls
        | https://community.cncf.io/openfeature/.
        
       | DandyDev wrote:
       | Are there any big feature flag SaaS vendors that support this?
       | Like LaunchDarkly, Flagsmith, Unleash etc?
        
         | aepfli wrote:
          | Yes, there are. As I am part of the OpenFeature community, I
          | have to point you to https://openfeature.dev/ecosystem,
          | where you'll see all kinds of providers that are supported
          | (some officially, some by the community)
        
         | dabeeeenster wrote:
         | Hey there - one of the Flagsmith founders here - yes we are
         | supporting it, building adapters for our SDKs and I'm on the
         | CNCF project governance board.
         | 
          | We've got the core functionality pretty much down now, so
          | there are some more interesting/challenging components to
          | think about, like Event Tracking
          | (https://github.com/open-feature/spec/issues/276) and the
          | Remote Evaluation Protocol
          | (https://github.com/open-feature/protocol)
        
         | zellyn wrote:
         | LaunchDarkly has a mix of OpenFeature providers they wrote, and
         | quite reasonable community-contributed ones, depending on
         | language. They are also very actively engaged with OF in
         | meetings, discussions, etc.
         | 
         | (We are a big LD user at work.)
        
         | andrewdmaclean wrote:
         | Hey there! Andrew here, Community Manager for OpenFeature and
         | DevRel lead at DevCycle. We (DevCycle) have worked hard to
         | ensure an OpenFeature Provider is available for every language
         | supported by OpenFeature and for which we have an SDK
         | (https://docs.devcycle.com/integrations/openfeature)
        
       | vhodges wrote:
       | I am looking to maybe support this in
       | 
       | https://github.com/vhodges/ittybittyfeaturechecker
       | 
       | probably via https://openfeature.dev/specification/appendix-c (I
       | don't have time to maintain a bunch of providers).
       | 
       | We are evaluating new solutions at work and OpenFeature is
       | something we're interested in. (I did the home grown solution
       | that's in use by one product line)
        
       | caniszczyk wrote:
        | I hope that OpenFeature changes the feature flagging space the
        | same way that OpenTelemetry impacted the o11y space; we are
        | overdue for this (in my biased opinion)
        
       ___________________________________________________________________
       (page generated 2024-10-25 23:01 UTC)