[HN Gopher] How I write HTTP services in Go after 13 years
___________________________________________________________________
How I write HTTP services in Go after 13 years
Author : matryer
Score : 704 points
Date : 2024-02-09 19:00 UTC (1 days ago)
(HTM) web link (grafana.com)
(TXT) w3m dump (grafana.com)
| ballresin wrote:
| What is the value of making main.go as small as possible?
|
| Whose dreams come true in this scenario?
| sebastianz wrote:
| The initialization has to be done in a separate function that
| you call from the setup code for your end-to-end tests.
| janosdebugs wrote:
| If you do, you can use the application as a library and most of
| your code will also be easier to test.
| mattboardman wrote:
| Usually your main function can't be used by any other part of
| your program. You should move all component implementations to
| modules so they can be re-used elsewhere.
| mosselman wrote:
| "Whose dreams come true in this scenario?"
|
| I love this! I will use this as well.
|
| There are so many situations where I have a feeling that people
| are solving problems that don't exist. In code I run into at
| work, code and projects I see online, etc
|
| The "whose dreams are you making come true" really applies
| here, because dreams are exactly what they are.
|
| I spent quite some time writing an automatic image resizer and
| optimiser for my blog. Does it matter? No! Should I have spent
| that time writing blog posts instead? Yes! Still I was chasing
| some dream.
|
| Thanks for this image
| arccy wrote:
| not main.go but func main. This allows your run function to
| return an error and you only need to deal with the abruptness
| of using os.Exit once
| pphysch wrote:
| For bespoke internal services, I like to keep main.go as flat
| as reasonable, like a "script". Handlers can have their own
| files but the bulk of the control flow and moving parts should
| be apparent from reading the main file.
|
| Abstracting things away from main makes it less readable and is
| generally pointless for bespoke services that will be deployed in
| exactly one configuration.
| donio wrote:
| That's a nice way of putting it. When exploring a new
| codebase for the first time it can be very helpful to have
| main.go give you a high level idea about the overall
| structure of the program.
| abuani wrote:
| The author goes on to explain a few scenarios where the pattern
| is helpful. It's not to keep main.go as small as possible, it's
| so that you can test parts of your main.go file properly. In my
| experience, if all of my logic is stuffed into `func main()
| {}`, then I can't actually test it. If I have a helper
| method(like run in this case), I can test out specific
| scenarios and ensure the application handles it properly. Some
| of the examples Mat gave were handling context cancellations
| properly.
| jrockway wrote:
| I've never been a fan of making main.go one line. I create the
| logger, parse the flags, create objects from the flags, and
| call Run() or something. In the tests, you aren't ever going to
| do those things in the same way, so there is really no point in
| putting them in some other file.
| dilyevsky wrote:
| The idea is to keep the untestable code as small as possible
| but in practice you just add a layer of indirection and all of
| your untestable init code is in a different castle.
| perbu wrote:
| I assume you mean main() and not main.go.
|
| main() is the only place where you can't return an error. In
| order to keep as much of the code as idiomatic as possible you
| just call something like run() where you can do so.
|
| In addition there is the testing aspect. You can't invoke
| main() from your tests.
| arccy wrote:
| i really like the patterns in this post, pretty much what i've
| also settled on after much experimentation with different styles.
| romantomjak wrote:
| Great article with lots of interesting ideas. Can't believe I
| didn't know about signal.NotifyContext. Finally I'll be able to
| actually remember how to respond to signals instead of
| copy-pasting that between projects.
| sesm wrote:
| TLDR: optimize for unit tests and do DI with explicit function
| arguments. Looks kind of similar to Dropwizard.
| lelandbatey wrote:
| I want to see a greater acceptance of this idea:
|
| > My handlers used to be methods hanging off a server struct, but
| I no longer do this. If a handler function wants a dependency, it
| can bloody well ask for it as an argument. No more surprise
| dependencies when you're just trying to test a single handler.
|
| For HTTP services in any language, your handlers will usually end
| up with a lot of business logic, logic which probably has many
| dependencies. I see single handlers using all of the following on
| a regular basis: DB, cache, blob storage, some kind of special
| authz thing specific to your endpoints, maybe some fancy
| licensing checker, a queue or two, a specialized logger, and
| specialized metrics client. Many of those (metrics,
| request/response logging) can live in middlewares _most_ of the
| time, but in every code base there will be times where you need
| to do something custom with one or the other. As time passes, the
| more I wonder "why aren't these all just function parameters?"
|
| Yes, that would be a lot of function parameters (9+ for a single
| handler, before even getting into the request or custom params
| themselves), and we all have many rules of thumb and linter rules
| which try to keep us from having lots of function parameters. But
| it's not like we're not writing code which depends on all those
| dependencies, instead we're just sticking them on the "server"
| class/struct and pretending that because the method signature is
| shorter, we have fewer dependencies!
|
| As time passes, I find myself wishing more and more for code that
| takes _all_ its dependencies in the function/method signature,
| even if there's 20 of them; at least then we wouldn't be lying
| about how complex the code's getting...
| wereHamster wrote:
| It doesn't have to be 9+ separate arguments, in some languages
| it can be a single 'context' or 'env' object that contains just
| what the handler needs, something like `handleHello({ db,
| cache, blobStore, authz }, req, res)`. That way, if two
| handlers use the exact same context you can reuse, but it's
| also easy enough to declare a per-handler context at the call
| site.
| topicseed wrote:
| I've always had my handlers individually set up as a struct each
| with a method to handle the route/request.
|
|     type CreateUser struct {
|         store  storage.Store
|         cache  caching.Cache
|         logger logging.Logger
|         pub    events.Publisher
|         // etc
|     }
|
|     func (op *CreateUser) ServeHTTP(ctx, req, rw) {}
|
|     // or if you have custom handlers
|
|     func (op *CreateUser) ServeHTTP(ctx, input) (output, error) {}
|
| And in my main.go, or where I set up my dependencies, I create
| each operation, passing it its specific dependencies. I love
| that because I can keep all the helper methods for that
| specific operation/handler on that specific struct as private
| methods.
|
| It does get tedious when you have one operation needing
| another, as you might start passing these around or you extract
| that into its own package/service.
| lelandbatey wrote:
| This is kinda missing the point; each handler needs a lot of
| deps to do its job, and the most obvious place to put them
| is in the parameters of the function. That is what I want. I
| do not want more indirection for aesthetics; I want clarity,
| even if it's brutal clarity.
|
| Whether all the deps are in the method receiver (the parent
| struct) or in a struct that's a param; it's all just more
| indirection to hide all the "stuff" that we need cause we
| think it's ugly. I dream of a world where we don't do that.
| topicseed wrote:
| You do have to instantiate that struct, and you can do it
| with.... a beautiful NewCreateUser(dep1, dep2, dep3, ...,
| dep20) *CreateUser {...}. This is essentially what he
| recommends with his "func newMiddleware() func(h
| http.Handler) http.Handler".
| lelandbatey wrote:
| I'm pointing out that this is basically "passing all the
| deps at once" with extra steps but no functional benefit;
| they are at best aesthetic, at worst confusing.
|
| I'd like a world that sacrifices a bit of aesthetics in
| order to erase ambiguity or confusion. So instead of
| putting your deps in a struct that's a param, or putting
| your deps in a parent closure, I'd like to put them in
| the function params.
|
| Though I will admit that if I had to choose, I'd use (and
| have used) the closure approach most often.
| jbmsf wrote:
| I don't write go, but I like these patterns. Feels fairly
| universal for testable code.
|
| I never want to see another (esp. Python) Quick Start guide that
| treats dependencies as implicit/static/untestable.
| matt_callmann wrote:
| is there a git repo with example code?
| mtlynch wrote:
| Not OP, but I design my Go projects with a very similar pattern
| that I learned from OP's 2018 post.
|
| I think this is a pretty good example of a real-world
| implementation:
|
| https://github.com/mtlynch/picoshare
|
| Particularly these files:
|
| https://github.com/mtlynch/picoshare/blob/2cd9979dab084ca781...
|
| https://github.com/mtlynch/picoshare/blob/2cd9979dab084ca781...
| karolist wrote:
| out of curiosity, why no sort-of-established pkg and internal
| dirs? What do you think of
| https://github.com/photoprism/photoprism structure?
| mtlynch wrote:
| I'm not familiar with that package structure,
| unfortunately. It might be good, but I'm not sure what the
| reasons are for structuring the project that way.
| mtlynch wrote:
| I really like Mat Ryer's work, and I've applied most of the ideas
| in the 2018 version of this article to all of my Go projects
| since then.
|
| The one weak spot for me is this aspect:
|
| > _NewServer is a big constructor that takes in all dependencies
| as arguments... In test cases that don't need all of the
| dependencies, I pass in nil as a signal that it won't be used._
|
| This has always felt wrong to me, but I've never been able to
| figure out a better solution.
|
| It means that a huge chunk of your code has a huge amount of
| unnecessary shared state.
|
| I often end up writing HTTP handlers that only need access to a
| tiny amount of the shared state. Like the HTTP handler needs to
| check if the requesting user has access to a resource, and then
| it needs to call one function on the datastore.
|
| I'd love to write tests where I only mock out those two methods,
| but I can't write simple tests because the handler is part of
| this giant glob where it has access to _all_ of the datastore and
| every object the parent server has access to because it's all
| one giant object.
|
| Nothing against Mat Ryer, as his pattern is the best I've found,
| but I still feel like there's some better solution out there.
| jxramos wrote:
| I've become increasingly sensitive to these high afferent
| coupling points in the repos I work on, especially the deeper I
| embed into the world of bazel and how dependency management and
| physical design influence the code I author.
|
| Where possible plugins are a great strategy to lay down these
| code seam points that don't force all possibilities upon some
| body of code, because fundamentally with plugin architectures
| you pick and choose what you want. Plugins are opt out by
| default, you must explicitly opt into a plugin for it to
| manifest. I've been calling software that has this quality
| "a la carte" style.
|
| But in general you do what you need to do to avoid "doing
| everything so you can do anything".
| zer00eyz wrote:
| I tend to write most of my logic in packages... so a "users"
| package or a "comments" package (if we were building HN). These
| have NO http interface! They do however each have their own
| "main" and some sort of CLI interface: "//go:build ignore" in
| the comment of that file is your friend.
| vjerancrnjak wrote:
| It means the object created by NewServer is dealing with too
| much. Probably has too many data types coupled to it and too
| much behavior.
|
| Simple example is adding a logger. If you add it as a
| dependency to the constructor, the object starts doing a bit
| more than initial simple implementation. It's fine to do it,
| but shame to not figure out how to log without editing the
| implementation of a simple thing.
|
| Higher order functions (a logger decorator) get there to allow
| composition, but even they have their drawbacks.
|
| It's still some form of structure that you can deal with, not a
| mistake.
| oogali wrote:
| I agree that too many arguments to the constructor may have
| the smell of too much coupling.
|
| But if I really feel I can't avoid the need to pass a good
| amount of external context, I create a dedicated "options"
| struct and pass that into the constructor as a pointer. The
| purpose of the pointer (rather than pass by value) is if I
| want default arguments, I can pass nil.
|     type ServerOptions struct {
|         logger    *magic.Logger
|         secretKey string
|     }
|
|     func NewServer(options *ServerOptions) (*Server, error) {
|         ...
|     }
| taberiand wrote:
| As you say, having a logger attached is one of those
| pragmatic and acceptable exceptions to the rule. In a perfect
| world we'd have the time to go to the trouble of implementing
| loggable types and data flows and associated higher order
| functions, in practice taking the compromise means getting
| the real business valuable work completed while still having
| the necessary (but usually "low priority") non-functional
| requirements like logging and metrics implemented.
| j1elo wrote:
| I think a useful iteration of that pattern is one called
| _Functional Options_ :
|
| * https://dave.cheney.net/2014/10/17/functional-options-for-fr...
|
| * https://github.com/uber-go/guide/blob/master/style.md#functi...
| hayst4ck wrote:
| > It means that a huge chunk of your code has a huge amount of
| unnecessary shared state.
|
| Can you explain that a little more?
|
| Which chunk of code has what shared state, and why is it
| unnecessary?
| strawhatguy wrote:
| Basically, not all the handlers will use every dependency the
| server (which is the entire program in this pattern) has. Not
| every handler will use a database, for example.
|
| While I may prefer a struct for this instead of separate
| arguments, I do agree it's useful to capture "the world" as
| the set of all dependencies, even if some handlers don't use
| them ( _yet_ ).
| qaq wrote:
| You can use Dependency Injection to solve this issue but in my
| view the added complexity is not really worth it.
| metaltyphoon wrote:
| Is this a Go thing? In C# land this is trivial.
| gabesullice wrote:
| I felt this way for a long time. And maybe I'm projecting my
| past struggles onto what you're describing. I shared my current
| approach in a different comment already [0]. The gist is that I
| use an optional config struct, whose values get validated and
| copied over to my server struct inside NewServer. This makes
| testing much easier because I can mock fewer deps.
|
| FWIW, I really tried to make the functional option pattern work
| for me, as many others have suggested, but eventually abandoned
| it. I felt it was a little too clever and therefore difficult
| to read, while requiring more boilerplate than the config
| struct + validate and copy pattern.
|
| [0] https://news.ycombinator.com/item?id=39320170
| sethammons wrote:
| I like a lot of what they've done here. My testing looks a bit
| different however.
|
|     srv, err := newTestServer()
|     require.NoError(t, err)
|     defer srv.Close()
|
|     resp, err := http.Post(
|         fmt.Sprintf("http://localhost:%d/signup/json", srv.Port()),
|         "application/json",
|         strings.NewReader(`{"email":"test@example.com", "password": "p@55Word", "password_copy": "p@55Word"}`),
|     )
|
| In my newTestServer, I spin up a server with fakes for my
| dependencies. If I want to test a dependency error, I replace
| that property with a fake that will return an error. I can
| validate my error paths. I can validate my log entries. I can
| validate my metric emission. I can validate timeouts and graceful
| shutdowns.
|
| After the server starts, I inspect to determine which port it is
| running on (default is :0 so I have to wait to see what it got
| bound to).
|
| My "unit" tests can test at the handler level or the http level,
| making sure that I can fully test the code as the users of my
| system will see it, exercising all middleware or none. I can spin
| up N instances and run my tests in parallel.
| jopsen wrote:
| I've recently been playing with ogen:
| https://github.com/ogen-go/ogen
|
| Write openapi definition, it'll do routing, definition of
| structs, validation of JSON schemas, etc.
|
| All I need to do is implement the service.
|
| Validating an integer range for a querystring parameter is just
| too boring. And too easy to mistype when writing it manually.
|
| Anyways, so far only been playing, so haven't found the bad parts
| yet.
| dilyevsky wrote:
| The problem with this approach is that writing openapi by hand
| from scratch is an _incredibly_ tedious process. Writing Protobufs,
| capnproto or any similar IDL feels much more productive.
| xyzzy123 wrote:
| It's a bit icky but LLMs / copilot can speed up the creation
| of openapi specs a lot.
|
| Agree it doesn't fix the "root" problem that the overall
| syntax is not ergonomic.
| jopsen wrote:
| My point was that writing an openapi, or other IDL is
| faster than writing the code to manually do these things.
|
| And more accurate than LLMs.
|
| Feels like whenever an LLM could code it, you'd be better
| off not having the boilerplate code at all.
| flowardnut wrote:
| it lacks flexibility but i really enjoy grpc-gateway for 99%
| of my work
|
| https://github.com/grpc-ecosystem/grpc-gateway
| topicseed wrote:
| Or, if you're more into publishing an Openapi spec from your Go
| code, I do like danielgtaylor/huma[1] and swaggest/rest[2].
|
| [1] https://github.com/danielgtaylor/huma
|
| [2] https://github.com/swaggest/rest
| perbu wrote:
| I've started doing the same but with oapigen;
| github.com/deepmap/oapi-codegen
|
| I thought it would be boring writing the spec, but it was not
| nearly as bad as I thought. Also, a spec is needed, so might as
| well write it up front.
| zemo wrote:
| > func NewServer(... config *Config ...) http.Handler
|
| one of my biggest pet peeves is when people take a Config object,
| which represents the configuration of an entire system, and pass
| it around mutably. When you do that, you're coupling everything
| together through the config object. I've worked on systems where
| you had to configure the parts in a specific order in order for
| things to work, because someone decided to write back to the
| config object when it was passed to them. Or another case was
| where I've seen it such that you couldn't disable a portion of
| the system because it wrote data into the config object that was
| read by some other subsystem later. The pattern of "your
| configuration is one big value, which is mutable" is one of the
| more annoying patterns that I've seen before, both in Go and in
| other languages.
| doh wrote:
| I think that's a valid criticism. What do you think would be a
| more ergonomic pattern?
| Raynos wrote:
| I wrote a static config class that reads configuration for
| the entire app / server from a JSON or YAML file
| ( https://github.com/uber/zanzibar/blob/master/runtime/static_... ).
|
| Once you've loaded it and mutated it for testing purposes or
| for copying from ENV vars into the config, you can then
| freeze it before passing it down to all your app level code.
|
| Having this wrapper object that can be frozen and has a
| `get()` method to read JSON-like data makes it effectively not
| mutable.
| doh wrote:
| I use a similar pattern myself. Was curious if the OP is
| using some other, like for instance splitting the struct
| into two (im/mutable) and then passing them around, or
| what.
|
| BTW kudos on zanzibar. Love the tech and the code).
| zemo wrote:
| I just use a struct literal, and then I have the type define
| a `func (t *Thing) ready() error { ... }` method and call the
| ready method to check that it's valid. I prefer this over
| self-referential options, the builder pattern, supplying a
| secondary config object as a parameter to a constructor, etc.
| gabesullice wrote:
| Not the OP, but I mitigate the issue rather than use a
| different pattern. Like so:
|
|     type Server struct { val bool }
|
|     type Config struct { Val bool }
|
|     func NewServer(... config *Config ...) http.Handler {
|         if config == nil {
|             config = &Config{}
|         }
|         return &Server{val: config.Val}
|     }
|
| It took me a long time to settle on this pattern and I admit
| it's tedious to copy configuration over to the server struct,
| but I've found that it ends up being the least verbose and most
| maintainable long term while making sure callers can't mutate
| config after the fact.
|
| I can pass nil to NewServer to say "just the usual, please",
| customize everything, or surgically change a single option.
|
| It's also useful for maintaining backwards compatibility. I'm
| free to refactor config on my server struct and "upgrade"
| deprecated config arguments inside my NewServer function.
| tejinderss wrote:
| The keyword here is "mutable" config object and not config data
| object in general. I use immutable config dataclass liberally
| in one of my python projects and i pass it around in all
| modules. Many functions rely on multiple values and instead of
| passing all of them as function parameters (which requires
| their own function typings), the dataclass has all variables
| with typing definitions in one place; it's a pretty handy design
| pattern.
| gloryjulio wrote:
| I agree. We ran into a sev by changing the top-level config
| object before. You DO NOT want to modify it. The wasted
| man-hours are not worth it. You will never know where or how it
| gets used. If you make changes it's better to derive from it
| instead.
|
| Update: What's funny was, in our design the config object was
| kinda immutable. You have to use the WARNING_DO_NOT_USE api to
| make modification. We did mutate the object and we caused a sev
| fnordlord wrote:
| I've tended to create a Config struct for each package and then
| a configs.Config struct that's just made up of each package's
| Config. It might not be a Go best practice but I like that I
| can setup the entire system's configuration on startup as one
| entity but then I only pass in the minimally required
| dependencies for each package. It also makes testing a little
| easier because I don't have to fake out the entire
| configuration for testing one package.
| MrDarcy wrote:
| My favorite way to prevent this is to make the config truly
| immutable, but still configurable with something like this:
|     package config
|
|     type options struct {
|         name string
|     }
|
|     type Option func(o *options)
|
|     func Name(name string) Option {
|         return func(o *options) { o.name = name }
|     }
|
|     type Config struct {
|         opts *options
|     }
|
|     func New(opts ...Option) *Config {
|         o := &options{}
|         for _, option := range opts {
|             option(o)
|         }
|         return &Config{opts: o}
|     }
|
|     func (c *Config) Name() string { return c.opts.name }
|
| Use it with:
|
|     cfg := config.New(config.Name("Emanon"))
|     fmt.Println(cfg.Name())
| zemo wrote:
| I used that pattern for a while but stopped using it. I first
| encountered it from this blog post:
| https://commandcenter.blogspot.com/2014/01/self-referential-...
|
| It's a lot of boilerplate to create something that's not
| actually immutable. It also makes it harder to figure out
| which options are available, since now you can't just look at
| the documentation of the type, you have to look at the whole
| module package to figure out what the various options are. If
| one of the fields is a slice or map you can just mutate that
| slice or map in place, so it's not really immutable. The
| pattern as Pike describes it has the benefit that supplying
| an option returns an option that reverses the effect of
| supplying the option so that you can use the options somewhat
| like Python context objects that have enter and exit
| semantics, but in practice I've found that to be useful in a
| small portion of situations.
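zemo's point about slices and maps can be illustrated with a small sketch. One common mitigation, shown here as an assumption rather than anything proposed in the thread, is to copy reference-typed fields on the way in and on the way out; the `Config` type and its fields are hypothetical.

```go
package main

import "fmt"

// Config is a hypothetical config type. Its fields are unexported, so
// code in other packages can only read it through the accessors.
type Config struct {
	name  string
	hosts []string
}

// New copies the slice so later changes by the caller do not leak in.
func New(name string, hosts []string) Config {
	return Config{name: name, hosts: append([]string(nil), hosts...)}
}

func (c Config) Name() string { return c.name }

// Hosts returns a copy, so readers cannot mutate the stored slice.
func (c Config) Hosts() []string { return append([]string(nil), c.hosts...) }

func main() {
	in := []string{"a.example", "b.example"}
	cfg := New("demo", in)

	in[0] = "changed-by-caller"          // does not affect cfg
	cfg.Hosts()[1] = "changed-by-reader" // modifies a throwaway copy

	fmt.Println(cfg.Name(), cfg.Hosts()) // prints "demo [a.example b.example]"
}
```

Without the two copies, both assignments in `main` would silently change the config seen by every other holder, which is the coupling described earlier in this sub-thread. The copies cost an allocation per read, so this is a trade-off rather than a free win.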
| kubanczyk wrote:
| > you can't just look at the documentation of the type
|
| Sure you can. Option func is a constructor for option type,
| and constructors are auto-included above methods in the
| docs.
|
| PLS completion works for them as well.
| zemo wrote:
| The options for the thing being constructed are all
| separate types from the thing being constructed; the
| options aren't a facet of the definition of the type they
| mutate.
| kubanczyk wrote:
| I'm saying:
|
| - main constructor is easily available from the main
| type's docs,
|
| - option type is easily available from the main
| constructor's docs,
|
| - all option funcs are easily available from the option
| type's docs (because in fact these option funcs are
| constructors for the option type).
|
| Excerpt from grpc godoc index:
|
|     type Server
|         func NewServer(opt ...ServerOption) *Server
|         ...
|     ...
|     type ServerOption
|         func ChainStreamInterceptor(interceptors ...StreamServerInterceptor) ServerOption
|         func ChainUnaryInterceptor(interceptors ...UnaryServerInterceptor) ServerOption
|         func ConnectionTimeout(d time.Duration) ServerOption
|         func Creds(c credentials.TransportCredentials) ServerOption
|         etc...
|
| One more hop compared to a flat argument list, that's
| true. But if you only commonly use maybe 0-5 arguments
| out of 30-50 available, it does not look like a bad deal.
| zemo wrote:
| I'm not confused about what it is, I just don't like it.
| It's a lot of ceremony for very little gain.
| mtlynch wrote:
| > _The Valid method takes a context (which is optional but has
| been useful for me in the past) and returns a map. If there is a
| problem with a field, its name is used as the key, and a human-
| readable explanation of the issue is set as the value._
|
| I used to do this, but ever since reading Lexi Lambda's "Parse,
| Don't Validate," [0] I've found validators to be much more error-
| prone than leveraging Go's built-in type checker.
|
| For example, imagine you wanted to defend against the user
| picking an illegal username. Like you want to make sure the user
| can't ever specify a username with angle brackets in it.
|
| With the Validator approach, you have to remember to call the
| validator on 100% of code paths where the username value comes
| from an untrusted source.
|
| Instead of using a validator, you can do this:
|     type Username struct {
|         value string
|     }
|
|     func NewUsername(username string) (Username, error) {
|         // Validate the username adheres to our schema.
|         ...
|         return Username{username}, nil
|     }
|
| That guarantees that you can never forget to validate the
| username through any codepath. If you have a Username object, you
| know that it was validated because there was no other way to
| create the object.
|
| [0] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...
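A runnable version of the idea above might look like this sketch. The specific rules (non-empty, no angle brackets) and the `greet` function are assumptions chosen to match the example in the comment, not a real schema.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Username wraps a string that has passed validation. The field is
// unexported, so other packages must go through NewUsername.
type Username struct{ value string }

// NewUsername parses untrusted input into a Username or reports why
// the input is not acceptable.
func NewUsername(raw string) (Username, error) {
	if raw == "" {
		return Username{}, errors.New("username is empty")
	}
	if strings.ContainsAny(raw, "<>") {
		return Username{}, errors.New("username contains angle brackets")
	}
	return Username{value: raw}, nil
}

func (u Username) String() string { return u.value }

// greet accepts only the parsed type, so it cannot be handed raw input.
func greet(u Username) string { return "welcome, " + u.String() }

func main() {
	for _, raw := range []string{"alice", "<script>"} {
		u, err := NewUsername(raw)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println(greet(u))
	}
}
```

Two caveats worth keeping in mind: code inside the same package can still write `Username{value: "..."}` directly, and the zero value `Username{}` exists everywhere, so the guarantee is strongest when the type lives in its own small package and callers treat the zero value as invalid.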
| oorza wrote:
| Conceptually equivalent to the ancient arts of private
| constructors and factory methods.
| Cthulhu_ wrote:
| Which (in Java) were then abstracted away in... interesting
| annotations.
| xboxnolifes wrote:
| Crazy that actually using your type system leads to better
| code. Stop passing everything around as `string`. Parse them,
| and type them.
| stickfigure wrote:
| There's a name for this anti-pattern: "Stringly typed"
| timcobb wrote:
| Bash :(
| nsguy wrote:
| JSON
| tdeck wrote:
| TCL
| tdeck wrote:
| This term is typically used to refer to things like data
| structures and numerical values all being passed as
| strings. I don't think a reasonable person would consider
| storing a username in a string to be "stringly typed".
| samatman wrote:
| The One True Wiki[0] says "Used to describe an
| implementation that needlessly relies on strings when
| programmer & refactor friendly options are available."
|
| Which is exactly what's going on here. A username has a
| string as a payload, but that payload has restrictions
| (not every string will do) and methods which expect a
| username should get a username, not any old string.
|
| [0]: https://wiki.c2.com/?StringlyTyped
| tdeck wrote:
| I don't agree that this example is more "programmer
| friendly". Anything you want to do with the username
| other than null check and passing an argument is going to
| be based directly on the string representation. Insert
| into a database? String. Display in a UI? String.
| Compare? String comparison. Sort? String sort. Is it
| really more "programmer friendly" to create wrapper types
| for individual strings all over your codebase that need
| to have passthrough methods for all the common string
| methods? One could argue that it's worth the tradeoff but
| this C2 definition is far from helpful in setting a clear
| boundary.
|
| Meanwhile the real world usages of this term I've seen in
| the past have all been things like enums as strings,
| lists as strings, numbers as strings, etc... Not
| arbitrary textual inputs from the user.
| cwilkes wrote:
| You inherit some code. Is that string a username or a
| phone number? Who knows. Someone accidentally swapped two
| parameter values. Now the phone number is a username and
| you've got a headache of trying to figure out what's
| wrong.
|
| By having stronger types this won't come up as a problem.
| You don't have to rely on having the best programmers in
| the world that never make mistakes (tm) to be on your
| team and instead rely on the computer making guard rails
| for you so you can't screw up minor things like that.
| quickthrower2 wrote:
| I agree on the one hand but empirically I don't think I
| have seen a bug where the problem was the string for X
| ended up being used as Y. Probably because the
| variable/field names do enough heavy lifting. But if your
| language makes it easy to wrap I say why not. It might
| aid readability and maybe avoid a bug.
|
| I would probably type to the level of Url, Email, Name
| but not PersonProfileTwitterLink.
| baq wrote:
| I've refactored a large js code base into ts. Found one
| such bug for every ~2kloc. The obvious ones are found
| quickly in untyped code, the problem is in rare cases
| where you e.g. check truthiness on something that ends up
| always true.
| antonvs wrote:
| It definitely is stringly typed. It's just that it's a
| very normalized example of it, that people don't think of
| as being an antipattern.
|
| If you want to implement what Yaron Minsky described as
| "make illegal states unrepresentable", then you use a
| username type, not a string. That rules out multiple
| entire classes of illegal states.
|
| If you do that, then when you compile your program, the
| typechecker can provide a much stronger correctness
| proof, for more properties. It allows you to do "static
| debugging" effectively, where you debug your code before
| it ever even runs.
| wruza wrote:
| I don't get what you're about. The root comment clearly
| presents a structure of a separate type. The fact that it
| happens to contain a single string field is completely
| irrelevant (what type an _actual_ username should be, a
| float?). "Stringly typed" is about stringifying non-
| string values to save typing work and is not applicable
| here in the slightest.
| quickthrower2 wrote:
| I wasn't sure who was right. I'll tie break with
| https://wiki.c2.com/?StringlyTyped= which pretty much
| says what you just said
| zeroCalories wrote:
| I've also seen it called primitive obsession, which is also
| applicable to other primitive types like using an integer
| in situations where an enum would be better.
| skitter wrote:
| Sadly enums are too advanced of a concept to be included
| in Go.
| eru wrote:
| See https://eli.thegreenplace.net/2018/go-and-algebraic-data-typ...
| for more background.
| fbdab103 wrote:
| Definitely used to fall for primitive obsession. It seemed
| so silly to wrap objects in an intermediary type.
|
| After playing with Rust, I changed my tune. The type
| system just forces you into the correct path, that a lot
| of code became boring because you no longer had to second
| guess what-if scenarios.
| xboxnolifes wrote:
| > Definitely used to fall for primitive obsession. It
| seemed so silly to wrap objects in an intermediary type.
|
| A lot of languages certainly don't make it easy. You
| shouldn't have to make a Username struct/class with a
| string field to have a typed username. You should be able
| to declare a type Username which is just a string under
| the hood, but with different associated functions.
| zeroCalories wrote:
| Yeah, modern type systems are game changers. I've soured
| on Rust, but if Go had the full Ocaml type system with
| match statements I think it would be the perfect
| language.
| kaba0 wrote:
| Go would need such a revamp to be anywhere close to a
| decent language, that it would be just a straight up
| other language.
| munk-a wrote:
| As a PHP developer I am frankly disappointed you think that
| we only do that with strings. I've got an array[1] full of
| other tools.
|
| 1. Or maybe a map? Those keys might have significance I
| didn't tell you about.
| xboxnolifes wrote:
| I originally typed out `int` and wanted to do more, but I
| try to keep my comments as targeted as possible to avoid
| the common reply pattern of derailing a topic by commenting
| on the smallest and least important part of it. If I type
| `string`, `int`, `arrays`, `maps`, `enums`... someone will
| write 3 paragraphs about how enums are actually an adequate
| usage of the type system, and everyone will focus on that
| instead of the overarching message.
| tuyguntn wrote:
| things have different costs.
|
| Types limit you from making some mistakes, but it also
| impacts your extensibility. Imagine an enum with 4 values and
| you want to add 1 because, 10 levels deep, one of the services
| needs a new value. How does it usually go with strongly typed
| languages? You go and update all services until the new value is
| properly propagated to the lowest level that actually needs that
| value.
|
| Now imagine doing same with strings, you can validate at the
| lowest level, upper levels just pass value as it is. If upper
| layers have conditionals based on value, they still can limit
| their logic to those values
| ndriscoll wrote:
| Why would you need to update code that isn't matching on
| the value? It just knows it has an X and passes it to a
| function that needs an X.
| tuyguntn wrote:
| if you don't update the code in intermediate layers, some
| automated validation based on enum values will fail,
| which also drops the request
| ndriscoll wrote:
| You only need to update the parser and the places that
| are using it. Depending on language, the parser might
| update itself (Scala generally works this way). Everyone
| else has an already parsed value that they're just
| passing around. That's the point: only run your
| validation at the outer layer of your application.
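|
| A minimal Go sketch of the "parse once at the outer layer" idea from
| this exchange. The Plan type, its values, and the handle/price
| functions are hypothetical names chosen for illustration; they do not
| come from the thread.

```go
package main

import "fmt"

// Plan is a hypothetical enum-like type; the name and values are
// illustrative assumptions.
type Plan string

const (
	PlanFree Plan = "free"
	PlanPro  Plan = "pro"
)

// ParsePlan is the only place that knows which raw strings are valid.
func ParsePlan(raw string) (Plan, error) {
	switch p := Plan(raw); p {
	case PlanFree, PlanPro:
		return p, nil
	default:
		return "", fmt.Errorf("unknown plan %q", raw)
	}
}

// handle is an intermediate layer: it passes the Plan along without
// inspecting it, so adding a new Plan value does not touch this code.
func handle(p Plan) string {
	return price(p)
}

// price is the lowest layer, the only one that matches on the value.
func price(p Plan) string {
	switch p {
	case PlanPro:
		return "paid"
	default:
		return "free of charge"
	}
}

func main() {
	p, err := ParsePlan("pro") // validate once, at the outer edge
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println(handle(p))
}
```

| In this sketch, adding a third plan means touching ParsePlan and
| price only; handle keeps compiling unchanged because it never
| matches on the value.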
| asp_hornet wrote:
| Now what? the username is in an unexported field and unusable?
| I can kind of see what it's going for but it seems like a way
| just to add another layer of wrapping and indirection.
| pants2 wrote:
| It would need a getter here. Probably good to keep it
| immutable, if you want guarantees that it will never be
| changed to something that violates the username rules.
| asp_hornet wrote:
| > need a getter
|
| Yeah, that's what I figured. I'm not sure if I want the
| tradeoff of calling .GetValue in multiple places just to
| save calling validate in maybe 2 or 3 places.
|
| Not to mention I can't easily marshal/unmarshal into it and
| next week a valid username is a username that doesn't already
| exist in the database.
|
| Maybe this approach appeals to people and I'm hesitant to
| say "that's not how Go is supposed to be written" but for
| me this feels like "clever over clear".
| klodolph wrote:
| > Yeah, that's what I figured. I'm not sure if I want the
| tradeoff of calling .GetValue in multiple places just to
| save calling validate in maybe 2 or 3 places.
|
| The tradeoff is not that you save calling validate, it's
| that you avoid _forgetting_ to call validate in the first
| place, because when you forget to validate, you get a
| type error.
|
| IMO it's a little more clear this way:
|
|     type Ticket struct {
|         requestor Username
|         assignee  Username
|     }
|
| It lets you write code that is a little more obvious.
| asp_hornet wrote:
| I'm not sure I understand. In your example you've grouped
| related data in a struct and validating that it matches
| your system's invariants, that feels good to me.
|
| The original example was more "wrap a simple type in an
| object so it's always validated when set" which looks
| beautiful when you don't have the needed getters in the
| example nor show all the Get call sites opposed to the 1
| or 2 New call sites. All in the name of "we don't want to
| set the username without validation" but without private
| constructors Username{"invalid"} can be invoked, the
| validation circumvented and I'm not convinced the
| overhead we paid was worth it.
| Sakos wrote:
| The countless bugs I've had to deal with and all the time
| I've lost fixing these bugs caused by people who forgot
| to validate data in a certain place or didn't realize
| they had to do so proves to me that the overhead of
| calling a get on a wrapper type is totally worth it.
|
| I value the hours wasted on diagnosing a bug far more
| than the extra keystrokes and couple of seconds required
| to avoid it in the first place.
| asp_hornet wrote:
| No, you've achieved an illusion of that as now you're
| spending hours discovering where a developer
| forgot to call NewUsername and instead called
| Username{"broken"}. I can't see the value in this
| abstraction in Go.
| iampims wrote:
| They can't because value is not exported. They must use
| the NewUsername function, which forces the validation.
|
| In my opinion, this pattern breaks when the validation
| must return an error and everything becomes very verbose.
| asp_hornet wrote:
| Oh, that's true about it being unexported. I hadn't
| considered that.
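|
| A minimal sketch of the pattern this subthread is arguing about: a
| struct with an unexported field, a validating constructor, and a
| getter. The validation rules are assumptions made for illustration.
| The sketch is a single file, so the package boundary that makes the
| field inaccessible is only described in comments, not enforced.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Username wraps a validated string. In a real project it would live in
// its own package so that callers cannot set the unexported field.
type Username struct {
	value string
}

// NewUsername is the only intended way to build a Username.
// The rules (non-empty, no '<') are illustrative assumptions.
func NewUsername(raw string) (Username, error) {
	if raw == "" {
		return Username{}, errors.New("username cannot be empty")
	}
	if strings.Contains(raw, "<") {
		return Username{}, errors.New("username may not contain '<'")
	}
	return Username{value: raw}, nil
}

// String is the getter; it also makes Username satisfy fmt.Stringer.
func (u Username) String() string { return u.value }

func main() {
	u, err := NewUsername("gopher")
	fmt.Println(u, err)

	_, err = NewUsername("<script>")
	fmt.Println(err)
}
```

| From another package, Username{"broken"} does not compile because
| the field is unexported, but the empty literal Username{} still
| does, which is the zero-value gap raised elsewhere in the thread.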
| cogman10 wrote:
| The issue is DRY often comes to wreck this sort of thing. Some
| devs will see "Hmm, Username is exactly the same as just a
| string so let's just use a string as Username is just added
| complexity".
|
| I've tried it with constructs like `Data` and `ValidatedData`
| and it definitely works, but you do end up with duplicate
| fields between the two objects or worse an ever growing
| inheritance tree and fields unrelated to either object shared
| by both.
|
| For example, consider data looking like
|
|     Data { value string }
|
| and ValidatedData looking like
|
|     ValidatedData { value int }
|
| There's a mighty temptation for some devs to want to apply DRY
| and zip these two things together. Unfortunately, that can
| really be messy on these sorts of type changes and the where of
| where validation needs to happen gets muddled.
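|
| A rough Go sketch of keeping the two types separate, assuming the
| validation step is "the string must parse as an integer". The
| Validate and double functions are hypothetical names added for
| illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// Data is the raw, unvalidated input as it arrives.
type Data struct {
	value string
}

// ValidatedData is the parsed form; note that the field changes type.
type ValidatedData struct {
	value int
}

// Validate is a hypothetical parsing step that converts one type into
// the other and is the single place where validation happens.
func Validate(d Data) (ValidatedData, error) {
	n, err := strconv.Atoi(d.value)
	if err != nil {
		return ValidatedData{}, fmt.Errorf("value %q is not an integer: %w", d.value, err)
	}
	return ValidatedData{value: n}, nil
}

// double only accepts validated input, so it never re-checks the value.
func double(v ValidatedData) int { return v.value * 2 }

func main() {
	v, err := Validate(Data{value: "21"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(double(v))
}
```

| Merging the two structs into one would force the field to hold both
| the raw and the parsed form, which is where the "where does
| validation happen" question gets muddled.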
| xboxnolifes wrote:
| Except Username is _not_ exactly the same as string, and
| that's important. Username is a _subset_ of string. If they
| were equivalent, we wouldn't need to parse/validate.
|
| The often misinterpreted part of DRY is conflating "these are
| the same words, so they are the same", with "these are the
| same concept, so they are the same". A Username and a String
| are conceptually different.
| cogman10 wrote:
| DRY is just "Do not repeat yourself". And a LOT of devs
| take that literally. It's not "Do not repeat concepts"
| (which is what it SHOULD be but DRC isn't a fun acronym).
|
| Unfortunately "This is the same character string" is all a
| DRY purist needs to start messing up the code base.
|
| I honestly believe that "DRY" is an anti-pattern because of
| how often I see this exact behavior trotted out or
| espoused. It's a cargo cult thing to some devs.
| oorza wrote:
| That's why I like to tell people to always remember to
| stay MOIST - the Most Optimal is Implicitly the Simplest
| Thing.
|
| When you add complexity to DRY out your code, you're
| adding a readability regression. DRY matters in very few
| contexts beyond readability, and simplicity and low
| cognitive load need to be in charge. Everything else you
| do code-style wise should be in service of those two
| things.
| HeavyStorm wrote:
| DRY has nothing to do with readability. The fact that it
| might help with it is purely coincidental.
|
| DRY is about maintainability - if you repeat rules
| (behavior) around the system and someone comes along and
| changes it, how can you be sure it affected all the
| system coherently?
|
| I've seen this in practice: we get a demand from the PO,
| a more recent hire goes to make the change, the use case
| of interest to the PO gets accepted. A week later we have
| a bug on production because a different code path is
| still relying on the old rule.
| oorza wrote:
| Maintainability and readability are two sides to the same
| coin. It's not exactly rocket science to cook up an
| example situation where making a change in one place is
| less maintainable than making it in two, because of
| overly DRY, overly abstracted nonsense leading to a
| _single_ place to change that's so far removed from where
| you'd expect it to be that it takes much longer and is
| much more fraught with risk than just having to do it
| twice.
|
| Doing something twice is not an anathema, that's my
| point, not when doing it twice is a cognitively easier
| and practically faster task.
|
| In almost every case, bugs are the result of human error,
| and keeping cognitive load as low as possible reduces the
| likelihood of human error in all cases. As DRY as
| possible is very rarely the lowest cognitive load
| possible.
| hombre_fatal wrote:
| This seems less about DRY and more a story about a
| hypothetical junior dev making a dumb mistake
| masquerading as commentary about "DRY purism".
| cogman10 wrote:
| Man I wish it was just jr devs. I cut jrs a ton of slack,
| they don't know any better. However, it's the seniors
| with the quick quips that are the biggest issue I run
| into. Or perhaps senior devs with jr mentalities
| ikiris wrote:
| most srs are just jrs with inflated egos and titles
| sroussey wrote:
| Like everything, it depends is the right answer.
| piva00 wrote:
| In my experience (~20 years) with software development I
| developed the belief that people will go through the path
| of applying patterns, techniques, architectures, good
| practices, first as dogma, then to rejection, ending in
| acceptance of the knowledge that almost _all_ of software
| development patterns/best practices are mostly good
| heuristics, which require experience to apply correctly
| and know when to break or bend the rules.
|
| DRY applied as a dogma will eventually fail, because it's
| not a verified mathematical proof of infallible code,
| it's just a practice that gives good results inside its
| constraints, people just don't learn the constraints
| until it explodes in their faces a few times.
|
| Like any wisdom, it's hard for it to be received and
| understood without the rite of passage of experience.
| Intermernet wrote:
| DRY vs premature optimisation is the landscape most long
| term devs find themselves in. You can say that FP, OO and
| a bunch of other paradigms affect this, but eventually
| you need to repeat yourself. The key is to determine when
| this happens without spending too much time determining
| when this happens.
| devjab wrote:
| One of the major issues with a lot of the outdated concepts
| in programming is that we still teach them to young people. I
| work a side gig as an external examiner for CS students.
| Especially in the early years they are taught the same OOP
| content that I was taught some decades ago, stuff that I
| haven't used (also) for some decades. Because while a lot of
| the concepts may work well in theory, they never work out in
| a world where programmers have to write code on a Thursday
| afternoon after a terrible week.
|
| It's almost always better to repeat code. It's obviously not
| something that is completely black and white, even if I
| prefer to never really do any form of inheritance or
| mutability, it's not like I wouldn't want you to create a
| "base" class with "created by" "updated by" and so on for
| your data classes and if you have some functions that do
| universal stuff for you and never change, then by all means
| use them in different places. But for the most part,
| repeating code will keep your code much cleaner. Maybe not
| today or the next month, but five years down the line nobody
| is going to want to touch that shared code which is now so
| complicated you may as well close your business before you
| let anyone touch it. Again, not because the theoretical
| concepts that lead to this are necessarily flawed, but
| because they require too much "correctness" to be useful.
|
| Academia hasn't really caught on though. I still grade first
| semester students who have the whole "Animal" -> "duck",
| "dog", "cat" or whatever they use into their heads as the
| "correct way" to do things. Similar to how they are often
| taught other processes than agile, but are taught that agile
| is the "only" way, even though we've seen just how wrong that
| is.
|
| I'm not sure what we can really do about it. I've always
| championed strongly opinionated dev setups where I work. Some
| of the things we've done, and are going to do, aren't going
| to be great, but what we try to do is to build an environment
| where it's as easy as possible for every developer to build
| code the most maintainable way. We want to help them get
| there, even when it's 15:45 on a Thursday that has been full
| of shit meetings in a week that's been full of screaming
| children and an angry spouse and a car that exploded. And
| things like DRY just aren't useful.
| shrimp_emoji wrote:
| > _It's almost always better to repeat code._
|
| God no. Stop the copy pasta disease! It's horrible,
| mindless programming.
|
| When reviewing code, I'm astonished anything was
| accomplished by copy pasting so much old code (complete
| with bugs and comment typos).
|
| Incidentally, OOP encourages you to copy a lot. It's just
| an engine for generating code bloat. Want to serialize some
| objects? Here's your Object serializer and your overloaded
| Car serialize and your overloaded Boat serializer, with
| only a few different fields to justify the difference!
|
| OOP is bad. Copy pasta is bad. DRY is good. All hail DRY,
| forever, at any cost.
| jddj wrote:
| Countless man-centuries have been lost looking for the
| perfect abstraction to cover two (or an imagined future
| with two) cases which look deceptively similar, then
| teasing them apart again.
| zerbinxx wrote:
| OOP and Dry are compatible! I've actually done the thing
| that the above commenter suggests - create a base object
| with created on/by so that I never have to think about
| it. Whether or not you actually care about that, if you
| implement a descendant of that object you're going to get
| some stuff for free, and you're gonna like it!
| hooverd wrote:
| For what it's worth, I've always had an easier time
| combining WET code than untangling the knot that is too
| DRY code. Too little abstraction and you might have to
| read some extra code to understand it. Too much
| abstraction and no one other than the writer, and even
| then, may ever understand it.
| kcrwfrd_ wrote:
| It's a balancing act, but deletable code is often
| preferable to purely-DRY-for-the-sake-of-DRY, overly
| abstracted code.
| HeavyStorm wrote:
| Yeah, no. Not at all. I imagine that you are taking DRY
| quite literally, and critiquing the most stupid use
| cases of it, like DRYing calls to Split with spaces to
| SplitBySpace.
|
| DRY's goal is to avoid defining behaviors in duplicate,
| which results in having multiple points in code to change when
| you need to modify said behavior. Code needs to be coherent
| to be "good", for a number of the different quality
| indicators.
|
| I'm doing a "side project" right now where I'm using a
| newcomer payment gateway. They certainly don't DRY stuff.
| Same field gets serialized with camel case and snake case
| in different API, and whole structures that represent the
| same concept are duplicated with slightly different fields.
| This probably means that Thursday 15.25 the dev checked-in
| her code happy because the reviewer never cared about DRY,
| and now I'm paying the price of maintaining four types of
| addresses in my code base.
| taberiand wrote:
| There's a mistake many junior devs (and sometimes mid and
| senior devs) make where they confuse hiding complexity with
| simplicity - using a string instead of a well defined domain
| type is a good example, there is a certain complexity of the
| domain expressed by the type that they don't want to think
| about too deeply so they replace with a string which
| superficially looks simpler but in fact hides all of the
| inherent complexity and nuance.
|
| It causes what I call the lumpy carpet syndrome - sweeping
| the complexity under the carpet causes bumps to randomly
| appear that when squashed tend to cause other bumps to pop up
| rather than actually solving the problem.
| Cthulhu_ wrote:
| Go now has generics, so I'm confident some smart fellow will
| apply DRY and make it a generic ValidatedData[type,
| validator] type struct, with a ValidatedDataFactory that
| applies the correct validator callback, and a
| ValidatorFactory that instantiates the validators based on a
| new validation rule DSL written in JSON or XML.
|
| ...Easy!
| skybrian wrote:
| This is a good design pattern, but be wary of doing validation
| too early. The design pattern allows you to do it as early or
| late as you like, but doesn't tell you when to do it. Often
| it's best to do it as part of parsing/validating some larger
| object.
|
| See Steven Witten's "I is for Intent" [1] for some ideas about
| the use of unvalidated data in a UI context.
|
| [1] https://acko.net/blog/i-is-for-intent/
| lolinder wrote:
| I read through that piece and strongly disagree with the
| premise that their insight is somehow at odds with leaning
| into the type system for correctness.
|
| The legitimate insight that they have is that anchoring the
| state as close as possible to the user input is valuable--I
| think that that is a great insight with a lot of good
| applications.
|
| However, there's nothing that says you can't take that user-
| centric state and put it in a strongly typed data structure
| as soon as possible, with a set of clearly defined and well-
| typed transitions mapping the user-centric state to the
| derived states.
|
| Edit: looks like there was discussion on this the other day,
| with a number of people making similar observations--
| https://news.ycombinator.com/item?id=39269886
| skybrian wrote:
| A text file and an abstract syntax tree can both be
| rigorously represented using types, but one is before
| parsing and the other is after parsing. The question is which
| one is more suitable for editing?
|
| Text has more possible states than the equivalent AST, many
| of which are useful when you haven't typed in all the code
| yet. Incomplete code usually doesn't parse.
|
| This suggests that _drafts_ should be represented as text,
| not an AST.
|
| And maybe similarly for drafts of other things? Drafts will
| have some representation that follows some rules, but maybe
| they shouldn't have to follow all the rules. You may still
| want to save drafts and collaborate on them even though
| they break some rules.
|
| In a system that's _not_ an editor, though, maybe it makes
| sense to validate early. For a command-line utility, the
| editor is external, provided by the environment (a shell or
| the editor for a shell script) so you don't need to be
| concerned with that.
| andrus wrote:
| I've found it hard to apply this pattern in Go since, if
| Username is embedded in a struct, and you forget to set it,
| you'll get Username's zero value, which may violate your
| constraints.
| mrklol wrote:
| Why? You can easily call NewUsername inside NewAccount for
| example, just return the error. Or did I misunderstand?
| BreakfastB0b wrote:
| Because go doesn't have exhaustiveness checking when
| initialising structs. Instead it encourages "make the zero
| value meaningful" which is not always possible nor
| desirable. I usually use a linter to catch this kind of
| problem https://github.com/GaijinEntertainment/go-exhaustruct
| vorticalbox wrote:
| I like this but in the examples would volume be
| calculated by width/length rather than being set?
| Cthulhu_ wrote:
| But if you then create a constructor / factory method for
| that struct, not setting it would trigger an error. But this
| is one of the problems with Go and other languages that have
| nil or no "you have to set this" built into their type
| system: it relies on people's self-discipline, checked by the
| author, reviewer, and unit test, and ensuring there's not a
| problem like you describe takes up a lot of diligence.
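|
| A small sketch of both sides of this subthread, under the assumption
| that Username is a struct with an unexported field as in the root
| comment. Account, NewAccount, and IsZero are hypothetical names added
| here. It shows that a forgotten field silently becomes the zero
| value, and that a constructor which calls NewUsername surfaces the
| error instead.

```go
package main

import (
	"errors"
	"fmt"
)

// Username is the validated wrapper type discussed in the thread.
type Username struct{ value string }

// NewUsername rejects the empty string; the rule is an assumption.
func NewUsername(raw string) (Username, error) {
	if raw == "" {
		return Username{}, errors.New("username cannot be empty")
	}
	return Username{value: raw}, nil
}

// IsZero reports whether the Username was never set via NewUsername.
func (u Username) IsZero() bool { return u.value == "" }

// Account is a hypothetical struct that contains a Username.
type Account struct {
	Owner Username
	Email string
}

// NewAccount validates the username while building the Account and
// returns the error to the caller.
func NewAccount(rawName, email string) (Account, error) {
	owner, err := NewUsername(rawName)
	if err != nil {
		return Account{}, fmt.Errorf("new account: %w", err)
	}
	return Account{Owner: owner, Email: email}, nil
}

func main() {
	// Forgetting Owner compiles fine and yields the zero value.
	forgot := Account{Email: "a@example.com"}
	fmt.Println("owner missing:", forgot.Owner.IsZero())

	acct, err := NewAccount("gopher", "a@example.com")
	fmt.Println("owner missing:", acct.Owner.IsZero(), "err:", err)
}
```

| Nothing in the compiler forces callers to use NewAccount, so this
| still depends on review, tests, or a linter such as the one linked
| above.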
| 3pm wrote:
| AKA 'Value Object' from DDD or a similar 'Quantity' accounting
| pattern. Another angle is that this fixes 'Primitive Obsession'
| code smell.
| costco wrote:
| Just do
|
|     type Username string
|
| And replace
|
|     return Username{username}
|
| with
|
|     return Username(username)
| daveFNbuck wrote:
| If you do that, people outside the package can also do
| Username(x) conversions instead of calling NewUsername.
| Making value package private means that you can only set it
| from outside the package using provided functionality.
| mtlynch wrote:
| The problem there is that you lose the guarantee that the
| parser validated the string value.
|
| A caller can just say:
|
|     // This is returning an error for some reason, so let's do it directly.
|     // username, err := parsers.NewUsername(raw)
|     username := parsers.Username(raw)
|
| You also get implicit conversions in ways you probably don't
| want:
|
|     var u Username
|     u = "<hello>" // Implicitly converts from string to Username
| costco wrote:
| That's true I did not think of that.
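|
| A runnable sketch of the two bypasses described above, assuming the
| defined-type variant (type Username string). The validation rule and
| the bypass helper are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Username is a defined type whose underlying type is string.
type Username string

// NewUsername validates before converting; the rule is an assumption.
func NewUsername(raw string) (Username, error) {
	if raw == "" || strings.Contains(raw, "<") {
		return "", errors.New("invalid username")
	}
	return Username(raw), nil
}

// bypass shows that any caller can convert a string directly,
// skipping NewUsername entirely.
func bypass(raw string) Username {
	return Username(raw)
}

func main() {
	_, err := NewUsername("<hello>")
	fmt.Println("constructor:", err)

	// Both lines below compile, and neither runs the validation.
	converted := bypass("<hello>")    // explicit conversion
	var assigned Username = "<hello>" // untyped constant converts implicitly
	fmt.Println(converted, assigned)
}
```

| The struct-with-unexported-field version closes both of these
| holes for code outside the package, at the cost of needing a getter.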
| Scarblac wrote:
| > and a human-readable explanation of the issue is set as the
| value.
|
| This is annoying to translate later. At least also include some
| error code string that is documented somewhere and isn't prone
| to change randomly.
| klodolph wrote:
| I mean, you may end up just wanting something like,
|
|     type UsernameError struct {
|         name   string
|         reason string
|     }
|
|     func (e *UsernameError) Error() string {
|         return fmt.Sprintf("invalid username %q: %s", e.name, e.reason)
|     }
|
| And reason can be "username cannot be empty" or "username may
| not contain '<'" or something like that.
|
| This is fine for lots of different cases, because it's likely
| that your code wants to know how to handle "username is
| invalid", but only humans care about _why_.
|
| I have personally never seen a Go codebase where you parse
| error strings. I know that people keep complaining about it
| so it must be happening out there--but every codebase I've
| worked with either has error constants (an exported var set
| to some errors.New() value) or some kind of custom error type
| you can check. Or if it doesn't have those things, I had no
| interest in parsing the errors.
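|
| A sketch of checking errors by value and by type instead of parsing
| their text, using errors.Is and errors.As from the standard library.
| The Code field, the code strings, and the checkUsername/codeOf
| helpers are assumptions added for illustration, along the lines of
| the "error code" request earlier in this subthread.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrEmptyUsername is a sentinel error callers can match with errors.Is.
var ErrEmptyUsername = errors.New("username cannot be empty")

// UsernameError carries a machine-readable Code next to the
// human-readable Reason. The Code field is an illustrative assumption.
type UsernameError struct {
	Name   string
	Code   string
	Reason string
}

func (e *UsernameError) Error() string {
	return fmt.Sprintf("invalid username %q: %s", e.Name, e.Reason)
}

// checkUsername applies two assumed rules: non-empty and no '<'.
func checkUsername(name string) error {
	if name == "" {
		return ErrEmptyUsername
	}
	for _, r := range name {
		if r == '<' {
			return &UsernameError{
				Name:   name,
				Code:   "forbidden_char",
				Reason: "username may not contain '<'",
			}
		}
	}
	return nil
}

// codeOf maps an error to a stable code without reading its message.
func codeOf(err error) string {
	var ue *UsernameError
	switch {
	case errors.Is(err, ErrEmptyUsername):
		return "empty"
	case errors.As(err, &ue):
		return ue.Code
	default:
		return ""
	}
}

func main() {
	for _, name := range []string{"", "<hello>", "gopher"} {
		err := checkUsername(name)
		fmt.Printf("%q -> code=%q err=%v\n", name, codeOf(err), err)
	}
}
```

| A frontend could translate on the code and keep the English reason
| for logs only.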
| Scarblac wrote:
| I write mostly frontends. Sometimes the APIs I talk to give
| back beautiful English error messages - that I can't just
| show to the user, because they are using a different
| language most of the time. And I don't want to write logic
| that depends on that sentence, far too brittle.
| klodolph wrote:
| Right--I think the "error code" here is going to be the
| error type, i.e., UsernameError, or some qualified
| version of that.
|
| It's not perfect, but software evolves through many
| imperfect stages as it gets better, and this is one such
| imperfect stage that your software may evolve through.
|
| Including a human-readable version of the error is useful
| because the developers / operators will want to read
| through the logs for it. Sometimes that is where you
| stop, because not all errors from all backends will need
| to be localized.
| patrickkristl wrote:
| em ai have a problem from cars
| the_gipsy wrote:
| It's not guaranteed at all, that's where go's zero-values come
| in. E.g. nested structs, un/marshaljson magic methods etc. How
| do you deal with that?
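|
| One possible partial answer, sketched with encoding/json: give
| Username an UnmarshalJSON method that goes through the validating
| constructor. Account, decode, and the validation rule are
| hypothetical. The sketch also shows the remaining gap: when the key
| is absent, UnmarshalJSON is never called and the zero value stays.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Username is the validated wrapper type discussed in the thread.
type Username struct{ value string }

// NewUsername rejects the empty string; the rule is an assumption.
func NewUsername(raw string) (Username, error) {
	if raw == "" {
		return Username{}, errors.New("username cannot be empty")
	}
	return Username{value: raw}, nil
}

// UnmarshalJSON routes decoding through NewUsername so that JSON
// input is validated the same way as any other input.
func (u *Username) UnmarshalJSON(b []byte) error {
	var raw string
	if err := json.Unmarshal(b, &raw); err != nil {
		return err
	}
	parsed, err := NewUsername(raw)
	if err != nil {
		return err
	}
	*u = parsed
	return nil
}

// Account is a hypothetical payload containing a Username.
type Account struct {
	Owner Username `json:"owner"`
}

// decode is a small helper for the demo below.
func decode(doc string) (Account, error) {
	var a Account
	err := json.Unmarshal([]byte(doc), &a)
	return a, err
}

func main() {
	_, err := decode(`{"owner": ""}`)
	fmt.Println("empty owner:", err)

	// A missing key never calls UnmarshalJSON, so the zero value
	// slips through without any error.
	a, err := decode(`{}`)
	fmt.Println("missing owner is zero:", a.Owner.value == "", "err:", err)
}
```

| Catching the missing-key case still needs an explicit check after
| decoding, for example in a constructor for the enclosing struct.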
| stouset wrote:
| Every struct requiring its zero value to be meaningful is
| probably one of the worst design flaws in the language.
| randomdata wrote:
| There is no such requirement. Common wisdom suggests that
| you should ensure zero values are useful, but that isn't
| about every random struct field - _only the values you
| actually give others_. Initialize your struct fields and
| you won't have to consider their zero state. They will
| never be zero.
|
| It's funny seeing this beside the DRY thread. Seems
| programmers taking things a bit too literally is a common
| theme.
| stouset wrote:
| > Initialize your struct fields and you won't have to
| consider their zero state.
|
| "Just do the right thing everywhere and you don't have to
| worry!"
|
| You can't stop consumers of your libraries from creating
| zero-valued instances.
| randomdata wrote:
| Then the zero value is their problem, not yours. You have
| no reason to be worried about that any more than you are
| worried about them not getting enough sleep, or eating
| unhealthy food. What are you doing to stop them from
| doing that? Nothing, of course. Not your problem.
|
| Coq exists if you really feel you need a complete type
| system. But there is probably good reason why almost
| nobody uses it.
| stouset wrote:
| > Then the zero value is their problem, not yours.
|
| Except for all those times _you're_ the consumer of
| someone else's library and there's no way for them to
| indicate that creating a zero-valued struct is a bug.
|
| Again, it's the philosophy of "Just do the right thing
| everywhere and you don't have to worry!" Sometimes it's
| nice to work with a type system where designers of
| libraries can actually prevent you from writing bugs.
| randomdata wrote:
| _> Except for all those times you're the consumer of
| someone else's library and there's no way for them to
| indicate that creating a zero-valued struct is a bug._
|
| Nonsense. Go has a built-in facility for documentation to
| communicate these things to other developers. Idiomatic
| Go _strongly_ encourages you to use it. Consumers of the
| libraries expect it.
|
| _> Sometimes it's nice to work with a type system where
| designers of libraries can actually prevent you from
| writing bugs._
|
| Well, sure. But, like I said, almost nobody uses Coq. The
| vast, vast, vast majority of projects - and I expect 100%
| of web projects - use languages with incomplete type
| systems, making what you seek impossible.
|
| And there's probably a good reason for that. While
| complete type systems sound nice in theory, practice
| isn't so kind. Tradeoffs abound. There is no
| free lunch in life. Sorry.
| owl57 wrote:
| _> The vast, vast, vast majority of projects - and I
| expect 100% of web projects - use languages with
| incomplete type systems, making what you seek
| impossible._
|
| ...where, "what GP seeks" is...
|
| _> way for [library authors] to indicate that creating a
| zero-valued struct is a bug_
|
| I'd say that's a really low and practical bar, you really
| don't need Coq for that. Good old Python is enough, even
| without linters and type hints.
|
| Of course it's very easy to create an equivalent of zero
| struct (object without __init__ called), but do you think
| it's possible to do it while not noticing that you are
| doing something unusual?
| randomdata wrote:
| _> Good old Python is enough_
|
| No, Python is not enough to "...work with a type system
| where designers of libraries can actually prevent you
| from writing bugs." Not even typed Python is going to
| enable that. Only a complete type system can see the
| types prevent you from writing those bugs. And I expect
| exactly nobody is writing HTTP services with a language
| that has a complete type system - for good reason.
|
| > Of course it's very easy to create an equivalent of
| zero struct
|
| Yes, you are quite right that you, the library consumer,
| can Foo.__new__(Foo) and get an object that hasn't had
| its members initialized just like you can in Go. But
| unless the library author has specifically called
| attention to you to initialize the value this way, that
| little tingling sensation should be telling you that
| you're doing something wrong. It is not conventional for
| libraries to have those semantics. Not in Python, not in
| Go.
|
| Just because you can doesn't mean you should.
| the_gipsy wrote:
| You don't have to go as far as Coq. Rust manages "parse,
| don't validate" extremely well with serde.
|
| Go's zero-values are the problem, not any other lack of
| its type system.
| randomdata wrote:
| _> You don't have to go as far as Coq._
|
| No, you do. Anywhere the type system is incomplete means
| that the consumer can do something the library didn't
| intend. Rust does not have a complete type system. There
| was no relevance to mentioning it. But I know it is time
| for Rust's regularly scheduled ad break. And while you
| are at it, enjoy a cool, refreshing Coca-Cola.
|
| _> Go's zero-values are the problem_
|
| _"Sometimes it's nice to work with a type system where
| designers of libraries can actually prevent you from
| writing bugs."_ has nothing to do with zero-values. It
| doesn't even really have anything to do with Go
| specifically. My, the quality of advertising has really
| declined around here. Used to be the Rust ads at least
| tried to look like they fit in.
| the_gipsy wrote:
| Any language without zero-values (or some equally
| destructive quality) can do "parse, don't validate". Go
| cannot. Rust is just an example.
| randomdata wrote:
| Top of the hour again? Time for another Rust
| advertisement?
|
| The topic at hand is about preventing library users from
| doing things the library author didn't intend using the
| type system, not "what happens if a language has zero-
| values". Perhaps you are not able to comprehend this
| because you are hungry? You're not you when you are
| hungry. Grab a Snickers.
| the_gipsy wrote:
| what happens if a language has zero-values, is that you
| can't "parse, don't validate".
|
| Maybe it's time for you to finally try rust? Or any other
| language without zero-values, since rust seems to
| irritate you in particular.
| stouset wrote:
| This insane perspective of "nothing is totally perfect so
| any improvements over what go currently does are
| pointless" whenever you confront a gopher with some
| annoying quirk of the language is one of the worst design
| flaws in the golang community hivemind.
| randomdata wrote:
| Tell us, why do you hold that perspective? It's an odd one.
| Nobody else in this thread holds that perspective. You
| even admit it is insane, yet here you are telling us
| about this unique perspective you hold for some reason.
| Are you hoping that we will declare you insane and admit
| you in for care? I don't quite grasp the context you are
| trying to work within.
| DandyDev wrote:
| You manage to present a strawman and produce a No True
| Scotsman fallacy all at once in this comment thread.
|
| Nobody is suggesting that Coq should be used, so stop
| bringing it up (strawman). And yes, Coq might have an
| even stricter and more expressive type system than Rust.
| But nobody is asking for a perfect type system (no true
| Scotsman). People are asking to be able to prevent users
| of your library to provide illegal values. Rust (and
| Haskell and Scala and Typescript and ....) lets you do
| this just fine whereas Golang doesn't.
|
| And personally I would much rather have the compiler or
| IDE tell me I'm doing something wrong than having to read
| the docs in detail to understand all the footguns.
|
| My personal opinion is that - even though I'm very
| productive with Golang and I enjoy using it - Golang has
| a piss poor type system, even with the addition of
| Generics.
| randomdata wrote:
| _> People are asking to be able to prevent users of your
| library to provide illegal values. [...] and Typescript_
|
| Typescript, you say?
|
|     const bar: Foo = {} as Foo
|
| Hmm. Oh, right, just don't hold it wrong. But
| _"sometimes it's nice to work with a type system where
| designers of libraries can actually prevent you from
| writing bugs."_
|
| Your example doesn't even satisfy the base case, let
| alone the general case. Get back to us when you have
| actually read the thread and can provide something on
| topic.
| DandyDev wrote:
| But that is not an accident, is it? It's someone very
| deliberately casting an object. It's not the same and you
| probably know it.
| randomdata wrote:
| It might be an accident. Someone uninitiated may think
| that is how you are expected to initialize the value. A
| tool like Copilot may introduce it and go unnoticed.
|
| But let's assume the programmer knows what they are doing
| and there is no code coming from any other source. When
| would said programmer write code that isn't deliberate?
| What is it about Go that you think makes them, an
| otherwise competent programmer, flail around haphazardly
| without any careful deliberation?
| ttymck wrote:
| This is where we arrive at my conclusion that go is not
| well-suited to implementing business logic!
| bsdpufferfish wrote:
| C++ constructors actually make the guarantee, but it comes
| with other pains
| masklinn wrote:
| Lots of languages handle it just fine and don't need the
| mess of C++ ctors.
|
| GP is pointing out that go specifically makes it an issue.
| bsdpufferfish wrote:
| What language do you have in mind?
| masklinn wrote:
| Any language which supports private state: smalltalk,
| haskell, ada, rust, ...
| swah wrote:
| My Go is rusty, do you mean not exporting the type "Username"
| (ie username) to avoid default constructor usage?
| mtlynch wrote:
| In Go, capitalized identifiers are exported, whereas
| lowercase identifiers are not.
|
| In the example I gave above, clients outside of the package
| can instantiate Username, but they can't access its "value"
| member, so the only way they could get a populated Username
| instance is by calling NewUsername.
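|
| A minimal sketch of the pattern described here, under stated
| assumptions: the validation rule (1 to 9 characters of a-z0-9),
| the error value and the String method are illustrative and not
| taken from the original example. In a real project the type would
| sit in its own package so that the unexported field is actually
| out of reach for callers.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Username can only be populated through NewUsername because its field is
// unexported. The privacy only applies across package boundaries.
type Username struct {
	value string
}

// usernamePattern is an illustrative rule chosen for this sketch.
var usernamePattern = regexp.MustCompile(`^[a-z0-9]{1,9}$`)

// ErrInvalidUsername is a hypothetical error value for rejected input.
var ErrInvalidUsername = errors.New("invalid username")

// NewUsername parses an untrusted string into a Username.
func NewUsername(raw string) (Username, error) {
	if !usernamePattern.MatchString(raw) {
		return Username{}, ErrInvalidUsername
	}
	return Username{value: raw}, nil
}

// String returns the value that passed the check.
func (u Username) String() string { return u.value }

func main() {
	u, err := NewUsername("alice42")
	fmt.Println(u, err)

	_, err = NewUsername("Not Valid!")
	fmt.Println(err)

	// The zero value still compiles from any package, which is the
	// objection raised elsewhere in this thread.
	var empty Username
	fmt.Println(empty.String() == "")
}
```

| Note that `var empty Username` bypasses NewUsername entirely, so
| the guarantee holds only for populated values.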
| whimsicalism wrote:
| the fact that this is some special "technique" really shows how
| far behind Go's type system & community around typing is
| edflsafoiewq wrote:
| You can use new types with validation too. In fact the
| approaches seem to be duals.
|
| Parse, don't validate:
|
|                       string         ParsedString
|     untrusted source -------> parse --------------> rest of system
|
| Validate, don't parse:
|
|                       UnvalidatedString            string
|     untrusted source ------------------> validate -------> rest of system
| mtlynch wrote:
| The problem is that pattern "fails open." If anyone on the
| team forgets to define an untrusted string as
| UnvalidatedString, the data skips validation.
|
| If you default to treating primitive types as untrusted, it's
| hard for someone to accidentally convert an untrusted type to
| a trusted type without using the correct parse method.
| edflsafoiewq wrote:
| The dual problem would be any function which forgets to
| accept a ParsedString instead of a string can skip parsing.
|
| Both cases appear to depend on there being a "checkpoint"
| all data must go through to cross over to the rest of the
| system, either at parsing or at UnvalidatedString
| construction.
| mtlynch wrote:
| > _The dual problem would be any function which forgets
| to accept a ParsedString instead of a string can skip
| parsing._
|
| > _Both cases appear to depend on there being a
| "checkpoint" all data must go through to cross over to
| the rest of the system, either at parsing or at
| UnvalidatedString construction._
|
| The difference is that if string is the trusted type,
| then it's easy to miss a spot and use the trusted string
| type for an untrusted value. The mistake will be subtle
| because the rest of your app uses a string type as well.
|
| The converse is not true. If string is an untrusted type
| and ParsedString is a trusted type, if you miss a spot
| and forget to convert an untrusted string into a
| ParsedString, that function can't interact with any other
| part of your codebase that expects a ParsedString. The
| error would be much more visible and the damage more
| contained.
|
| I think UnvalidatedString -> string also kind of misses
| the point of the type system in general. To parse a
| string into some other type, you're asserting something
| about the value it stores. It's not just a string with a
| blessing that says it's okay. It's a subset of the string
| type that can contain a more limited set of values than
| the built-in string type.
|
| For example, parsing a string into a Username, I'm
| asserting things about the string (e.g., it's <10
| characters long, it contains only a-z0-9). If I just use
| the string type, that's not an accurate representation of
| what's legal for a Username because the string type
| implies any legal string is a valid value.
| eru wrote:
| The example also assumes that everything is like a
| 'ParsedString' that contains a copy of the original
| untrusted value inside.
| zelphirkalt wrote:
| I always understood "parse don't validate" a bit differently.
| If you are doing the validation inside of a constructor, you
| are still doing validation instead of parsing. It is safer to
| do the validation in one place you know the execution will go
| through, of course, but not the idea I understand "parse don't
| validate" to mean. I understand it to mean: "write an actual
| parser, whatever passes the parser can be used in the rest of
| the program", where a parser is a set of grammar rules for
| example, or PEG.
| mtlynch wrote:
| I'm not a Haskell developer, so it's possible that I
| misunderstood the original "Parse, Don't Validate" post.
|
| > _If you are doing the validation inside of a constructor,
| you are still doing validation instead of parsing._
|
| Why would that be considered validation rather than parsing?
|
| From the original post:
|
| > _Consider: what is a parser? Really, a parser is just a
| function that consumes less-structured input and produces
| more-structured output._
|
| That's the key idea to me.
|
| A parser enforces checks on an input and produces an output.
| And if you define an output type that's distinct from the
| input type, you allow the type system to "preserve" the fact
| that the data passed a parser at some point in its life.
|
| But again, I don't know Haskell, so I'm interested to know if
| I'm misunderstanding Lexi Lambda's post.
| kevincox wrote:
| Parse don't validate means that if you want a function that
| converts an IP address string to a struct IpAddress{
| address: string } you don't validate that the input string
| is a valid IP address then return a struct with that string
| inside. Instead you parse that IP into raw integers, then
| join those back into an IP string.
|
| The idea is that your parsed representation and serializer
| are likely to produce a much smaller and more predictable set
| of values than may pass the validator.
|
| As an example there was a network control plane outage in
| GCP because the Java frontend validated an IP address then
| stored it (as a string) in the database. The C++ network
| control plane then crashed because the IP address actually
| contained non-ASCII "digits" that Java with its Unicode
| support accepted.
|
| If instead the address was parsed into 4 or 8 integers and
| was reserialized before being written to the DB this outage
| wouldn't have happened. The parsing was still probably more
| lax than it should have been, but at least the value
| written to the DB was valid.
|
| In this case it was funny Unicode, but it could be as
| simple as 1.2.3.04 vs 1.2.3.4. By parsing then re-
| serializing you are going to produce the more canonical and
| expected form.
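|
| The parse-then-reserialize idea can be sketched as follows. This
| is a hypothetical, deliberately lenient IPv4 parser written for
| illustration (the type and function names are assumptions), not
| the code involved in the outage described above. It accepts
| "1.2.3.04" but only ever emits the canonical form.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// IPv4 stores the address as four octets rather than the original text.
type IPv4 [4]uint8

// ParseIPv4 is a lenient, illustrative parser: it accepts leading zeros
// such as "04" so that the canonicalizing effect is visible.
func ParseIPv4(s string) (IPv4, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 4 {
		return IPv4{}, fmt.Errorf("want 4 octets, got %d", len(parts))
	}
	var ip IPv4
	for i, p := range parts {
		// ParseUint rejects signs, non-ASCII digits and values above 255.
		n, err := strconv.ParseUint(p, 10, 8)
		if err != nil {
			return IPv4{}, fmt.Errorf("octet %d: %w", i+1, err)
		}
		ip[i] = uint8(n)
	}
	return ip, nil
}

// String re-serializes the octets into a single canonical form.
func (ip IPv4) String() string {
	return fmt.Sprintf("%d.%d.%d.%d", ip[0], ip[1], ip[2], ip[3])
}

func main() {
	for _, in := range []string{"1.2.3.04", "1.2.3.4", "1.2.3.٤"} {
		ip, err := ParseIPv4(in)
		if err != nil {
			fmt.Printf("%q rejected: %v\n", in, err)
			continue
		}
		// Only the re-serialized value would be written to storage.
		fmt.Printf("%q stored as %q\n", in, ip.String())
	}
}
```

| Because only the output of String is stored, downstream readers
| never see whatever spelling the client happened to send.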
| xyzzy_plugh wrote:
| Perhaps "normalize" or "canonicalize" is more
| appropriate. A parser can liberally interpret but I don't
| take it to imply some destructured form necessarily.
| There are countless scenarios where you want to be able
| to reproduce the exact input, and often preserving the
| input is the simplest solution.
|
| But yes usually you do want to split something into its
| elemental components, should it have any.
| mtlynch wrote:
| Thanks for that explanation! I hadn't appreciated that
| aspect of "parse, don't validate," before.
|
| But even with that understanding and from re-reading the
| post, that seems to be an extra safety measure rather
| than the essence of the idea.
|
| Going back to my original example of parsing a Username
| and verifying that it doesn't contain any illegal
| characters, how does a parser convert a string into a
| more direct representation of a username without using a
| string internally? Or if you're parsing an uint8 into a
| type that logically must be between 1 and 100, what's the
| internal type that you parse it into that isn't a uint8?
| eru wrote:
| > Or if you're parsing an uint8 into a type that
| logically must be between 1 and 100, what's the internal
| type that you parse it into that isn't a uint8?
|
| Just for the sake of example, your internal
| representation might start from 0, and you just add 1
| whenever you output it.
|
| Your internal type might also not be a uint8. Eg in
| Python you would probably just use their default type for
| integers, which supports arbitrarily big numbers. (Not
| because you need arbitrarily big numbers, but just
| because that's the default.)
| kevincox wrote:
| Personally I don't think I would have used the phrase
| "parse don't validate" for something like a username. It
| isn't clear to me what it would mean exactly. I generally
| only think of this principle for data that has some
| structure, not as much a username or number from 1-100.
|
| IP address would be about the minimum amount of
| structure. Something else would be like processing API
| requests. You can take the incoming JSON and fully parse
| it as much as possible, rather than just validate it is
| as expected (for example drop unknown fields)
| tdeck wrote:
| But surely this is just another way of doing validation and not
| fundamentally "parsing"? If at the end you've just stored the
| input exactly as you got it, the only parsing you're
| potentially doing is in the validation step and then it gets
| thrown away.
| ezrast wrote:
| Implementation-wise, yes, but the interface you're exposing
| is indistinguishable from that of a parser. For all your
| consumers know, you could be storing the username as a
| sequence of a 254-valued enum (one for each byte, except the
| angle brackets) and reconstructing the string on each "get"
| call. For more complex data you would certainly be storing it
| piecewise; the only reasons this example gets a pass are 1)
| because it is so low in surface area that a human can
| reasonably validate the implementation as bug-free without
| further aid from the type checker, and 2) because Go's type
| system is so inexpressive that you can't encode complex
| requirements with it anyway.
| draven wrote:
| The validation is not completely thrown away, since the type
| indicates that the data has been validated. I understand
| "parsing" as applying more structure to a piece of data.
| Going from a String to an IP or a Username fits the
| definition.
|
| I push my team to use this pattern in our (mostly Scala)
| codebase. We have too many instances of useless validations,
| because the fact that a piece of data has been
| "parsed"/validated is not reflected in its type using simple
| validation.
|
| For example using String, a function might validate the
| String as a Username. Lower in the call stack, a function
| ends up taking this String as an arg. It has no way of
| knowing if it has been validated or not and has to re-
| validate it. If the first validation gets a Username as a
| result, other functions down the call stack can take a
| Username as an argument and know for sure it's been validated
| / "parsed".
| dang wrote:
| Related:
|
| _Parse, don't validate (2019)_ -
| https://news.ycombinator.com/item?id=35053118 - March 2023 (219
| comments)
|
| _Parse, Don't Validate (2019)_ -
| https://news.ycombinator.com/item?id=27639890 - June 2021 (270
| comments)
|
| _Parse, Don't Validate_ -
| https://news.ycombinator.com/item?id=21476261 - Nov 2019 (230
| comments)
|
| _Parse, Don't Validate_ -
| https://news.ycombinator.com/item?id=21471753 - Nov 2019 (4
| comments)
| kvnhn wrote:
| This is a variation on one of my favorite software design
| principles: Make illegal states unrepresentable. I first
| learned about it through Scott Wlaschin[1].
|
| [1]: https://fsharpforfunandprofit.com/posts/designing-with-
| types...
| WuxiFingerHold wrote:
| So far I like the commonly used approach in the Typescript
| community best:
|
| 1. Create your Schema using https://zod.dev or
| https://github.com/sinclairzx81/typebox or one of the other
| many libs.
|
| 2. Generate your types from the schema. It's very simple to
| create partial or composite types, e.g. UpdateModel,
| InsertModels, Arrays of them, etc.
|
| 3. Most modern Frameworks have first class support for
| validation, like Fastify (with typebox). Just reuse your schema
| definition.
|
| That is very easy, obvious and effective.
| otabdeveloper4 wrote:
| > you have to remember to call the validator on 100% of code
| paths
|
| But copy-pasting the same lines of code in literally _every_
| function is the Golang Way.
|
| It makes code "simpler".
| HeavyStorm wrote:
| Encapsulation saves lives.
| jupp0r wrote:
| I found fx (https://github.com/uber-go/fx) to be a super simple
| yet versatile tool to design my application around.
|
| All the advice in the article is still helpful, but it takes the
| "how do I make sure X is initialized when Y needs it" part
| completely out of the equation and reduces it from an N*M problem
| to an N problem, ie I only have to worry about how to initialize
| individual pieces, not about how to synchronize initialization
| between them.
|
| I've used quite a few dependency injection libraries in various
| languages over the years (and implemented a couple myself) and
| the simplicity and versatility of fx makes it my favorite so far.
| Philip-J-Fry wrote:
| >All the advice in the article is still helpful, but it takes
| the "how do I make sure X is initialized when Y needs it" part
| completely out of the equation and reduces it from an N*M
| problem to an N problem, ie I only have to worry about how to
| initialize individual pieces, not about how to synchronize
| initialization between them.
|
| I gotta say, I hate these dependency injection frameworks.
|
| In a well designed system this should be trivial. Making sure
| something is initialised when you want to use it is just a
| matter of it being available to pass in a constructor as a
| parameter.
|
|     stockService := NewStockService()
|     orderService := NewOrderService()
|     orderProcessor := NewOrderProcessor(stockService, orderService)
|
| There shouldn't be any sort of "synchronisation" of
| initialisation needed because your code won't compile if you do
| something wrong. If you add a cyclic dependency you will
| clearly see that because you won't be able to construct things
| in the right order without an obvious workaround.
| jupp0r wrote:
| If you have ever topologically sorted 100 components
| connected in a complex graph by hand or found the right spot
| to insert the 101st, you'd quickly appreciate more help than
| a compiler check.
| therealdrag0 wrote:
| I'm sure there's a place for them.
|
| But when micro-services are so common, it seems like people
| use them (Spring) because everyone else does, not because
| they actually provide needed value.
| jupp0r wrote:
| Spring is an overly complicated mess. I use DI when it
| makes things simpler, I don't see how Spring would ever
| do that.
| Philip-J-Fry wrote:
| Your dependency structure should just be a tree.
|
| It should be inserted literally right next to its first
| use case. Your IDE will literally point it out to you with red
| squiggles because the places where you've added a
| dependency will be missing a parameter. Go to the highest
| one and add it on the line above.
| jupp0r wrote:
| I've never seen a tree graph, not without lots of global
| mutable state to cheat around DI. Your logger is just
| going to be needed almost everywhere.
|
| What do you do on shutdown? In languages with
| destructors, that can automatically give you a call order
| in reverse of the construction order, but in Go you end
| up manually ordering things or just not having panicless
| shutdowns.
| Philip-J-Fry wrote:
| Okay, it's not a tree. Because multiple objects will
| depend on something like a logger. But it's an acyclic
| graph if designed properly. Which is incredibly simple to
| setup and teardown.
|
| If your loggers are needed everywhere, then you just pass
| them as a constructor to the objects that need them.
| You're literally doing this with fx anyway.
|
| Like, a logger is probably the first thing you new up in
| main(). So now you can pass it down as a dependency in
| constructors.
|
| For shutdown you just defer your shutdown functions. Have
| a basic interface where your services have a Shutdown()
| method and then you can push them onto a stack and pop
| them off during shutdown.
|
| There's no manual ordering involved. Your initialisation
| is a linear top down process, your shutdown is bottom up.
| It can't be any simpler. If you keep code as close to
| usage sites then there's only 1 possible order.
| jupp0r wrote:
| I agree with you on all of this. fx is not doing much
| more for shutdown than what you describe (calling a
| handler pushed to a stack created during initialization).
| Instead of implementing this for every app, I just prefer
| to use a library with great documentation and tests.
| williamdclt wrote:
| I'm not against DI, but I don't find your argument
| convincing: having dependencies modelled directly with the
| simplest language constructs (variables and arguments) and
| validated by the compiler makes "debugging" a ton simpler
| than dealing with DI errors, even in a good DI framework.
| Having an error just means I wrote invalid code: even a
| junior can easily figure it out.
|
| DI still has other advantages, but that's not one
| jupp0r wrote:
| I don't disagree with you, I've argued against the use of
| DI frameworks plenty of times on projects I was working
| on. Many are not well made, are overly complicated and do
| much more than one single thing.
|
| Especially in Go, where you don't have destructors to
| help with shutdown, having common structure in place to
| help tear down components has always been a net benefit
| for me.
| liampulles wrote:
| I agree with a lot of this, I'll add my own opinions:
|
| * I would pass a waitgroup with the app context to service
| structs. This way the interrupt can trigger the app shutdown via
| the context and the main goroutine can wait on the waitgroup
| before actually killing the app.
|
| * If writing a CLI program, then testing stdout, stdin, stderr,
| args, env, etc. is useful. But for an http server, this is less
| true. I would pass structured config to the run function to let
| those tests be more focused.
|
| * I disagree with parsing templates using sync.Once in a handler
| because I don't think handlers should do template parsing at all.
| I would do this when the app starts: if the template cannot be
| parsed, the app should not become ready to receive any requests
| and should rather exit with a non-zero exit code.
| hyeomans wrote:
| I find your first point interesting, wouldn't that be solved by
| context propagation and waiting for the server to shutdown?
| Thanks!
| earthboundkid wrote:
| The validator should return map[string][]string so that a request
| can have multiple problems with one field.
| earthboundkid wrote:
| The sync.Once should be sync.OnceValues instead.
| bumpa wrote:
| The encode example contains a bug and a lint issue. Firstly,
| calling w.Header().Set after w.WriteHeader is likely a bug, as
| the w.WriteHeader method call should occur after setting the
| headers.
|
| The second issue involves passing an unused *http.Request, which
| will likely cause the linter to flag it.
| Animats wrote:
| I just run Go servers under fcgi. You get orchestration and crash
| recovery with a very simple interface. Fcgi will launch server
| processes as needed, feed them events, and shut it down when
| there's no traffic. Performance is good, and you can run on cheap
| hosting.
| chubot wrote:
| Which hosting do you use? I use fastcgi with python on
| Dreamhost and it works fine, but I'm sorta worried that they'll
| turn it off because it seems kind of niche and under-documented
| Animats wrote:
| Dreamhost too. Dreamhost will let you run a continuously
| running process.
|
| The amount of work you can get done on low-end shared hosting
| is really quite impressive.
| sylware wrote:
| I did write my own HTTP stuff in C (and more generally internet
| stuff), on linux (sometimes without a libc, namely direct
| syscalls), running on ARM64 and x86_64.
|
| And I plan to move to rv64 assembly once I can get reasonably
| performant hardware (it is already here, but it is extremely hard
| to get some where I am from and how I operate). I dunno if it will
| be bare metal or with a linux kernel first (because a minimal TCP
| stack is already a big thingy).
| jurschreuder wrote:
| This is 100% not how I write it.
|
| Only thing I agree on is putting all the paths in one file.
|
| In most other programming languages I've done a lot of research
| into how to make it nice and clean.
|
| Was hoping this was it for Go because I'm cleaning up a big
| project.
|
| But my very basic no nonsense current setup seems better to me
| than this in many ways.
|
| If anybody has another example that is a lot better (and I don't
| mean more complex, I don't have those ego issues), I am very
| interested.
|
| But this I want to hide as best as I can from my dev team; this
| is all wrong. It's clever in a lot of ways but it's wrong.
|
| It does not have unit testing at all, all these tests would be
| duplicated in the end-to-end test.
|
| I also like end-to-end tests better but why put them here, way
| better to put them in postman for example then you have the most
| up to date documentation always auto-generated.
|
| Passing the config, man I had so many discussions with junior
| developers about this, don't do that you'll make things dependent
| on the config and cannot reuse them in other programs. But that
| was already mentioned a lot here.
|
| There are also a lot of functions with like 10 arguments passed.
| If you have that many arguments just pass a struct containing a
| lot of the arguments; it's always super confusing when people make
| functions with 12 arguments. I'm always counting them and after 14
| times counting I rewrite their function.
|
| It's a matter of style so keep doing it this way if you like it,
| but it's not my style at all it makes no sense at all to me.
|
| If anybody knows a better example please tell me.
___________________________________________________________________
(page generated 2024-02-10 23:01 UTC)