[HN Gopher] Keep Pydantic out of your Domain Layer
___________________________________________________________________
Keep Pydantic out of your Domain Layer
Author : erikvdven
Score : 73 points
Date : 2025-07-23 06:54 UTC (3 days ago)
(HTM) web link (coderik.nl)
(TXT) w3m dump (coderik.nl)
| IshKebab wrote:
| This seems ridiculously over-complicated. This guy would _love_
| Java.
|
| He doesn't even say _why_ you should tediously duplicate
| everything instead of just using the Pydantic objects - just
| "You know you don't want that"! No I don't.
|
| The only reason I've heard is performance... but... you're using
| Python. You don't give a shit about performance.
| photios wrote:
| The "gain" TFA is describing is very very questionable too.
| You're losing a lot in terms of complexity.
|
| You're going from a straightforward "Pydantic everywhere"
| solution to a weird concoction of:
|
| 1. Pydantic models
|
| 2. "Poor man's Pydantic models" (dataclasses)
|
| 3. Obscure third party dependencies (Dacite)
|
| Thanks, I'll pass.
| pletnes wrote:
| Pydantic seems to be fast (in the context, it's written in
| rust) so it might make sense to keep using pydantic for
| performance reasons.
| yedpodtrzitko wrote:
| Just because it's written in Rust doesn't mean it's fast. I
| was working on a project where Pydantic was the bottleneck -
| there were multiple levels of nested Pydantic objects, and
| creating the instances was very slow due to the validation
| which is performed on input values. Even after disabling the
| validation, dataclasses were twice as fast, and compiling the
| dataclasses with mypyc improved the performance ten times.
| ensignavenger wrote:
| Were you using v2?
|
| Pydantic docs do clearly state that multiple levels of
| nesting of Pydantic objects can make it much slower, so it
| isn't particularly surprising that such models were slow.
| franktankbank wrote:
| > you're using Python. You don't give a shit about performance.
|
| That's dumb. You may not care about max performance but you've
| got some threshold where shit gets obviously way too slow to be
| workable. I've worked with a library heavy on pydantic where it
| was the bottleneck.
| hxtk wrote:
| The main "why" that I find is that it allows you to
| intentionally design your API types and know when a change is
| touching them.
|
| I worked on a project with a codebase on the order of millions
| of lines, and many times a response was made by taking an ORM
| object or an app internal data structure and JSON serializing
| it. We had a frequent problem where we'd make some change to
| how we process a data structure internally and oops, breaking
| API change. Or worse yet, sensitive data gets added to a
| structure typically processed with that data, not realizing it
| gets serialized by a response handler.
|
| It was hard to catch this in code review because it was hard to
| even know when a type might be involved in generating a
| response elsewhere in the code base.
|
| Switching to a schema-first API design meant that if you were
| making a change to a response data type, you knew it. And the
| CODEOWNERS file also knew it, and would bring the relevant
| parties into the code review. Suddenly those classes of
| problems went away.
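A minimal sketch of the schema-first idea hxtk describes, using only the standard library. The names `UserRecord`, `UserResponse` and `to_response` are hypothetical and do not come from the thread; the point is only that the response type is declared on its own and filled field by field, so a new internal field cannot reach the wire by accident.

```python
from dataclasses import asdict, dataclass


@dataclass
class UserRecord:
    """Internal structure (hypothetical); free to grow new fields at any time."""
    id: int
    email: str
    password_hash: str  # sensitive, must never reach a response


@dataclass(frozen=True)
class UserResponse:
    """Published API shape (hypothetical); changing it is a deliberate act."""
    id: int
    email: str


def to_response(record: UserRecord) -> UserResponse:
    # Fields are copied by name, so nothing new leaks unless it is added here.
    return UserResponse(id=record.id, email=record.email)


record = UserRecord(id=1, email="ada@example.com", password_hash="not-a-real-hash")
payload = asdict(to_response(record))
print(payload)
```

Because `UserResponse` lives in its own definition, a change to the API shape shows up as a diff in that definition, which is what lets a CODEOWNERS rule route it to the right reviewers.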
| microflash wrote:
| For many cases, we don't do this kind of thing in Java; a
| single annotated record can function as a model for both data
| and API layers. Regardless of the language, the distinction
| becomes important when these layers diverge or there's some
| sensitive data involved.
| never_inline wrote:
| Even when DTO is separate, you can use projection methods of
| eg: spring data JPA, or something like mapstruct.
| slt2021 wrote:
| >you're using Python. You don't give a shit about performance.
|
| Maybe it is true if you artificially limit yourself to a single
| instance single thread model, due to GIL.
|
| But because nowadays apps can easily be scaled up in many
| instances, this argument is irrelevant.
|
| One may say that Python has a large overhead when using a lot of
| objects, or that it has the GIL, but people have learned how to
| serve millions of users with Python easily.
| dontlaugh wrote:
| Any Python code will be dozens to hundreds of times slower
| than Go or Java, but it may still be fast enough to stay
| within human reaction latencies.
|
| And you may be able to scale to many users, worst case with
| more machines. But it'll still cost you a lot more than a
| faster language. That is extremely relevant, even today.
| chausen wrote:
| The article is written for those who want to apply DDD/onion
| architecture to Python apps using Pydantic. Those concepts
| explain the motivation and the article assumes the reader
| knows about them. As others are writing, it may not be worth it
| to apply this to simple apps, but as an app grows in complexity
| it will help make it more extensible, maintainable, etc.
|
| I'm not a Python expert, but looking into it briefly it seems
| like Pydantic's role is at application boundaries for bringing
| validation/typing to external data sources. If you are not
| working with external data, there is no reason to use it. So,
| if you separate out a domain layer, it brings no benefit there.
| Creating a domain layer where you handle business logic
| separately from how you interact with external data means those
| layers can evolve independently. An API could change and you
| only need to update your API models/mapping.
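A minimal sketch of the split chausen describes, assuming Pydantic v2. `CreateOrderRequest`, `Order` and `to_domain` are hypothetical names chosen for illustration: the Pydantic model validates external data at the boundary, and the domain object is a plain dataclass that carries the business logic.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationError


class CreateOrderRequest(BaseModel):
    """Boundary model (hypothetical): validates untrusted external data."""
    customer_id: int
    amount_cents: int


@dataclass(frozen=True)
class Order:
    """Domain model (hypothetical): plain Python, no Pydantic needed to use it."""
    customer_id: int
    amount_cents: int

    def with_discount(self, percent: int) -> "Order":
        discounted = self.amount_cents * (100 - percent) // 100
        return Order(self.customer_id, discounted)


def to_domain(request: CreateOrderRequest) -> Order:
    # The only function that knows about both shapes.
    return Order(customer_id=request.customer_id, amount_cents=request.amount_cents)


# The numeric string is coerced to an int by Pydantic's default (lax) mode.
request = CreateOrderRequest.model_validate({"customer_id": "7", "amount_cents": 1000})
order = to_domain(request).with_discount(10)
print(order)

try:
    CreateOrderRequest.model_validate({"customer_id": 7})  # amount_cents missing
    missing_field_rejected = False
except ValidationError as exc:
    missing_field_rejected = True
    print("rejected at the boundary:", exc.error_count(), "error(s)")
```

If the API shape changes, only `CreateOrderRequest` and `to_domain` need to move; `Order` and its logic stay as they are.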
| rtpg wrote:
| In the Django world I have gotten very frustrated at people
| rushing to go from DRF's serializers to Django Ninja + Pydantic.
|
| You have way less in terms of tools to actually provide nice
| straightforward APIs. I appreciate that Pydantic gives you type
| safety but at one point the actual ease of writing correct code
| goes beyond type safety
|
| Just real straightforward stuff around dealing with loading in
| user input becomes a whole song and dance because Pydantic is an
| extremely basic validation thing... the hacks in DRF like request
| contexts are useful!
|
| I've seen many projects do this and it feels like such a step
| back in offering simple-to-maintain APIs. Maybe I'm just biased
| cuz I "get" DRF (and did lose half a day recently to weird DRF
| behavior...)
| zo1 wrote:
| This is the Javascript hipster effect. FastAPI and Pydantic are
| pushed heavily because of their fancy docs page and the
| evangelism which thrives on reinventing the wheel. So we are
| all now stuck with everything being Pydantic this Pydantic
| that, instead of existing frameworks which are frankly better.
| WesolyKubeczek wrote:
| It's also because Pydantic has VC money and needs to grow
| fast now, or else.
| murkt wrote:
| What's the story here, can anyone enlighten me? How can
| they make money being a Python library?
|
| I can stretch my imagination about Astral monetizing their
| tools, but this one is too difficult
| stephantul wrote:
| Pydantic (the company) owns logfire, a logging service.
| There's a lot of money in logging/observability. The
| pydantic library itself is not monetizable, as you
| indicate.
| zo1 wrote:
| Wow really I had no idea. This rabbit hole goes deeper than
| I expected!
|
| In 2022, the project evolved into a commercial entity
| called Pydantic Services Inc., founded by Samuel Colvin and
| Adrian Garcia Badaracco, to build products around the open-
| source library. The company raised $4.7 million in seed
| funding in February 2023, led by Sequoia Capital, with
| participation from Partech, Irregular Expressions, and
| other investors. This was followed by a $12.5 million
| Series A round in October 2024, again led by Sequoia
| Capital and including Partech Partners, bringing the total
| funding to approximately $17.2 million across rounds. The
| Series A funding coincided with the launch of Pydantic
| Logfire, a commercial observability platform for backend
| applications, aimed at expanding beyond the core open-
| source validation framework. As of mid-2025, no additional
| funding rounds have been publicly reported.
|
| https://techcrunch.com/2023/02/16/sequoia-backs-open-
| source-...
| rtpg wrote:
| To be fair I do think that Pydantic leaning into the type
| annotation story is nice. If you're really going lean or
| performant the restrictions work well in your favor. Just
| like... for the bog standard B2B SaaS the expressivity
| tradeoff just doesn't feel worth it.
|
| In a more just world Python's typing story would be closer to
| TypeScript's and we could have a fully realized idea like it
| that supports the asymmetric nature of
| serializing/deserializing and offers nice abstractions
| through the stack
|
| Right now Pydantic for me is like "you can validate a
| straightforward data structure! Now it's up to you to
| actually build up a useful data structure from the
| straightforward one". Other tools give me both in one go. At
| the cost of safety (that you can contain, but you gotta do it
| right)
| cout wrote:
| Which existing framework is better?
| lispisok wrote:
| What alternatives do you suggest then?
| zo1 wrote:
| It's a tough answer because we have had years of
| artificially-pumped support and development and ecosystem
| growth of Pydantic.
|
| But if I had to roll the clock back I'd recommend
| marshmallow and that entire ecosystem. It's definitely way
| less bloated than Pydantic currently, and only lacks some
| features. Beyond that, just use plain-old dataclasses.
|
| https://marshmallow.readthedocs.io/en/latest/
| pdhborges wrote:
| Are you referring to DRF model serializers? For medium to big
| applications I think they are worthless.
| rtpg wrote:
| Shrug, I find them more helpful than Pydantic models for lots
| of canonical cases.
|
| I have had good success with DRF model serializers in like
| Django projects with 100+ apps (was the sprawling nature of
| the apps itself a problem? Sure, maybe). Got the job done
|
| As with anything you gotta build your own wrappers around
| these things to get value in larger projects though
| JackSlateur wrote:
| Could you elaborate a bit more, perhaps with examples ?
| lysecret wrote:
| The core thesis is that your types received by the api should not
| be the same as the types you process internally. I can see a
| situation where this makes sense and a situation where this
| senselessly duplicates everything. The blog post shows how to do
| it but never really dives into why/when.
| r9295 wrote:
| Personally, I think that's a good idea. Design patterns
| naturally make sense (Visitor, Builder, for example) once you
| encounter such a situation in your codebase. It almost makes
| complete sense then. Otherwise IMHO, it's just premature
| abstraction
| roland35 wrote:
| No one is satisfied with premature abstraction :(
| tetha wrote:
| It does touch on what I was thinking as well at the end of the
| first section: Usually this makes sense if your application has
| to manage a lot of complexity, or rather, has to consume and
| produce the same domain objects in many different ways across
| many different APIs.
|
| For example, some systems interact with several different
| vendor, tracking and payment systems that are all kinda the
| same, but also kinda different. Here it makes sense to have an
| internal domain model and to normalize all of these other
| systems into your domain model at a very early level. Otherwise
| complexity rises very, very quickly due to the number of n
| things interacting with n other things.
|
| On the other hand, for a lot of our smaller and simpler systems
| that output JSON based off a database for other systems... it's
| a realistic question if maintaining the domain model and API
| translation for every endpoint in every change is actually less
| work than ripping out the API modelling framework, which occurs
| once every few years, if at all? Some teams would probably
| rewrite from scratch with new knowledge, especially if they
| have API-tests available.
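The normalization tetha describes (several similar-but-different vendor systems mapped into one internal model as early as possible) can be sketched with the standard library alone. Both vendor payload layouts below are invented for illustration, as are the names `Payment`, `from_vendor_a` and `from_vendor_b`.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Payment:
    """Internal domain model shared by the rest of the system (hypothetical)."""
    reference: str
    amount: Decimal
    currency: str


def from_vendor_a(payload: dict) -> Payment:
    # Hypothetical vendor A reports minor units (cents) as an integer.
    return Payment(
        reference=payload["id"],
        amount=Decimal(payload["amount_cents"]) / 100,
        currency=payload["currency"].upper(),
    )


def from_vendor_b(payload: dict) -> Payment:
    # Hypothetical vendor B reports a decimal string inside a nested object.
    txn = payload["txn"]
    return Payment(
        reference=txn["ref"],
        amount=Decimal(txn["total"]),
        currency=txn["cur"].upper(),
    )


a = from_vendor_a({"id": "p-1", "amount_cents": 1250, "currency": "eur"})
b = from_vendor_b({"txn": {"ref": "p-1", "total": "12.50", "cur": "EUR"}})
print(a == b)
```

With n vendors this needs n adapters into `Payment`, rather than code paths that handle every vendor shape wherever a payment is used.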
| AlphaSite wrote:
| I'd say where it's more important is when you need to manage
| database performance. This lets you design an api that's
| pleasant for users, well normalised internally, while also
| performing well.
|
| Usually normalisation and performance lead to a poor api
| that's hard for users to use and hard to evolve since
| you're so tightly coupled to your external representation.
| jon-wood wrote:
| I've not done this in Python, where mercifully I don't really
| touch CRUD style web apps anymore, but when I was doing Ruby
| web development we settled on similar patterns.
|
| The biggest benefit you get is being able to have much more
| flexibility around validation when the input model (Pydantic
| here) isn't the same as the database model. The canonical
| example here would be something like a user, where the
| validation rules vary depending on context, you might be
| creating a new stub user at signup when only a username and
| password are required, but you also want a password
| confirmation. At a different point you're updating the user's
| profile, and in that case you have a bunch of fields that might be
| required now but password isn't one of them and the username
| can't be changed.
|
| By having distinct input models you make that all much easier
| to reason about than having a single model which represents the
| database record, but also the input form, and has a bunch of
| flags on it to indicate which context you're talking about.
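A minimal sketch of jon-wood's user example, assuming Pydantic v2. The model names and fields are hypothetical: one input model per context, each with its own rules, instead of a single model with flags.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError, model_validator


class SignupInput(BaseModel):
    """Signup (hypothetical): only a username and a confirmed password."""
    username: str
    password: str
    password_confirmation: str

    @model_validator(mode="after")
    def passwords_match(self) -> "SignupInput":
        if self.password != self.password_confirmation:
            raise ValueError("passwords do not match")
        return self


class ProfileUpdateInput(BaseModel):
    """Profile update (hypothetical): no password, and no way to send a username."""
    model_config = ConfigDict(extra="forbid")  # unknown fields are rejected

    display_name: str
    bio: Optional[str] = None


signup = SignupInput(username="ada", password="s3cret", password_confirmation="s3cret")
update = ProfileUpdateInput(display_name="Ada L.")

try:
    SignupInput(username="ada", password="s3cret", password_confirmation="typo")
    mismatch_rejected = False
except ValidationError:
    mismatch_rejected = True

try:
    ProfileUpdateInput(display_name="Ada L.", username="grace")
    rename_rejected = False
except ValidationError:
    rename_rejected = True

print(signup.username, update.display_name, mismatch_rejected, rename_rejected)
```

Neither model says anything about how a user is stored; mapping them onto the database record is a separate step.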
| nvader wrote:
| I'm with you. But what wasn't sufficiently justified in the
| article is why both sides of that divide, canonical User and
| User stubs, could not be pydantic models.
| nine_k wrote:
| The idea, as far as I was able to understand it, is that
| you want your core models as dependency-free as possible.
| If you, for whatever reason, were to drop Pydantic, that
| would only affect the way you validate inputs from API, and
| nothing deeper.
|
| This wasn't mentioned, but the constant validation on
| construction also costs something. Sometimes it's a cost
| you're willing to pay (again, dealing with external
| inputs), sometimes it's extraneous because e.g. a
| typechecker would suffice to catch discrepancies at build
| time.
| Groxx wrote:
| I've also generally found that separating the types passively
| reminds people that they are not _forced_ to keep those types
| the same.
|
| Whenever I've been in codebases with externally-controlled
| types as their internal types, almost every single design
| that goes into the project is based around those types and
| whatever they efficiently model. It leads to much worse API
| design, both externally and internally, because it's based on
| what they _have_ rather than _what they want_.
| mattmanser wrote:
| It's a pattern that rapidly leads to tons of DTOs that
| endlessly repeat exactly the same properties.
|
| Your example doesn't even justify its use, in that scenario
| the small form is actually a completely different object from
| the User object, a UserSignup. That's both conceptually
| different and practically different to an actual User.
|
| The worst pattern is when programmers combine these useless
| DTOs with some sort of auto mapper, which results in huge
| globs of boilerplate making any trivial changes to data
| definitions a multi file job.
|
| The worst one I've seen was when to add one property I had to
| edit 40 files.
|
| I get why people do it, but if you make it a pattern it's a
| massive drag to development velocity. It's anti-patterns like
| that which give statically typed languages a bad name.
|
| You should really only use it when you really, really need
| to.
| NeutralCrane wrote:
| > The core thesis is that your types received by the api should
| not be the same as the types you process internally.
|
| Is it? I read the blog a couple of times and never was able to
| divine any kind of thesis beyond the title, but as you said,
| the content never actually explains why.
|
| Perhaps there is a reason, but I didn't walk away from the post
| with it.
| causal wrote:
| Yeah title implies a why, but this is really just about how
| lyu07282 wrote:
| Its confusing to ask that, because that's a different subject
| unrelated to pydantic or python. That's just what you are
| supposed to do in "clean architecture"/ddd, you can ask the
| same question in java or whatever.
| skissane wrote:
| I used to work on a Java app where we did this... we had a
| layer of POJO value classes, a layer of ORM objects... both
| written by hand... plus for every entity a hand-written mapper
| which translated between the two... and then sometimes we even
| had a third layer of classes generated from Swagger specs, and
| yet another set of mappers to map between the Swagger classes
| and the value POJOs
|
| Now I mainly do Python and I don't see that kind of boilerplate
| duplication anywhere near as much as I used to. Not going to
| say the same kind of thing never happens in Python, but the
| frequency of it sure seems to have declined a lot; often you get
| a smattering of it in a big Python project rather than it
| having been done absolutely everywhere
| CharlieDigital wrote:
| I think this depends in principle on what you're building.
| Take an API, for example.
|
| The thesis is simple:
|
| 1) A DTO is a projection or a view of a given entity.
|
| 2) The "domain entity" itself is a projection of the actual
| storage in a database table.
|
| 3) At different layers (vertical separation), the representation
| of this conceptual entity changes.
|
| 4) In different entry/exit points (horizontal separation), the
| projection of the entity may also change.
|
| In some cases, the domain entity can be used in different
| modules/routes and are _projected_ to the API with different
| shapes -- less properties, more properties, transformed
| properties, etc.
|
| Typically, when code has a very well-defined domain layer and
| separation of the DTO and storage representation, the code
| has a very predictable quality because if you are working
| with a `User` domain entity, it behaves consistently across
| all of your code and in different modules. Sometimes, a
| developer intermixes a database `User` or a DTO `User` and
| all of a sudden, the code behaves unpredictably; you suddenly
| have to be cognizant if the `user` instance you're handling
| is a `DBUser`, a `UserDTO`, or the domain entity. It has
| extra properties, missing properties, missing functions,
| can't be passed into some methods, etc.
|
| Does this matter? I think it depends on 1) the size of the
| team, 2) how much re-use of the modules is needed, 3) the
| nature of the service. For a small team, it's overkill. For a
| module that will be reused by many teams, it has long term
| dividends. For a one-off, lightweight service, it probably
| doesn't matter. But for sure, for some core behaviors, having
| a delineated domain model really makes life easy when working
| with multiple teams reusing a module.
|
| I find that the code I've worked with over the years that I
| like has this quality. So if I'm responsible for writing some
| very core service or shared module, I will take the extra
| effort to separate my models -- even if there's more
| duplication required on my behalf because it makes the code
| more predictable to use if everything inside of the service
| expects to have only one specific shape and set of behaviors
| and project shapes outwards as needed for the use case (DTO
| and storage).
| BiteCode_dev wrote:
| Because they don't represent the same thing. Pydantic models
| represent your input, it's the result of the experience you
| expose to the outside world, and therefore comes with
| objectives and constraints matching this:
|
| - make it easy to provide
|
| - make it simple to understand
|
| - make it familiar
|
| - deal with security and authentication
|
| - be easily serializable through your communication layer
|
| On the other hand, internal representations have the goal to
| help you with your private calculations:
|
| - make it performant
|
| - make it work with different subsystems such as persistence,
| caching, queuing
|
| - provide convenience shortcuts or precalculations for your own
| benefits
|
| Sometimes they overlap, or the system is not big enough that it
| matters.
|
| But the more you get to big or old system, the less likely they
| will.
|
| However, I often pass around pydantic objects if I have them,
| and I do this until it becomes a problem. And I rarely reach
| that point.
|
| It's like using Python until you have performance problems.
|
| Practicality beats premature optimization.
| JackSlateur wrote:
| My pydantic models represent a "Thing" (a concept or
| whatever), not an input
|
| You can translate many things into a Thing, model_validate
| will help you with that (with contextinfo etc)
|
| You can translate your Thing into multiple output formats,
| with model_dump
|
| In your model, you shall put every check required to ensure
| that some input is, indeed, a Thing
|
| And from there, you can use this object everywhere, certain
| that this is, indeed, a Thing, and that it has all the
| properties that make a thing a Thing
| BiteCode_dev wrote:
| You can certainly do it, but since serialization and
| validation are the main benefit from using Pydantic, I/O
| are why it exists.
|
| Outside of I/O, the whole machinery has little use. And
| since pydantic models are used by introspection to build
| APIs, automatic deserializer and arg parsing, making it fit
| the I/O is where the money is.
|
| Also, remember that despite all the improved perf of
| pydantic recently, they are still more expensive than
| dataclass, themselves more than classes. They are 8 times
| more expensive to instantiate than regular classes, but
| above all, attribute access is 50% slower.
|
| Now I get that in Python this is not a primary concern, but
| still, pydantic is not a free lunch.
|
| I'd say it's also important to state what it conveys. When
| I see a Pydantic object, I expect some I/O somewhere.
| Breaking this expectation would take me by surprise and
| lower my trust of the rest of the code. Unless you are deep
| in defensive programming, there is no reason to validate
| input far from the boundaries of the program.
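The instantiation and attribute-access costs mentioned above depend on the Python and Pydantic versions, so they are worth measuring locally rather than taking as fixed ratios. A rough sketch, assuming Pydantic v2; the three `*Point` classes and the `seconds_per_call` helper are hypothetical, and every measurement includes the overhead of calling a lambda, so only the relative ordering is meaningful.

```python
import timeit
from dataclasses import dataclass

from pydantic import BaseModel


class PlainPoint:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


@dataclass
class DataPoint:
    x: int
    y: int


class ModelPoint(BaseModel):
    x: int
    y: int


def seconds_per_call(fn, number: int = 20_000) -> float:
    # Best of three runs, to reduce noise from other processes.
    return min(timeit.repeat(fn, number=number, repeat=3)) / number


creation = {
    cls.__name__: seconds_per_call(lambda cls=cls: cls(x=1, y=2))
    for cls in (PlainPoint, DataPoint, ModelPoint)
}
instances = [PlainPoint(1, 2), DataPoint(1, 2), ModelPoint(x=1, y=2)]
attribute_access = {
    type(obj).__name__: seconds_per_call(lambda obj=obj: obj.x)
    for obj in instances
}

for name in creation:
    print(
        f"{name:10s} create {creation[name] * 1e9:8.1f} ns"
        f"   read {attribute_access[name] * 1e9:8.1f} ns"
    )
```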
| JackSlateur wrote:
| This is true, there is a performance cost
|
| Apart from what has been said, I find pydantic
| interesting even in the middle of my code: it can be seen
| as an overpowered assert
|
| It helps make sure that the complex data structure
| returned by that method is valid (for instance)
| senkora wrote:
| You should do it if and only if backwards compatibility is more
| important for your project than development velocity.
|
| If you have two layers of types, then it becomes much easier to
| ensure that the interface is stable over time. But the downside
| is that it will take longer to write and maintain the code.
| nyrikki wrote:
| PO?O is just an object not bound by any restriction other than
| those forced by the Language.[0]
|
| From the typing lens, it may be useful to consider it from
| Rice's theorem, and an oversimplification that typing is
| converting a semantic property to a trivial property. (Damas-
| Hindley-Milner inference usually takes advantage of a
| pathological case, it is not formally trivial)
|
| There are no hard and fast rules IMHO, because Rice, Rice-Shapiro,
| and Kreisel-Lacombe-Shoenfield-Tseitin theorems are related to
| generalized solutions as most undecidable problems.
|
| But Kreisel-Lacombe-Shoenfield-Tseitin deals with programs that
| are expected to HALT, yet it is still undecidable if one fixed
| program is equivalent to a fixed other program that always
| terminates.
|
| When you start stacking framework, domain, and language
| restrictions, the restrictions form a type of coupling, but as
| the decisions about integration vs disintegration are always
| tradeoffs it will always be context specific.
|
| Combinators (maybe not the Y combinator) and finding normal
| forms is probably a better lens than my attempt at the flawed
| version above.
|
| If you consider using po?is as the adapter part of the hex
| pattern, and notice how a service mesh is less impressive but
| often more clear in the hex form, it may help build intuitions
| where the appropriate application of the author's suggestions
| may fit.
|
| But it really is primarily decoupling of restrictions IMHO.
| Sometimes the tradeoffs go the other way and often they change
| over time.
|
| [0] https://www.martinfowler.com/bliki/POJO.html
| vjerancrnjak wrote:
| Just have 1 input type and 1 output type. You don't need more
| data types in between.
|
| If pydantic packages valid input, use that for as long as you
| can.
|
| Loading stuff from db, you need validation again, either go from
| binary response to 1 validated type with pydantic, or ORM object
| that already validates.
|
| Then stop having any extra data types.
|
| Keeping pydantic only at the edge and then abandoning it by
| reshaping it into another data type is a weird exercise. It might
| make sense if you have N input types and 1 computation flow but I
| don't see how in the world of duck typing you'd need an extra
| unified data type for that.
| sgarland wrote:
| > Loading stuff from db, you need validation again, either go
| from binary response to 1 validated type with pydantic, or ORM
| object that already validates.
|
| You _shouldn't_ need to validate data coming from the database.
| IMO, this is a natural consequence of teams abandoning
| traditional RDBMS best practices like normalization and
| constraints in favor of heavy denormalization, and strings for
| everything.
|
| If you strictly follow 3NF (or higher, when necessary), it is
| literally impossible to have referential integrity violations.
| There may be some other edge cases that can be difficult to
| enforce, but a huge variety of data bugs simply don't exist if
| you don't treat the RDBMS as a dumb KV store.
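A small sketch of the kind of constraints sgarland mentions, using Python's built-in `sqlite3` module. The `customers` and `orders` tables are hypothetical; the point is that the database itself refuses rows that break a foreign key or a `CHECK`, so application code reading those tables does not need to re-validate them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(
    """
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(id),
        amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
    );
    """
)
conn.execute("INSERT INTO customers (id, email) VALUES (1, 'ada@example.com')")
conn.execute("INSERT INTO orders (customer_id, amount_cents) VALUES (1, 500)")

rejected = []
bad_rows = [(99, 500), (1, -5)]  # unknown customer; non-positive amount
for customer_id, amount in bad_rows:
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)",
            (customer_id, amount),
        )
    except sqlite3.IntegrityError as exc:
        rejected.append(str(exc))

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rejected, order_count)
```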
| NeutralForest wrote:
| What's the motivation for doing this? When does Pydantic in the
| domain model start being an issue?
| halfcat wrote:
| When the structure of your team makes it a problem. Conway's
| law.
|
| If you have one person maintaining a CRUD app, splitting out
| DTOs and APIs and all of these abstractions are completely not
| needed. Usually, you don't even know yet what the right
| abstraction is, and making a premature wrong abstraction is WAY
| worse. Building stuff because you might need it later is a
| massive momentum killer.
|
| But at some point when the project has grown (if it grows,
| which it won't if you spend all your time making wrong
| abstractions early on), the API team doesn't want their stuff
| broken because someone changed a pydantic model. So you start
| to need separation, not because it's great or because it's "the
| right way" but because it will collapse if you don't. It's the
| least bad option.
| NeutralForest wrote:
| I'm not sure I agree, you can still use Pydantic in the
| domain model and update the version of the API when you
| change the expected schemas of your CRUD application.
|
| Where I'm with you, is that you should take care of your
| boundaries and muddling the line between your Pydantic domain
| models and your CRUD models will be painful at some point. If
| your domain model is changing fast compared to the API you're
| exposing, that could be an issue.
|
| But that's not a "Pydantic in the domain layer" issue, that's
| a separation of concerns issue.
| chausen wrote:
| Often you want your domain models to be structured
| differently than API models, to make them as
| convenient/understandable to work with as possible for your
| use case. If you already have different models, why would
| you want Pydantic in the domain? Even if they start out the
| same, this would allow them to more easily evolve to be
| different. I'm not a python expert, so I could be missing
| the point on Pydantic, but it seems like its value is at
| the edges of your application.
| NeutralForest wrote:
| That's all fair, I just think it has more to do with
| separation of concerns than Pydantic and that the OP
| doesn't make it clear at all.
| politelemon wrote:
| The reasoning given here is more academic than anything else. I'm
| not seeing any actual problem here though. Perhaps this could
| show how this is bad. Until then, I don't think this excessive
| duplication and layering is necessary, and is more of a liability
| itself.
|
| > That's when concerns like loose coupling and separation of
| responsibilities start to matter more.
| gostsamo wrote:
| I'm sure that the pydantic guys had a reason to rename .dict to
| .model_dump. This single change caused so much grief when
| upgrading to pydantic2. [1] The very idea of unnecessary breaking
| changes is a big reason not to over-rely on pydantic, tbh.
|
| [1] we were using .dict to introduce pydantic in the mix of other
| entity schemes and handling this change later was a significant
| pain in the neck. Some python introspection mechanism that can
| facilitate deep object recasting might've been nice if possible.
| jmogly wrote:
| Haha, ChatGPT recommends this:
|
|     from pydantic import BaseModel
|
|     class MyModel(BaseModel):
|         name: str
|
|         # Compatibility shim: keep the Pydantic v1 name working on v2.
|         def dict(self, *args, **kwargs):
|             return self.model_dump(*args, **kwargs)
| gostsamo wrote:
| Yep, and when you are done migrating, you need to remove
| this, and there is pydantic3 coming. Keeping in mind the
| number of libraries and microservices involved, search and
| replace was the easier option.
|
| PS: thank you, I can think on my own and even failing that,
| chat gpt is not in closed beta any more.
| the__alchemist wrote:
| Representing structured data as key/value pairs is a pattern
| I've only seen in Python, and don't understand why it became
| popular and canonical.
| halfcat wrote:
| > Representing structured data as key/value pairs is a
| pattern I've only seen in Python
|
| Come on. We know you've seen JavaScript.
| brap wrote:
| I'm far from being an experienced Pythonista, but one thing that
| really bugs me in Python (and other dynamic languages) is that
| when I accept an input of some type, like User, I have to wonder
| if it's really a User. This is annoying throughout the codebase,
| not just the API layer. Especially when there are multiple
| contributors.
|
| The argument against using API models internally is something I
| agree with but it's a separate question.
| jon-wood wrote:
| I'm curious, what do you mean by having to wonder if it's
| really a User? It's optional in Python but you can use type
| annotations and then the type checker will shout at you for
| passing something that's not a User instance to things that
| expect one.
| brap wrote:
| The type checker can be ignored in all sorts of ways
| padjo wrote:
| Python has reasonably good types these days. If you were to use
| pydantic to marshal stuff from the API and then put type
| annotations on every method below that it would be pretty
| bulletproof.
| derriz wrote:
| I've been using Python on and off for a few decades and agree.
| I don't know why you're being downvoted.
|
| I've authored tens of thousands of lines of Python code in that
| time - both for research tools and for "production".
|
| I use type hints everywhere in the Python I write but it's
| simply not enough.
|
| This issue is political and not so much technical as Typescript
| demonstrates how you can add a beautifully orthogonal and
| comprehensive type system to a dynamic language, thus improving
| the language's ergonomics and scalability.
|
| The political aspect is the fact that early Python promoters
| decided that sanity checking arguments was not "pythonic" and
| this dogma/ideology has persisted to this day. The only
| philosophical basis for this position was that Python
| offered no support for simple type checking. And apparently if
| you didn't/don't "appreciate" this philosophy, it reflected
| poorly on your software engineering abilities or skill with
| Python.
|
| To be fair, Python isn't the only language of that era, where
| promoters went to great lengths to invent alternate-reality
| bubbles to avoid facing the fact that their pet language had
| some deep flaws - and actually Perl and C++ circles were even
| worse and more inward facing.
|
| So the "pythonic" approach suggests having functions just
| accepting anything, whether it makes sense or not, and allowing
| your code to blow up somewhere deep in some library somewhere -
| that you probably didn't even know you're using.
|
| So instead of an error like "illegal create_user(name: str)
| call: name should be a str but was a float", it's apparently
| better (more "pythonic") to not provide such feed-back to users
| of your functions and instead allow them to have to deal with
| an exception in a 40 line stack trace with something like
| "illegal indexing of float by dict object" in some source file
| library your users haven't even heard of.
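|
| A minimal sketch of the fail-fast check described above, assuming
| a hypothetical create_user function (the name and the error text
| come from the example message; the body is only illustrative):

```python
def create_user(name: str) -> dict:
    # Check the argument at the boundary so the caller gets a clear
    # error instead of a stack trace from deep inside some library.
    if not isinstance(name, str):
        raise TypeError(
            "illegal create_user(name: str) call: name should be a str "
            f"but was a {type(name).__name__}"
        )
    # Hypothetical "real work": return a plain record for the new user.
    return {"name": name.strip()}
```

| Calling create_user(1.5) then fails immediately with a message
| naming the offending argument and its actual type.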
| dontlaugh wrote:
| I've also used Python for well over a decade and nowadays I
| mostly don't. But for that reason and because of the terrible
| performance / efficiency.
| zo1 wrote:
| The problem with that small tidbit is that it immediately
| sets your type system to go down the path of Java and
| Typescript (which we all mock for its crazy type systems and
| examples such as IImplementsFactoryAbstractMethodThingVirtual
| classes). This is not the python way, and is frankly part of
| its secret sauce (if you ask me).
|
| And yes I include Typescript with Java there because it has
| its own version of the Java class ecosystem hell, we just
| don't notice it yet. Look at any typescript library that's
| reasonably complicated and try to deduce what some of those
| input types actually do or mean - be honest. Heck a few weeks
| back someone posted how they solved a complicated
| combinatorial problem using Typescript's type system alone.
| derriz wrote:
| I don't get your point or what it has to do with the
| "pythonic" suggestion that you don't check early for
| incorrect state/arguments?
|
| Any language, including Python, which supports the concept
| of a class will allow you to write a class called
| IImplementsFactoryAbstractMethodThingVirtual. And none of
| Java, C++, Python, Typescript, CLISP, etc. prevent you from
| building or designing an overly complex class model.
|
| It has nothing to do with ensuring a particular argument to
| a function or method is of the expected type, which was my
| point - the "pythonic" way is to NOT check.
|
| I also do not understand your example of Typescript.
| Compared to the last time I worked on a Javascript code-
| base, recently having to work with a Typescript code-base
| was a joy including reading library code. Stripping out the
| types gives you Javascript - surely you are not claiming
| that it makes it easier to read libraries with the type
| signatures removed?
|
| Whether a library is complex or not is completely
| orthogonal. Doing it regularly, navigating the source of an
| overly-complex Python library is no fun either.
| akkad33 wrote:
| > This issue is political and not so much technical as
| Typescript demonstrates how you can add a beautifully
| orthogonal and comprehensive type system to a dynamic
| language, thus improving the language's ergonomics and
| scalability.
|
| How does typescript demonstrate this?
|
| I don't see how typescript is different from Python in this
| regard. Typescript compiles down to JavaScript, which like
| Python is dynamic. So at runtime nothing prevents you from
| calling a function written to take ints with strings. In
| fact, JavaScript has even worse typing than Python, so I
| imagine it's worse.
| derriz wrote:
| Typescript demonstrates that you can have a fully dynamic
| language but also provide a type system which can support
| as much (or as little) type checking as is appropriate or
| desired.
|
| I can take my chances in Typescript by just using 'any'
| everywhere but if I do want to constrain variables to
| particular types, the compiler will fully support me and
| provide guarantees about the restrictions I've specified
| via the type signatures.
| 3eb7988a1663 wrote:
| Do you mean is the User object a well-formed User, or did
| someone actually give you an int?
|
| As to the first problem, I recommend the Parse don't validate
| post[0]. The essential idea is stop using god objects that do
| it all, but use specific types to make contracts on what is
| known. Separate out concerns so there is an UnvalidatedUser
| (not serialized and lacking a primary key) and a ValidatedUser
| (committed to the database, has unique username, etc). Basic
| type hinting should get you the rest of the way to cleaning up
| code paths where you get some type certainty.
|
| [0] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-
| va...
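|
| A rough sketch of that split using stdlib dataclasses. The field
| names, the parse_user helper and the in-memory "taken" set are
| assumptions made for illustration, not something from the post:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnvalidatedUser:
    # Raw input: nothing is known about it yet, and it has no primary key.
    username: str


@dataclass(frozen=True)
class ValidatedUser:
    # Only produced by parse_user, so holding one means the checks passed.
    user_id: int
    username: str


def parse_user(raw: UnvalidatedUser, taken: set[str], next_id: int) -> ValidatedUser:
    """Turn raw input into a ValidatedUser, or raise ValueError."""
    username = raw.username.strip().lower()
    if not username:
        raise ValueError("username must not be empty")
    if username in taken:
        raise ValueError(f"username {username!r} is already taken")
    return ValidatedUser(user_id=next_id, username=username)


def greet(user: ValidatedUser) -> str:
    # Downstream code asks for ValidatedUser and does not re-check anything.
    return f"hello, {user.username}"
```

| A type checker will then flag any call that passes an
| UnvalidatedUser to greet, which is the "type certainty" part.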
| ac130kz wrote:
| Somewhat solved by type annotations + a good static type
| checker, such as pyright (it's 2025, there must be type
| annotations everywhere), and dynamic cases (very rare, probably
| due to poor or unfortunate design decisions) can be solved with
| validators, e.g. the aforementioned Pydantic. This isn't a
| silver bullet, but it works really well.
| dgan wrote:
| i have to confess , i use Protobuffs for everything. They convert
| to pure python (a la dataclass), to json strings and to binary
| strings, so i literally shove it everywhere : network, logic,
| disk.
|
| BUT when doing heavy computation (c++, not python !) don't forget
| to convert to plain vectors, Protobuffs are horribly inefficient
| the__alchemist wrote:
| Protobuf is fine if:
|
| A: You control both ends of the serialized line, or
| B: The other end of the line expects protobufs.
|
| There are many [de]serialization scenarios where you are
| interfacing with a third party API. (HTTP/JSON web API, a given
| IC's comm protocol as defined in its datasheet etc)
| dontlaugh wrote:
| You can still use a protobuf schema to parse/generate JSON,
| in most cases.
| dgan wrote:
| i think even if 3rd party API expects json, you could still
| map their models to proto ; i haven't encountered this case
| tho
|
| might still be challenging to convince proto to output what
| you want exactly
| the__alchemist wrote:
| I don't understand then. Here is my mental model; as
| described, you can see why I'm confused:
|
| JSON: UTF-8 Serialization format, where brackets, commas,
| fields represented by strings etc.
|
| Protobuf: Binary serialization format that makes liberal
| use of varints, including to define field number, lengths
| etc. Kind of verbose, but not heinous.
|
| So, you could start and end your journey with the same
| structs and serialize with either. If you try to send a
| protobuf to an HTTP API that expects JSON, it won't work!
| If you try to send JSON to an ESP32 running ESP-Hosted,
| likewise.
| dgan wrote:
| ah I think I understand your confusion. The proto package
| allows conversion between the binary messages and their
| json equivalent. So you can still use the proto objects
| in your code , only to send out json when required
| leoff wrote:
| >The less your core logic depends on specific tools or libraries,
| the easier it becomes to maintain, test, or even replace parts of
| your system without causing everything to break.
|
| It seems like the author doesn't like depending on `pydantic`,
| simply because it's a third party dependency. To solve this they
| introduce another, but more obscure, third party dependency
| called `dacite`, that converts `pydantic` to `dataclasses`.
|
| It's more likely that `dacite` is going to break your
| application, than `pydantic`, a library used by millions of users
| in huge projects, ever will. Not to mention the complexity
| overhead introduced by this nonsense mapping.
| wiseowise wrote:
| > simply because it's a third party dependency
|
| Not simply. This is one of the most important reasons NOT
| to propagate something through your code. How many millions
| codebases use it is irrelevant.
| leoff wrote:
| >How many millions codebases use it is irrelevant.
|
| It is relevant, because it speaks to the reliability of the
| dependency. `pydantic` has 24.7k Github stars and was last
| updated 52 minutes ago.
|
| Adding a random dependency `dacite`, which has 1.9k Github
| stars, no one has ever heard of, and was last updated 4
| months ago, introduces way more complexity and sources of
| instabilities than propagating `pydantic`.
| murkt wrote:
| More updates means more changes and more instability. I
| have never seen dacite, but it's pretty easy for a small
| library to just be complete. If it's complete, why the need
| for constant changes?
| Lucasoato wrote:
| Actually Pydantic could be extremely useful if used in
| conjunction with SQLAlchemy, check out the SQLModel library, from
| the very same creators of Pydantic.
| jessekv wrote:
| Sebastian Ramirez created FastAPI and SQLModel, and was an
| early adopter of Pydantic. Samuel Colvin created Pydantic.
| cout wrote:
| Having used sqlmodel recently for a project, I was
| underimpressed. Documentation was sparse, I found myself going
| to the source code to figure out how to solve problems I ran
| into, and I ended up dropping into sqlalchemy a lot more than I
| wanted. I think the idea is sound, but the code is hard to
| follow, and there are a lot of missing common cases.
| JackSlateur wrote:
| sqlmodel is a wrapper around sqlalchemy, made by the guy who
| made fastapi
|
| While it uses pydantic, sqlmodel has not been written by those
| guys
| stephantul wrote:
| I think this article misses the main point by focusing on
| removing pydantic. The main point is that you should convert
| external types as soon as possible to decouple them from the rest
| of your code. Whether this involves pydantic or something else is
| not really important I guess
| nisten wrote:
| From the article:
|
| "Why are there no laws requiring device manufacturers to open
| source all software and hardware for consumer devices no longer
| sold?"
|
| I think it's because people (us here included) love to yap and
| argue about problems instead of just implementing them and
| iterating on solutions in an organized manner. A good way these
| days to go about it would be to forego the facade of civility and
| use your public name to publicly tell your politician to just
| fuck it, do it bad, and have a plan to UNfuck after you fuck it
| up, until the fucking problem is fucking solved.
|
| Same goes for UBI and other semi-infuriating issues that seem to
| (and probably do) have obvious solutions that we just don't try.
| barbazoo wrote:
| > But Pydantic is starting to creep into every layer, even your
| domain, and it starts to itch.
|
| I can't relate yet. Itch how? It doesn't really go into what the
| problem is they're solving.
| ansc wrote:
| It expands further down in the article:
|
| >Pydantic is great, just not everywhere. [...] Not because it's
| bad, but because your domain should be pure and independent.
|
| It itches because it should be pure and indepen- yeah I don't
| know. I haven't had this itch either to be frank.
| karolinepauls wrote:
| I'll go further and elsewhere at once: APIs should not present
| nested objects but normalised data. It enables clients to easily
| lay out their display structure independently of API resource
| schemas and eases out tricks like diffing between subsequent
| responses, pulling updates or requesting new data by passing IDs
| and timestamps of already known data, etc. API normalised data
| obviously shouldn't correspond to DB normalised data. Nested
| objects are superior only for use with jq.
| ejflick wrote:
| > APIs should not present nested objects but normalised data
|
| If something is nested, let it be represented as a nested
| structure. I find flattening causes more mental overhead. If
| something is too flat, it becomes less obvious what data is
| exactly necessary to do what you want to do
| mindcrash wrote:
| And that's why it is key in your architecture to differentiate
| between Data Transfer Objects (DTOs) or Models on one hand, which
| have values that can and actually _must_ be validated when they
| come from the outside, and Domain Entities / Value Objects on
| the other. Even though the DTO and Domain Entity might look
| _similar_.
|
| Thank me later.
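|
| A small sketch of that DTO / Domain Entity split, using plain
| dataclasses as a stand-in for whatever validation library sits at
| the boundary. UserDTO, User, from_payload and to_domain are all
| hypothetical names chosen for the example:

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class UserDTO:
    """Data as it arrives from outside; every value must be validated."""
    email: str
    age: int

    @classmethod
    def from_payload(cls, payload: dict[str, Any]) -> "UserDTO":
        email = payload.get("email")
        age = payload.get("age")
        if not isinstance(email, str) or "@" not in email:
            raise ValueError("email must be a string containing '@'")
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(age, int) or isinstance(age, bool) or age < 0:
            raise ValueError("age must be a non-negative integer")
        return cls(email=email, age=age)


@dataclass
class User:
    """Domain entity: similar fields, but it holds behaviour, not parsing."""
    email: str
    age: int

    def is_adult(self) -> bool:
        return self.age >= 18


def to_domain(dto: UserDTO) -> User:
    # The only place where the outside shape meets the domain shape.
    return User(email=dto.email, age=dto.age)
```

| The two classes look almost identical today, but either one can
| change without touching the other as long as to_domain is updated.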
| ripped_britches wrote:
| This person's head would explode if they saw what we're doing
| over here in typescript with structural typing. It would make
| things way too simple.
| axpy906 wrote:
| The trouble I have with Pydantic is that everything is immutable.
| There are use cases where I need mutability and it's not bad but
| a trade off.
| henning wrote:
| Oh boy, I love making adding a trivial nullable column take even
| more code and require even more tests and have even more places I
| forgot to update which results in a field being nullable
| somewhere.
|
| And don't forget, you get to duplicate this shit on the frontend
| too.
|
| And what is a modern app if we aren't doing event-driven
| microservice architecture? That won't _scale_!!!! So now I also
| have to worry about my _Avro schema_ /Protobufs/whateverthefuck.
| But how does everyone else know about the schema? Avro schema
| registry! Otherwise we won't know what data is on the wire!
|
| And so on and so on into infinity until I have to tell a PM that
| adding a column will take me 5 pull requests and 8 deploys
| amounting to several days of work.
|
| Congratulations on making your own small contribution to a
| fucking ridiculous clown fiesta.
| jmward01 wrote:
| Strongly decoupling API implementation and, well, actual
| implementation, is pretty key when you start to evolve an
| application. People often focus on 'the design' like there is one
| perfect design for an application for its lifetime when in reality
| it is about how easy the mass of code you have is able to change
| for the next feature/fix/change and not turn into a hairball of
| code. That perfect initial design where the internal and external
| objects are exactly the same generally works well for 1.0, but
| not 1.1 or 2.0 so strongly decoupling the API implementation is a
| good general practice if you think your code will continue to
| evolve.
| clickety_clack wrote:
| I use pyrsistent in the domain, and pydantic for tricky
| validation at the boundary. Pyrsistent is a pretty neat solution
| if you want immutable data structures, with some nice methods for
| working with nested records.
| golly_ned wrote:
| I still don't quite get the motivation for "don't use pydantic
| except at border" -- it sounds like it's "you don't need it",
| which might be true. But then adds dacite to translate between
| pydantic at the border and python objects internally. What
| exactly is wrong with pydantic internally too?
| chausen wrote:
| Could be wrong, never used Pydantic. But looking it up it seems
| like it's used for validation/typing of external data. Sounds
| like it's mainly going to be doing schema validations. So, your
| data arrives at your domain layer and you have guarantees based
| on Pydantic's validations. At this point, your validations are
| going to be semantic in nature based on your domain; what value is
| Pydantic bringing?
| ac130kz wrote:
| An easier/moderate approach: make a proper base DTO model, which
| can be extended by validators, such as Pydantic, and the db model
| (the Domain) is just whatever an ORM offers, or dataclasses.
| throwaway7783 wrote:
| Return of Java DTOs!
___________________________________________________________________
(page generated 2025-07-26 23:01 UTC)