[HN Gopher] Problems with JPA/Hibernate
       ___________________________________________________________________
        
       Problems with JPA/Hibernate
        
       Author : stemlaur
       Score  : 27 points
       Date   : 2021-04-11 18:58 UTC (4 hours ago)
        
 (HTM) web link (www.stemlaur.com)
 (TXT) w3m dump (www.stemlaur.com)
        
       | victor106 wrote:
       | Just because you're using Hibernate doesn't mean you have to use
       | it for everything
       | 
       | --Gavin King, creator of Hibernate
       | 
       | Source:-
       | 
       | https://twitter.com/markuswinand/status/456827165938434048?s...
        
       | doctor_eval wrote:
       | In my extensive experience managing teams using JPA ORMs
       | including Hibernate and EclipseLink, you not only get to learn
       | the unavoidable details of your target database's SQL, but also
       | the complex, non-obvious side effects - especially caching
       | interactions and performance edge cases - of the ORM as well. You
       | also get to learn two distinct but similar query languages (one
       | of which you can't use anywhere else but Java). You get to throw
       | away all the semantics built into the database structure,
       | including knowledge about indexes, and you need to duplicate
       | almost everything from SQL into Java, in the form of complex
       | annotations against entity objects that you often wouldn't need
       | with traditional SQL access patterns. And when something goes
       | wrong, you have a large, complex framework, including an
       | inscrutable caching layer, between your code and your data -
       | which can make debugging JPA code very challenging.
       | 
       | In return you get reduced performance, overly complex and
       | difficult to read SQL code generation, an inflexible entity
       | model, and non optimal database access patterns.
       | 
       | JPA is an extremely complex framework whose benefits, in my
       | opinion, outweigh the costs in only a small number of relatively
       | simple use cases. And even in those simple use cases, in my
       | experience it is far less performant (and double the LOC) than
       | using SQL directly.
       | 
       | I'm not saying you can't use JPA in large, complex applications,
       | because you obviously can. But those applications are harder to
       | write, much larger, less maintainable and less performant than
       | the equivalent applications written with simpler frameworks or
       | libraries.
       | 
       | I personally found MyBatis annotated interfaces to be the ideal
       | way to map SQL to Java, but perhaps ApacheDB is better for
       | smaller projects.
       | 
       | JPA is one of those fabulous engineering experiments that, sadly,
       | didn't work out. It breaks every rule in the engineering book in
       | order to make SQL work like Java, but it doesn't succeed, and I
       | could never recommend it to anyone.
        
       | Areading314 wrote:
       | What is the argument in favor of server-side Java in 2021? It
       | seems like alternatives like Go, Python, or even JS are far ahead
       | at this point
        
         | ssijak wrote:
         | What is the best argument against it? And the whole JVM
         | platform in general (Kotlin, Scala, Clojure...)
        
           | AzzieElbab wrote:
           | I can't speak for others, but Scala has two of the best SQL
           | DB libs I've ever used, namely doobie and quill.
           | https://github.com/getquill/quill
           | https://tpolecat.github.io/doobie/
        
         | rakoo wrote:
         | It works? I'll argue that 90% of problems solved by SaaS today
         | could be solved by any of the usual suspects of languages. What
         | matters more is the architecture you choose. Choice of language
         | is mostly a convenience choice, based on how comfortable you
         | will be editing software in that language (that includes not
         | only your proficiency, but also the availability of libraries
         | and frameworks to help you)
        
         | buster wrote:
         | A huge developer base and ecosystem? What, specifically, can't
         | be achieved in Java that you think node is needed for?
        
           | Scarbutt wrote:
           | Rendering Javascript.
        
             | SahAssar wrote:
             | There is
             | https://en.wikipedia.org/wiki/Rhino_(JavaScript_engine) and
             | https://en.wikipedia.org/wiki/GraalVM. The latter of those
             | is the second fastest js server runtime (es4x) according to
             | techempower benchmarks:
             | https://www.techempower.com/benchmarks/#section=data-r20&hw=...
             | 
             | I haven't tried any of those, but saying the JVM can't run
             | JS is not true.
        
               | Scarbutt wrote:
               | GraalVM is not a drop-in replacement for the JVM and
               | Rhino is slow as molasses compared to V8.
        
         | the_af wrote:
         | Far ahead in what sense?
         | 
         | In my previous job we used Java and the benefits were a huge
         | ecosystem of libs and tools, plenty of monitoring tools, a ton
         | of expertise and people who understood its memory model and
         | quirks and were able to troubleshoot production problems. The
         | JVM is rock solid and has great performance for backend with
         | lots of transactions.
         | 
         | I'm not experienced with Go, Python doesn't seem particularly
         | suitable, and server-side javascript looks like a nightmare to
         | me.
         | 
         | NodeJS seems crippled to me.
        
           | paulryanrogers wrote:
           | One benefit I've been seeing with server-side JS is from FaaS
           | like Lambda, which can serve unpredictable loads very
           | inexpensively.
        
             | eeperson wrote:
             | How is that a benefit specific to JS? Don't those platforms
             | generally support Java as well?
        
               | cle wrote:
               | They do but the Java ecosystem takes a hit here, with
               | many libraries being slow to initialize. The longer it
               | takes to start up, the longer it takes to absorb the new
               | load, to the point where you either keep some headroom
               | (IOW, waste $$$), or try to predict load (complex, also
               | wastes $$$).
               | 
               | There are attempts to fix this with e.g. Graal, which
               | effectively does the expensive initialization and
               | reflection at compile time, but there are so many
               | downsides and pitfalls right now with Graal that I don't
               | consider it a serious solution to the problem. It's
               | basically creating a new ecosystem, which means one
               | primary motivation--to take advantage of the Java
               | ecosystem--is much less compelling.
               | 
               | This isn't constrained to the _public_ ecosystem, lots of
               | companies have their own internal libraries, and Java
               | makes it very easy and even encourages doing lots of
               | expensive things at startup, like classpath scanning and
               | pre-caching things. For a long time, Java made explicit
               | decisions to de-prioritize startup time to improve
               | maintainability (reflection/scanning/dynamic
               | classloading) and runtime performance (e.g. JIT). This
               | tradeoff doesn't work out so well in a world of ephemeral
               | processes that come and go as demand changes.
               | 
               | I guess what's specific to JS, Go, Python, et al is the
               | cultural emphasis on fast startup. Interestingly these
               | all come from different constraints but the net result is
               | that, in general, you can go from cold to serving traffic
               | much faster than with Java, with a lot less effort.
        
           | AaronFriel wrote:
           | The JVM is rock solid, and recent improvements in garbage
           | collection have reduced tail latencies dramatically.
           | 
           | But I find that most people who say that it has "great
           | performance" have not built a parallel implementation in JS,
           | Go, Rust, or even .NET Core. I'm omitting memory unsafe
           | languages by default and Python here, because I think that's
           | the domain Java competes in, and Python lacks the investment
           | these other languages have in performance.
           | 
           | The lack of value types and the amount of pointer chasing
           | that JVM languages do as a result, the way generics are
           | implemented via type erasure (which the JIT then has to re-
           | optimize), and so on usually mean that CPU and memory usage
           | for the same throughput is much higher than a competing
           | implementation in a different language. And on older JVMs,
           | tail latency will be orders - plural - of magnitude worse.
           | 
           | It is absolutely true though that for most workloads that
           | efficiency isn't necessary and the ability to reuse that
           | ecosystem reduces time and cost to develop. But it's just
           | definitely not true that Java has "great performance" and I
           | don't think that's ever really been true.
        
             | ssijak wrote:
             | TechEmpower Web Framework Benchmarks would like to disagree
             | with you.
        
               | AaronFriel wrote:
               | I don't think those are particularly realistic workloads,
               | as they don't involve substantial amounts of working with
               | in-memory data. Which of the TechEmpower benchmarks uses
               | an ORM?
               | 
               | Before I finished drafting my comment I did have a
               | sentence like this, which I removed, "Barring obscene
               | amounts of optimization", so yes, some web servers like
               | netty and jetty have gotten to a level of good
               | performance in terms of handling plain requests.
               | 
               | But most line of business backends are not using plain
               | jetty/netty, they aren't just responding to every request
               | with the same "SELECT" query to a backend database.
               | They're doing computation, they're storing intermediate
               | data in data classes like ArrayList, TreeMap, etc.
               | 
               | And then of course due to business requirements, they
               | often have to implement some in-process caching, and
               | suddenly the lightweight Java application is a bloated
               | multi-gigabyte CPU consuming monster.
               | 
               | I just don't see that happening often with more memory
               | and cache friendly languages like Go, Rust, Swift, or
               | even JavaScript on V8/Node.js.
        
               | thu2111 wrote:
               | JavaScript is hardly cache friendly.
               | 
               | As for Go, that's a language which doubles up the size of
               | every pointer and doesn't even use a moving GC, and last
               | time I looked, the quality of machine code it generated
               | was atrocious. It's not really cache/hardware friendly to
               | do those things. Value types are I suspect being over-
               | estimated here: when Java gets them I am expecting
               | disappointment when they don't magically make everything
               | twice as fast.
        
               | jacques_chester wrote:
               | > _Which of the TechEmpower benchmarks uses an ORM?_
               | 
               | There are two: single query and multiple query.
        
               | AaronFriel wrote:
               | Sorry, I should have been more precise. I am very, very
               | familiar with the TechEmpower benchmarks and I first
               | learned Java around SE 5, right after they switched from
               | 1.x numbering. Please don't mistake me for someone who
               | just learned about Go or Rust and is evangelizing them
               | because I think they're the cool new thing.
               | 
               | Which of the Java implementations for the TechEmpower
               | benchmarks use an ORM? Are they representative of the
               | kind of code you would write? I think that the
               | TechEmpower benchmarks suffer from many of the same
               | problems the language benchmarks game benchmarks do -
               | micro-optimization, unrealistic workloads.
               | 
               | My experience tells me that you can get any sufficient
               | level of performance in almost any language, but that you
               | are going to pay for some languages more in opex than
               | others, particularly in memory usage. It takes more
               | compute spend for a workload written in Java than one
               | written in Go, all other things being equal. That's not
               | to say Java is a bad language, but it does lack many
               | features - some of them being intentional design
               | decisions - which make it less cost effective to operate
               | systems built on Java. However, we know that a
               | significant cost is the cost to develop, so it's hard for
               | me to say Java is a bad language for that reason either.
               | 
               | And memory usage is generally a good predictor of density
               | in terms of scheduling workloads, be it Tomcat servers
               | (back in the day) or VMs or containers these days. I also
               | think that Java suffers, performance wise, from boxing
               | values and pointer chasing / poor cache locality. The
               | default container implementations are just, well, it
               | would be polite to simply say that they're as good as the
               | language allows.
               | 
               | However, data is better than claimed experience, no?
               | 
               | I just opened up the raw benchmark stats[1] for the
               | database updates route. It's one that my favorite
               | languages don't do well in, but I was curious about the
               | operational overhead of running them in memory usage,
               | something I've mentioned quite a lot up above.
               | 
               | I looked at a vertx-postgresql benchmark for the
               | "updates" TechEmpower. This is a high performing
               | implementation without an ORM[2]
               | 
               | I also looked at quarkus + reactive routes + hibernate,
               | which appears to use hibernate, applicable to the
               | original post[3].
               | 
               | And lastly, I looked at actix diesel, another ORM using
               | implementation[4].
               | 
               |     java quarkus-hibernate:  3.9GiB memory (peak, start of test)
               |     java quarkus-hibernate:  3.1GiB memory (lowest value, near end of test)
               |     java vertx-postgres:     2.35GiB memory (consistent)
               |     rust actix-diesel:       1.2GiB memory
               | 
               | Standard deviation was:
               | 
               |     java quarkus-hibernate:  221.4MiB
               |     java vertx-postgres:     1.9MiB
               |     rust actix-diesel:       0.5MiB
               | 
               | I included the steady state for quarkus because its
               | memory usage (perhaps due to a config flag starting it
               | with a 4GiB heap?) started out extremely high and
               | decreased over the course of the run. That likely affects
               | the standard deviation, which I included to highlight
               | that I didn't try to cherry-pick results.
               | 
               | Perhaps the funniest thing to me digging into it is,
               | again due to the absurdity of Java's design decisions, to
               | make sure that "Integer" objects are efficient, the Java
               | benchmarks use the command line parameter
               | "-Djava.lang.Integer.IntegerCache.high=10000". This tells
               | you that if the benchmark used a wider range of random
               | values[5], performance would degrade. Have you ever heard
               | of a language requiring an integer cache? It's absurd to
               | me that Java, rather than implement value types, requires
               | Integers to be interned for performance.
               | 
               | Are there any other languages in the TechEmpower
               | benchmark or the Debian benchmark game (formerly went by
               | another name) that requires setting an "IntegerCache" to
               | optimize... allocating integers? I mean, come on. You
               | can't tell me this is a language that was designed for
               | performance when integers can't be directly stored in
               | arrays and instead have to be autoboxed and a cache is
               | needed to intern them!
               | 
               | I will say one final thing: cost to operate/memory
               | efficiency is just one metric for measuring languages. I
               | think that Java is actually a pretty bad language for a
               | lot of reasons, but path dependence has produced an
               | extremely rich ecosystem that gives developers a lot of
               | flexibility and a lot of tools to use when writing it. I
               | think Kotlin, Scala, and even Clojure are by far more
               | pleasurable languages to write in, though the JVM still
               | holds them back for all the reasons above.
               | 
               | [1] Raw results from
               | https://tfb-status.techempower.com/unzip/results.2021-01-13-...
               | 
               | [2] You can see they have simply hardcoded the SQL. See:
               | https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...
               | 
               | [3]
               | https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...
               | 
               | [4]
               | https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...
               | 
               | [5] The update benchmark only requires random numbers
               | between 1 and 10,000. Performance of Java apps would
               | degrade if they were asked to use boxed integers greater
               | than 10,000, which is possibly the most absurd statement
               | I have said of any programming language ever. See:
               | https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Proj...
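
For readers who have not run into the Integer cache discussed in the comment above, here is a minimal sketch of the behaviour. The class and method names are made up for illustration. The only guarantee relied on is that boxed values from -128 to 127 are shared; whether larger values are shared depends on the JVM and on flags such as the -Djava.lang.Integer.IntegerCache.high setting quoted above, so the sketch prints that result rather than assuming it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names are hypothetical, not taken from the benchmark code.
final class IntegerCacheDemo {

    // True when boxing the same int twice yields the very same object,
    // i.e. the value was served from Integer's internal cache.
    static boolean boxesToSameObject(int value) {
        Integer first = Integer.valueOf(value);
        Integer second = Integer.valueOf(value);
        return first == second; // identity comparison on purpose
    }

    // Sums a primitive array: the ints are stored inline in the array.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Sums a List<Integer>: every element is a reference to a boxed Integer.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            total += v; // auto-unboxing on every element
        }
        return total;
    }

    public static void main(String[] args) {
        // Values in -128..127 are always cached.
        System.out.println("127 shared:   " + boxesToSameObject(127));
        // Outside that range the answer depends on JVM configuration.
        System.out.println("10000 shared: " + boxesToSameObject(10_000));

        int[] primitives = {1, 2, 3};
        List<Integer> boxed = new ArrayList<>();
        for (int v : primitives) {
            boxed.add(v); // auto-boxing creates or reuses an Integer object
        }
        System.out.println(sumPrimitive(primitives) + " == " + sumBoxed(boxed));
    }
}
```

Running it once with default settings and once with the flag from the benchmark should show the second line change, which is the effect the comment is complaining about.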
        
         | rjsw wrote:
         | Java has support for the schema definitions that I need, the
         | other languages you list don't. I am happily using
         | JPA/Hibernate too.
        
       | paulryanrogers wrote:
       | > This loop has to stop, what defines the value of the projects
       | we are working on has nothing to do with technologies and
       | frameworks.
       | 
       | > I WANT to solve business problems, I do not want to keep
       | solving technical issues.
       | 
       | It is interesting how emotionally invested we become with our
       | tools. Yet we don't have infinite time to become productive with
       | all possible frameworks. So we have to specialize at least
       | somewhat.
       | 
       | As to JPA itself, I agree that it's generally a bad fit for most
       | uses. Circa 2010 I used it with Java Enterprise in the hope that
       | a failed bean could be recovered by a parallel worker, but the
       | technical costs were crazy high. And often I had to drop to raw
       | SQL anyway. Less invasive ORMs can still be more generally
       | useful, and remove some tedium.
        
       | vips7L wrote:
       | This article is... Questionable at best. I don't think any of
       | this is an argument against JPA, except that the author doesn't
       | like how it works? I also suspect the author doesn't know
       | hibernate that well.
       | 
       | For instance selecting just the fields you need is relatively
       | simple with JPQL:
       | 
       |     SELECT i.url FROM Image i WHERE i.id = ...
        
         | agilob wrote:
         | JPA and Hibernate make it very easy to use them incorrectly;
         | it's almost like they _promote_ bad SQL queries and ideas. They
         | let users of a database connection write Java-first database
         | queries, when a database query should be database-first. It's
         | just way too easy to abuse it and get too much data, too many
         | columns and JOINs.
         | 
         | Developers look at what the JSON looks like, what we send to
         | the browser, what formatting it has, and the validation and
         | size of the JSON data, as it's easy to monitor and trim.
         | 
         | Can't say the same about Hibernate. How the hell even hibernate
         | caching works? Why even are there 2 levels of Hibernate cache?
         | It's too easy to create and abuse transactions. Dirty-checking
         | is JUST WAYYYY TOOO EAZY TO ABUSE. not calling, using setter of
         | an instance shouldn't update in database by default omg, It
         | shouldn't be possible for transactions to leak outside some
         | easily specified scope - I've seen one project where
         | transaction leaked to Jackson!! Jackson was calling getters on
         | fields and executing DB queries. JSON ended up as 2.6Mb instead
         | list of 10 fields.
         | 
         | Hibernate is popular because we don't need to learn SQL to get
         | the data we need, but it's also super hard to get it right and
         | not do something stupid by accident.
         | 
         | If in any doubt, refer to JOOQ - it's the SQL-oriented ORM for
         | Java.
        
           | vips7L wrote:
           | > How the hell even hibernate caching works? Why even are
           | there 2 levels of Hibernate cache?
           | 
           | https://docs.jboss.org/hibernate/stable/orm/userguide/html_s...
           | 
           | > using setter of an instance shouldn't update in database by
           | default omg
           | 
           | This doesn't happen.. the database won't update until you ask
           | the entity manager to persist the entity.
           | 
           | > I've seen one project where transaction leaked to Jackson!!
           | Jackson was calling getters on fields and executing DB
           | queries. JSON ended up as 2.6Mb instead list of 10 fields.
           | 
           | I suspect you might not agree, but JPA lazy loading is one of
           | the easiest concepts to understand. However, I'd argue that
           | leaking your database models to the view is a mistake to
           | begin with and said application is already incorrect.
           | 
           | > If in any doubt, refer to JOOQ - it's the SQL-oriented ORM
           | for Java.
           | 
           | Both the author/maintainer of jOOQ and Hibernate agree that
           | each has their place.
           | 
           | I really am amazed at developers that don't take the time to
           | learn about the tools they use.
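
The "don't leak your database models to the view" advice above can be sketched without any ORM at all. Everything here is a stand-in: ArticleEntity, ArticleSummary and the Supplier that plays the role of a lazily loaded association are hypothetical names, and the counter only simulates a database round trip.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Stand-in for an entity with a lazily loaded association; no real ORM involved.
final class ArticleEntity {
    final long id;
    final String title;
    private final Supplier<List<String>> lazyComments; // hypothetical lazy collection

    ArticleEntity(long id, String title, Supplier<List<String>> lazyComments) {
        this.id = id;
        this.title = title;
        this.lazyComments = lazyComments;
    }

    // A serializer that walks getters would call this and trigger the load.
    List<String> getComments() {
        return lazyComments.get();
    }
}

// The view layer only ever sees this flat, fully initialised object.
final class ArticleSummary {
    final long id;
    final String title;

    ArticleSummary(long id, String title) {
        this.id = id;
        this.title = title;
    }

    static ArticleSummary from(ArticleEntity entity) {
        // Copies plain fields only; getComments() is never touched.
        return new ArticleSummary(entity.id, entity.title);
    }
}

final class DtoDemo {
    public static void main(String[] args) {
        AtomicInteger simulatedQueries = new AtomicInteger();
        ArticleEntity entity = new ArticleEntity(1L, "Problems with JPA", () -> {
            simulatedQueries.incrementAndGet(); // pretend this is a SELECT
            return Arrays.asList("first", "second");
        });

        ArticleSummary summary = ArticleSummary.from(entity);
        System.out.println(summary.title + ", lazy loads so far: " + simulatedQueries.get());
    }
}
```

The point of the design is that whatever serializes ArticleSummary has no getter available that could reach back into the persistence layer.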
        
             | eeperson wrote:
             | > This doesn't happen.. the database won't update until you
             | ask the entity manager to persist the entity.
             | 
             | Doesn't this happen through automatic dirty checking?
        
               | vips7L wrote:
               | No, calling a setter on an Entity doesn't automatically
               | issue an SQL UPDATE query. You need to ask the
               | EntityManager to merge or persist the entity and its
               | changes.
               | 
               | Obviously JPA knows what fields were updated via dirty
               | checking.. that's almost half the point of an ORM.
        
         | nimchimpsky wrote:
         | That looks very much like SQL; what's the benefit then? X amount
         | of dependencies to write almost-SQL?
        
         | alkonaut wrote:
         | Giving up basic OO niceties like invariants in your whole
         | domain just to get automatic persistence from some library? I
         | agree with the author: it's insane and no one should accept
         | that tradeoff.
         | 
         | There have to be ways around that though, perhaps using a
         | duplicated domain of DTOs or coercing the ORM to use
         | constructors or private setters to keep encapsulation and
         | invariants.
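
A rough sketch of the "constructors or private setters" idea from the comment above, in plain Java. The Customer class and its email rule are invented for illustration, and the mapping annotations are left out so the snippet compiles without a JPA provider. It leans on one assumption about the spec: the no-arg constructor JPA asks for may be protected rather than public.

```java
import java.util.Objects;

// Hypothetical domain class; a real mapping would add @Entity/@Id annotations.
class Customer {
    private String email;

    // Present only for the persistence provider. Being protected, it keeps
    // ordinary application code from creating an unvalidated instance.
    protected Customer() {
    }

    public Customer(String email) {
        this.email = requireValidEmail(email);
    }

    // The only public way to change the field, so the rule is always checked.
    public void changeEmail(String newEmail) {
        this.email = requireValidEmail(newEmail);
    }

    public String email() {
        return email;
    }

    // Deliberately simplistic rule; real validation would be stricter.
    private static String requireValidEmail(String candidate) {
        Objects.requireNonNull(candidate, "email");
        if (!candidate.contains("@")) {
            throw new IllegalArgumentException("not an email address: " + candidate);
        }
        return candidate;
    }
}

final class InvariantDemo {
    public static void main(String[] args) {
        Customer customer = new Customer("alice@example.com");
        customer.changeEmail("alice@example.org");
        System.out.println(customer.email());

        try {
            new Customer("no-at-sign");
        } catch (IllegalArgumentException rejected) {
            System.out.println("rejected: " + rejected.getMessage());
        }
    }
}
```

With field-based access the provider can still populate email directly, so the invariant only protects against application code, not against bad rows already in the table.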
        
           | vips7L wrote:
           | IMO invariants are better handled by a class whose sole
           | responsibility is to enforce and validate said invariants,
           | especially when you have dependencies involved to enforce
           | them (like making sure the Item actually exists in the db).
           | 
           | Value classes like @Entity shouldn't have the responsibility
           | to enforce those business rules.
           | 
           | We can disagree on which way is more object oriented though.
        
       | harryvederci wrote:
       | The only positive experience I've had with it was in a small
       | application with a not-so-complex DB which was created entirely
       | through Liquibase.
       | 
       | In large enterprise projects, I've always had to create custom
       | SQL statements at some point, at which point I'd rather do
       | everything in SQL. Otherwise, you have to know JPA/Hibernate
       | _and_ SQL, which (in my opinion) defeats the purpose.
        
       | the_af wrote:
       | The underlying problem is one of O-R impedance mismatch. Going
       | full SQL and getting rid of the ORM is a possible answer, but it
       | has tradeoffs and is not a silver bullet. It might mean re-
       | creating from scratch an in-house, bug-ridden ORM, or ditching
       | OOP idioms from your language, or both.
       | 
       | The author of TFA seems to be going through one of the stages
       | described in "ORM is the Vietnam of Computer Science", an article
       | that should be mandatory reading before one claims to know the
       | solution to this decades old problem.
        
         | phreack wrote:
         | Here's the article
         | 
         | http://blogs.tedneward.com/post/the-vietnam-of-computer-scie...
         | 
         | Edit: huh, seems to have got truncated over the years. Here's
         | an archive link
         | 
         | https://web.archive.org/web/20160120004603/https://blogs.ted...
        
       | api wrote:
       | From my experience ORMs are good time savers when you have
       | relatively simple query needs, but fall down when things get
       | really complex.
        
         | Daishiman wrote:
         | "Really complex" means getting down to 6 or 7 joins for your
         | typical SQLAlchemy or Django ORM query. These sorts of queries
         | comprise 5% of my queries, tops.
         | 
         | Seems like a fair tradeoff.
        
           | tarkin2 wrote:
           | I have one of those queries. It's the most important in the
           | database sadly. As django bloats and bloats the table space,
           | over the years, more and more do I need those queries, and
           | slower and slower becomes my app.
        
         | tarkin2 wrote:
         | And time. Let's not forget time.
         | 
         | Years of Django's ORM have given our app over a hundred tables.
         | Complex and /fast/ queries are next to impossible.
         | 
         | Django's ORM made app development lightning quick for the
         | first developers. And impossible for the ones five years
         | later.
        
         | The_rationalist wrote:
         | Nothing prevents you from using raw SQL queries with Hibernate.
         | HQL is just a convenient superset.
        
           | karmakaze wrote:
           | Actually mixing raw queries with JPQL/Hibernate queries is
           | the worst of all worlds. To get it right, you will end up
           | making explicit calls to let the EntityManager know what you
           | want each side to be doing to play nice.
        
             | The_rationalist wrote:
             | I've never hit such a bug. Can you expand on what added
             | complexity a raw query has with Hibernate (and when)
             | versus a raw query without Hibernate?
        
           | nimchimpsky wrote:
           | What's the advantage of that?
        
         | stefan_ wrote:
         | Where simple already doesn't include pagination, when all those
         | frameworks generate OFFSET queries that cause the equivalent of
         | a full table scan when someone clicks "last page".
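
To make the OFFSET complaint above concrete, here is a small in-memory sketch. A sorted map stands in for an indexed table, and the class and method names are hypothetical. The second method is keyset (seek) pagination, an alternative that is not mentioned in the thread, shown only for contrast.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative only: a TreeMap plays the role of a table with an index on id.
final class PaginationDemo {

    // OFFSET-style paging: walk from the first row and discard `offset` rows
    // before collecting a page, so the work grows with the page number.
    static List<Long> pageByOffset(NavigableMap<Long, String> rowsById, int offset, int limit) {
        List<Long> page = new ArrayList<>();
        int skipped = 0;
        for (Long id : rowsById.keySet()) {
            if (skipped < offset) {
                skipped++; // row is visited only to be thrown away
                continue;
            }
            if (page.size() == limit) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    // Keyset paging: seek straight to the first id after the last one the
    // client saw, roughly "WHERE id > ? ORDER BY id LIMIT ?".
    static List<Long> pageAfter(NavigableMap<Long, String> rowsById, long lastSeenId, int limit) {
        List<Long> page = new ArrayList<>();
        for (Long id : rowsById.tailMap(lastSeenId, false).keySet()) {
            if (page.size() == limit) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> rows = new TreeMap<>();
        for (long id = 1; id <= 100; id++) {
            rows.put(id, "row-" + id);
        }
        // "Last page" with 10 rows per page: both calls return the same ids,
        // but the first one had to step over 90 rows to get there.
        System.out.println(pageByOffset(rows, 90, 10));
        System.out.println(pageAfter(rows, 90L, 10));
    }
}
```

The trade-off is that keyset paging needs the client to carry the last seen key and cannot jump to an arbitrary page number.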
        
       | jacques_chester wrote:
       | I largely agree that JPA is a maddening mess[0]. However, this
       | sentence caught my eye:
       | 
       | > _I love open source, really, but big companies sponsoring open-
       | source projects get most of their income from support or third
       | party tools._
       | 
       | I work for VMware and consequently take some interest as to how
       | my salary comes into being.
       | 
       | If you think VMware makes "most of its income" from supporting
       | Spring, then I think I'd encourage you to spend some time at the
       | investor relations site[1] reading any of the annual or quarterly
       | reports. I'd advise the same for Oracle and Red Hat/IBM.
       | 
       | [0] The advice to use emails or SSNs as primary keys, though:
       | yikes.
       | 
       | [1] https://ir.vmware.com/
        
       | dale_glass wrote:
       | I disagree with this bit:
       | 
       |     A User can be considered unique in one context by its email
       |     address, or by its social security number
       | 
       | Personally, I'm a fan of giving everything a random UUID, because
       | it's more flexible. It's random and impossible to guess, it
       | scales well because there's no central bottleneck like with an
       | autoincrement, and it's future proof and flexible.
       | 
       | What happens when the user changes the email address? What if the
       | social security number changes, because it was wrong or because
       | it actually changes? What if the original unique identifier
       | wasn't a good choice? What if we decide that the user can have
       | multiple email addresses? Then you may end up having to
       | restructure the entire database, which will be a very annoying
       | thing to do. What if you implement additional rules for what an
       | email is allowed to look like and now the constraint fails for
       | existing users, and this correction needs to be propagated to
       | millions of already existing rows?
       | 
       | Real-life personal data is weird and fuzzy. It can violate
       | seemingly sensible rules like being unique, unchanging, or
       | conforming to any rule whatsoever. Best not to let it spread
       | all over the DB and cause trouble later.
       | 
       | Instead, you could just have a random ID that doesn't mean
       | anything and therefore can stay fixed forever, and any user-
       | related metadata stays in the user table, where it can be
       | modified as needed. Plus a UUID is a fixed 16 bytes, which is
       | easy and efficient to deal with.
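
The "fixed 16 bytes" point can be illustrated with a minimal Java sketch. It generates a random (version 4) UUID with the standard `java.util.UUID` class and packs it into a 16-byte array, the shape a binary key column would typically hold. The class name `UuidKeys` and its helper methods are hypothetical names chosen for this illustration, not part of any library, and the sketch does not show how any particular database or ORM maps the value.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper for illustration only: treats a random UUID as an
// opaque surrogate key and converts it to and from its 16-byte form.
final class UuidKeys {
    private UuidKeys() {}

    // A random (version 4) UUID; it carries no meaning about the user.
    static UUID newId() {
        return UUID.randomUUID();
    }

    // Pack the two 64-bit halves into exactly 16 bytes (big-endian).
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Rebuild the UUID from its 16-byte form.
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("a UUID is exactly 16 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long mostSignificant = buf.getLong();
        long leastSignificant = buf.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

In this arrangement the email address or SSN would remain an ordinary column on the user row, free to change or to gain a uniqueness constraint later, while other tables reference only the opaque key.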
        
         | jacques_chester wrote:
         | > _What if the social security number changes, because it was
         | wrong or because it actually changes?_
         | 
         | What if the user doesn't _have_ an SSN? What happens if they
         | have one but lawfully refuse to provide it? What happens when
         | you ask for an SSN from a US citizen who is also a European
         | citizen? What happens when your database leaks?
         | 
         | In general, relying only on natural keys is a nightmare. Double
         | nightmare if it's PII. Natural keys only work if you are
         | flawlessly omniscient about the domain. And you aren't.
        
           | echelon wrote:
           | I want to print this comment and frame it.
        
         | santiagobasulto wrote:
         | I'm from Argentina and we have something similar to an SSN, we
         | call it DNI (Documento Nacional de Identidad). You wouldn't
         | believe how many duplicate DNIs we have, it's crazy.
         | 
         | So yes, I agree with you, I always use a random UUID as ID.
        
         | yukinon wrote:
         | > Personally, I'm a fan of giving everything a random UUID,
         | because it's more flexible
         | 
         | Unless of course, you're using a relational database like OP
         | and incur a performance hit from using a UUID as your primary
         | key. Additionally, they're not sortable like autoinc id's.
         | 
         | I've always wanted to try out Twitter's Snowflake ID [1]
         | algorithm to get around this, but it requires using
         | something like Zookeeper. I've seen some people on the net talk
         | about UUIDv6 being sortable by time, but there's still the
         | potential performance hit of index size.
         | 
         | While I'm bringing this up I've never actually tested how slow
         | PK UUIDv1's are and at what magnitude their performance hit
         | becomes noticeable.
         | 
         | [1]
         | https://blog.twitter.com/engineering/en_us/a/2010/announcing...
        
           | bbirk wrote:
           | There's no need for zookeeper or any
           | centralised/decentralised service. In the article you link
           | they mention why depending on something like zookeeper is
           | suboptimal. Given that you have less than something like 2048
           | web server instances, (don't remember how many bits they give
           | to worker_number and the snowflake github repo is basically
           | unavailable) all you need to do is make sure every instance has
           | a rank/worker_number (infrastructure/devops problem) which
           | the instance will use when it generates the snowflake ids.
           | Side note: Snowflake also suffers from the Unix epoch 2038
           | problem, but that can be simply solved by adding bits for
           | epoch number.
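
The coordination-free scheme described above can be sketched in Java roughly as follows. This is a minimal illustration, not Twitter's implementation: the bit layout (41 bits of milliseconds, 10 bits of worker number, 12 bits of per-millisecond sequence) is the commonly described Snowflake layout and should be treated as an assumption here, and the class name, the `CUSTOM_EPOCH_MS` value, and the injectable clock are all hypothetical choices made for the example. Each instance is assumed to have been handed a unique worker number by the deployment tooling.

```java
import java.util.function.LongSupplier;

// Hypothetical Snowflake-style generator: IDs are unique without any
// central service as long as every instance has its own worker number.
final class SnowflakeStyleIdGenerator {
    static final int WORKER_BITS = 10;    // assumed layout: up to 1024 workers
    static final int SEQUENCE_BITS = 12;  // assumed layout: 4096 IDs per ms
    static final long MAX_WORKER = (1L << WORKER_BITS) - 1;
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    // Arbitrary custom epoch chosen for this sketch (milliseconds).
    static final long CUSTOM_EPOCH_MS = 1_600_000_000_000L;

    private final long workerNumber;
    private final LongSupplier clock;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeStyleIdGenerator(long workerNumber) {
        this(workerNumber, System::currentTimeMillis);
    }

    // The clock is injectable so the behaviour can be checked deterministically.
    SnowflakeStyleIdGenerator(long workerNumber, LongSupplier clock) {
        if (workerNumber < 0 || workerNumber > MAX_WORKER) {
            throw new IllegalArgumentException("worker number out of range");
        }
        this.workerNumber = workerNumber;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = clock.getAsLong()) <= lastTimestamp) {
                    // busy-wait; a real implementation might sleep or yield
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - CUSTOM_EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerNumber << SEQUENCE_BITS)
                | sequence;
    }

    // Extract the worker number back out of an ID.
    static long workerOf(long id) {
        return (id >>> SEQUENCE_BITS) & MAX_WORKER;
    }
}
```

Because the timestamp occupies the high bits, IDs from one worker sort by creation time. Under the assumed layout, 41 bits of milliseconds cover roughly 69 years counted from whatever custom epoch is chosen.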
        
           | dale_glass wrote:
           | Things like Twitter are special. I'm talking more about a
           | generally sensible way of doing things, which one may need to
           | deviate from in special circumstances.
           | 
           | Why would you want to sort by ID? Sort by something sensible,
           | like the signup date instead. An autoincrement may stop
           | corresponding to time if for instance at some point a
           | database has an external dataset imported into it.
           | 
           | IMO, using an ID for anything other than an opaque identifier
           | is asking for trouble.
        
             | simonw wrote:
             | I like IDs I can read out over a call, or recognize when I
             | spot them in a log file. The few times I've used UUIDs for
             | IDs I've later regretted it.
        
       | karmakaze wrote:
       | Hibernate was made to solve the problem of JavaEE enterprise bean
       | persistence. It made sense 20 years ago. It is a mismatch for the
       | API-oriented transactions we tend to write today.
        
       ___________________________________________________________________
       (page generated 2021-04-11 23:01 UTC)