[HN Gopher] Lean4 helped Terence Tao discover a small bug in his...
___________________________________________________________________
Lean4 helped Terence Tao discover a small bug in his recent paper
Author : gridentio
Score : 388 points
Date : 2023-10-27 07:25 UTC (15 hours ago)
(HTM) web link (mathstodon.xyz)
(TXT) w3m dump (mathstodon.xyz)
| atomicnature wrote:
| A few years back, I was trying to find out how to reduce mistakes
| in the programs I write.
|
| I got introduced to Lamport's TLA+ for creating formal
| specifications, thinking of program behaviors in state machines.
| TLA+ taught me about abstraction in a clear manner.
|
| Then I also discovered the book series "software foundations",
| which uses the Coq proof assistant to build formally correct
| software. The exercises in this book are little games and I found
| them quite enjoyable to work through.
|
| https://softwarefoundations.cis.upenn.edu/
| Genbox wrote:
| Code correctness is a lost art. The requirement to think in
| abstractions is what scares a lot of devs away from it. The
| higher-abstraction approaches (formal specs) focus on a
| dedicated language to describe code, whereas lower abstractions
| (code contracts) basically replace validation logic with a
| better model.
|
| C# once had Code Contracts[1]: a simple yet powerful way to
| make formal specifications. The contracts were checked at
| compile time using the Z3 SMT solver[2]. It was unfortunately
| deprecated after a few years[3], and once removed from the .NET
| Runtime it was declared dead.
|
| The closest thing C# now has is probably Dafny[4], while the C#
| language designers are still trying to figure out how to
| support contracts directly in the language[5].
|
| [1] https://www.microsoft.com/en-us/research/project/code-
| contra...
|
| [2] https://github.com/Z3Prover/z3
|
| [3] https://github.com/microsoft/CodeContracts
|
| [4] https://github.com/dafny-lang/dafny
|
| [5] https://github.com/dotnet/csharplang/issues/105
| mjburgess wrote:
| The issue is that modern software fails because it's part of
| a complex system with many moving parts, rather than because
| it is inherently complex at the source level.
|
| Certain sorts of algorithmically complex development (games,
| cars, medical hardware, etc.) would benefit from a 'closed-
| world verification' -- but that's not most software, and they
| have alternatives.
|
| 'Code correctness', including unit testing, ends up being a
| big misdirection here. What you need is comprehensive end-to-
| end tests, and instrumentation to identify where failures
| occur in that end-to-end.
|
| The effort to source-level-check source-level-code is largely
| a huge waste of time and creates an illusion of reliability
| which rarely exists.
| mrkeen wrote:
| Strong disagree.
|
| > The issue is that modern software fails because it's part
| of a complex system with many moving parts, rather than, it
| is inherently complex at the source-level.
|
| The choice to run a system as many different moving parts
| is a decision taken by the team _in order to avoid
| failure_.
|
| > Certain sorts of algorithmically complex development --
| but that's not most software
|
| It's all software.
|
| > 'Code correctness', including unit testing, ends up being
| a big misdirection here. What you need is comprehensive
| end-to-end tests, and instrumentation to identify where
| failures occur in that end-to-end.
|
| No and no. I have comprehensive end-to-end tests. They take
| forever, don't fit into RAM (for some services I need to
| run them on my home PC because my work laptop only has
| 16GB), and most importantly: they show that the code _is
| not correct_. Now I have to change incorrect code to
| correct code (while not breaking any public interfaces). I
| wish my predecessors had not put incorrect code into the
| live system.
| mjburgess wrote:
| I'm not really sure what you're saying here -- the parent
| commenter is talking about closed-world source-level code
| verification.
|
| My claim is that these techniques help in relatively few
| areas of software development. In the main, software is
| built with open-world interactions across a range of
| devices (network, disk, etc.) where those interactions
| dominate the net failure modes of the system.
|
| In the latter case, source-level verification is not only
| of no help, but it's a huge resource misallocation. You
| end up writing tests that just create the illusion of
| success; and then the whole thing constantly falls over.
|
| Spend all that time instead in designing tests that do
| fit into ram (eg., sample your datasets); and
| instrumentation that does reveal errors (eg., sampling of
| memory, requests, etc.).
|
| One of my first software jobs was writing a reverse proxy
| -- one of my co-devs wrote unit tests that simply
| established the in/out of various functions was "as
| expected". Pretty useless -- the issue is whether the
| proxy actually worked.
|
| Likewise most source-level correctness efforts are
| 'testing theatre'.
|
| Devs are looking for cargo-cult solutions they can
| copy/paste. No, testing software is an actual area of
| development -- and you need to _develop_ tests, not
| cargo-cult
| snovv_crash wrote:
| Tests only show that it is correct for the sets of values
| and code paths you exercise. It's quite possible for
| other aspects to be incorrect, and this is what theorem
| provers like Coq help with.
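|
| A minimal sketch of the contrast in Lean 4 (rather than Coq;
| the idea is the same): a test checks one concrete input,
| while a machine-checked proof covers every input at once.
|
|     -- A test: one value, one code path.
|     #eval ([1, 2, 3].reverse.reverse == [1, 2, 3])  -- true
|
|     -- A proof: all lists, of every element type.
|     example (α : Type) (xs : List α) : xs.reverse.reverse = xs :=
|       List.reverse_reverse xs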
| charcircuit wrote:
| If you have an incomplete or buggy specification Coq
| won't actually prove the absence of bugs.
| Jweb_Guru wrote:
| This is true but mostly meaningless in practice. I have
| never found the people who say this every time formal
| verification comes up to be the same people who actually
| work in formal verification. The reality is that it's far
| far harder to mess up a _formal_ spec that you actually
| have to prove a concrete instance of, than it is to mess
| up the code. With a buggy spec, in the course of trying
| to prove it you'll arrive at cases that seem
| nonsensical, or be unable to prove basic lemmas you'd
| expect to hold true. Formal spec bugs happen, but they're
| quite rare. Usually even when there are edge cases that
| aren't covered, they will be known by the authors of the
| spec due to the dynamics I mentioned earlier (cases
| encountered during the proof that are clearly being
| solved in ways that don't follow the "spirit" of the
| spec, only the letter). Of course, "informal" specs, or
| paper proofs, will have subtle bugs all the time, but
| that's not what we're talking about here.
| charcircuit wrote:
| >Formal spec bugs happen, but they're quite rare
|
| A lot of spec bugs happen all the time. If you think
| people can account for all edge cases in massively
| complex projects you are wrong. There are many behaviors
| that you can't predict will be nonsensical ahead of time
| until you actually hit them.
|
| Formal verification is not a silver bullet. Despite all
| of the extra effort bugs will still happen. It's better
| to invest in making things safe by default, or by
| aborting when getting into a bad state.
| Jweb_Guru wrote:
| Can you point me to some of the concrete bugs in formal
| specs with associated proofs that you are thinking of?
| Specifically in the part that was formally verified, not
| in a larger project that incorporates both verified and
| unverified components. Because my experience is that when
| asked to actually provide concrete examples of how
| formally verified code has bugs in practice, people find
| it quite difficult to do so. Formal verification is about
| as close to a silver bullet as it gets for eliminating
| bugs, even if it's not 100% (nothing is). It's certainly
| far better at eliminating bugs than vague philosophical
| positions like "making things safe by default" (what does
| that mean?) or "abort when you're in a bad state" (how do
| you know you're in a "bad" state? Isn't it in general at
| least as difficult as figuring out what the correct
| formal spec is? How easy is it to detect? Is the
| performance cost acceptable?).
|
| In fact, what usually happens is the opposite of a formal
| proof being undermined by a bad spec--when an informal
| spec is formally verified, inconsistencies are often
| found in the original spec during the course of the proof
| process, and bugs are subsequently found in older
| implementations of that spec. Fixes are then incorporated
| into both the original spec and the existing
| implementations. Formal verification is a spec-hardening
| process.
| charcircuit wrote:
| >Specifically in the part that was formally verified
|
| No, I am saying that it is easy to not verify something
| because you don't know the requirements up front.
|
| >"making things safe by default" (what does that mean?)
|
| It means that complexity is abstracted away such that it
| is hard to do the wrong thing.
|
| >"abort when you're in a bad state" (how do you know
| you're in a "bad" state?
|
| There are invariants which you can assume that are always
| true. If they aren't true for whatever reason you can
| abort and later track down what caused you to enter this
| state. It could be as simple as some logic bug or obscure
| as hardware malfunctioning and causing a bit flip (at
| scale you need to deal with hardware misbehaving).
|
| >inconsistencies are often found in the original spec
| during the course of the proof process
|
| Bugs are found in the process of testing too.
| Jweb_Guru wrote:
| > No, I am saying that it is easy to not verify
| something because you don't know the requirements up
| front.
|
| Granted, but "this spec doesn't attempt to cover [X]" is
| very different from "this spec claims to cover [X] but
| actually doesn't, in ways the authors are unaware of."
| The former is quite common in formal specs with machine-
| verified correctness proofs and I've never heard anyone
| call that a "spec bug". The latter is not common in them
| at all. Perhaps you didn't intend this yourself, but the
| way people generally talk about "there could be a bug in
| the spec" is usually used to imply that the latter case
| happens often enough to make formal verification
| comparably effective to other defense mechanisms, when
| empirically formal verification is far, _far_ more
| effective than any other bug prevention mechanism--to the
| point that people struggle to find even individual
| examples of these kinds of spec bugs.
|
| > There are invariants which you can assume that are
| always true. If they aren't true for whatever reason you
| can abort and later track down what caused you to enter
| this state. It could be as simple as some logic bug or
| obscure as hardware malfunctioning and causing a bit flip
| (at scale you need to deal with hardware misbehaving).
|
| Yes, and stating these invariants precisely is generally
| exactly what you need to do as part of writing the formal
| spec; getting them right is equally hard in both cases,
| so it seems weird to me to say that it's easier to check
| the invariants at runtime than to get the spec right. In
| fact, many formal specs have invariants that are exactly
| of the form "assert that if I ran this line of code, it
| would not throw an exception." Moreover, you can use much
| more "expensive" invariants when you're creating a spec,
| since you don't actually have to check them at runtime,
| which in practice lets you catch far more and more subtle
| bad states than you can with software checks. For
| example, your proof state can include a full history of
| every action taken by the application in any thread, and
| prove that invariants on them are maintained during every
| instruction, even when those instructions proceed
| concurrently and might witness partial or stale views of
| the latest values in memory; storing and checking all of
| this information on every step on an actual machine would
| be wholly impractical, even for high level languages and
| non-performance-critical software.
|
| Obviously, things are defined according to some base
| model of the world, so if you are concerned about
| hardware bit flips (which in rare contexts you are) your
| spec should take that into account, but the vast majority
| of bugs in modern applications are not bugs of this form,
| and most defensive runtime checks are equally susceptible
| to hardware issues or other unexpected environment
| changes.
|
| > Bugs are found in the process of testing too.
|
| Yes, and I never said testing was bad. Formal
| verification and testing are complementary techniques,
| and tests are important in a formal verification context
| because you should be able to prove the tests will pass
| if you've gotten your spec right. However, them being
| complementary techniques doesn't mean that they are
| equally effective at reducing bugs.
| throw0101c wrote:
| > _The issue is that modern software fails because it's
| part of a complex system with many moving parts, rather
| than because it is inherently complex at the source level._
|
| A reminder of Gall's law:
|
| > _A complex system that works is invariably found to have
| evolved from a simple system that worked. A complex system
| designed from scratch never works and cannot be patched up
| to make it work. You have to start over with a working
| simple system.[8]_
|
| * https://en.wikipedia.org/wiki/John_Gall_(author)
| throwalean wrote:
| Note that the Z3 SMT solver was written by Leonardo de Moura,
| who also is the lead dev of Lean 4. Not a coincidence (-;
|
| Lean 4 seems to be used in production at AWS:
| https://github.com/cedar-policy/cedar-spec/pull/138
| pjmlp wrote:
| Or use F* and then generate F# code: https://www.fstar-
| lang.org/
| gmadsen wrote:
| Ada is gaining popularity in safety critical systems
| gilcot wrote:
| Ada/SPARK is already popular in safety critical systems
| jerf wrote:
| "Code correctness is a lost art."
|
| No, it really isn't. It's doing better now than it ever
| has... and I mean _all_ the negative implications of that. It
| was not an art that was ever "found" in the sense you mean.
| kccqzy wrote:
| Formal verification is often not the best tool for ensuring
| code correctness from an ROI perspective. Things like unit
| tests (including property based tests) and ensuring 100% code
| coverage often achieve adequate results with less effort.
| jlouis wrote:
| The key point is that most software doesn't need correctness
| to the level formal verification provides. There's a subset
| of software however, for which there's no substitute for a
| formal verification process.
| aSanchezStern wrote:
| Additionally, formal verification usability is an area of
| constant research, and the set of software for which it
| is the best ROI increases over time.
| aeonik wrote:
| Have you looked into Idris2 at all? While looking into these
| theorem provers, it always felt like they had an impedance
| mismatch with normal programming.
|
| Idris2 purports to be a general-purpose language that also has
| a more advanced type system for theorem proving.
|
| https://github.com/idris-lang/Idris2
| atomicnature wrote:
| That's interesting; no I wasn't aware of it. Will check it
| out sometime, thanks.
| iso8859-1 wrote:
| There's a list of packages here: https://github.com/stefan-
| hoeck/idris2-pack-db/blob/main/STA...
| samvher wrote:
| I had the same positive experience with Software Foundations.
|
| There is another book somewhat derived from it (if I understand
| correctly) using Agda instead of Coq: https://plfa.github.io/
|
| I haven't had the chance to go through it yet, but it's on my
| list - I think Agda (and as mentioned by another commenter,
| Idris) is likely to feel more like a programming language than
| Coq.
| SushiHippie wrote:
| For people that know neither (like me 5 minutes ago):
|
| >Lean4
|
| > Lean is a functional programming language that makes it easy to
| write correct and maintainable code. You can also use Lean as an
| interactive theorem prover.
|
| https://lean-lang.org/about/
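|
| A tiny, hedged illustration of "programming language plus
| theorem prover" in one file (my example, not from the linked
| page):
|
|     def double (n : Nat) : Nat := 2 * n
|
|     -- A machine-checked fact about the program above.
|     theorem double_add (a b : Nat) :
|         double (a + b) = double a + double b := by
|       simp [double, Nat.mul_add]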
|
| > Terence Tao
|
| > [...] is an Australian mathematician. He is a professor of
| mathematics at the University of California, Los Angeles (UCLA),
| where he holds the James and Carol Collins chair.
|
| https://en.wikipedia.org/wiki/Terence_Tao
| antonioevans wrote:
| Fields Medal winner.
| c7b wrote:
| It's correct that he is a professor at UCLA, but it's also
| worth mentioning that he's regularly called nicknames like
| 'greatest mathematician alive' (just try googling that phrase):
| https://academicinfluence.com/rankings/people/most-influenti...
| meowface wrote:
| Famous photo of him studying with Paul Erdos at age 10:
| https://i0.wp.com/blogs.ams.org/blogonmathblogs/files/2015/0...
| yodsanklai wrote:
| What would make him greater than other Fields laureates?
| jyunwai wrote:
| Beyond the incredible quality and quantity of his work
| starting from early in his life, what makes Terence Tao
| memorable to me is his approachability and willingness to
| write advice for mathematicians and math students in a
| blog: https://terrytao.wordpress.com/career-advice/
|
| He also has an active Mastodon, which further makes him
| more approachable: https://mathstodon.xyz/@tao
|
| It's rare to see a professional at the top of an academic
| field remain so encouraging to other people in the field,
| and work to make their work accessible to colleagues across
| different levels of mathematical maturity.
| ndsipa_pomu wrote:
| Other mathematicians' opinions of him and his work. He's
| very prolific and was considered a child prodigy.
| rey0927 wrote:
| some claim his intelligence is the highest ever measured
| throw_pm23 wrote:
| He's a great mathematician all right, but that is a very
| silly way to put it.
| speed_spread wrote:
| He's gonna fix EVERYTHING. And he's gonna do it in ONE WEEK!
| readyplayernull wrote:
| You know, we have this thing in our company, when someone
| performs so well they can do other people's work.
| rey0927 wrote:
| if he helps in building AGI then yes
| alexisread wrote:
| For those that don't get the reference, president Camacho
| from Idiocracy :)
| ben_w wrote:
| I'm sure some people do indeed say that, as that has also
| been said about most famous smart people.
|
| However, from my experience as someone who scored "off the
| charts" on a UK school Cognitive Abilities Test and 148 on an
| IQ test: any score over 130 is suspect, as basically nobody
| is collecting enough data to make the tests statistically
| valid once you're more than 2σ from the mean.
| dooglius wrote:
| What do you mean by statistically valid? Are you just
| saying that the map between the raw scores and a z-score is
| questionable (and thus comparing scores between different
| tests is invalid), or are you making a deeper claim?
| ben_w wrote:
| Both, I think. However I would also add that my
| understanding is second-hand, reading and watching what
| others have to say rather than any deep academic research
| of my own.
|
| Every source I've read or watched agrees the sample
| sizes used when creating the tests in the first place
| just aren't large enough to be all that confident beyond
| 2σ.
|
| There's also the problem that training on tests is
| effective, which limits the ability of a test to be
| validated on those taking it fairly soon after it gets
| published.
|
| A further issue is that most IQ tests pre-suppose
| the existence of a G-factor and treat differences between
| types of intelligence as noise rather than a signal for
| the existence of any multi-factor variation. I don't know
| which hypothesis is correct (I lean towards thinking it's
| _mostly_ a G-factor, but there are clearly some things
| which can only be explained by multi-factor models, e.g.
| where a skill is entirely absent such as in the case of
| dyscalculia).
| oh_sigh wrote:
| He's really really really good at math, but is he really
| really really good at a bunch of other unrelated fields too?
| It seems silly to imply that math skills are all there is to
| intelligence.
| rey0927 wrote:
| I think you don't actually know about the g-factor.
| Basically, what these so-called 'intelligence tests' measure
| is the rate at which one can absorb information and apply it
| abstractly: how you can come up with solutions to problems
| by doing high-level abstractions, finding patterns,
| connecting dots, and finding things that no one thought of.
| What this means is that even though he might not know how to
| do other things at first, once he starts delving into them
| he can quickly learn them and really excel at them. What
| these tests measure is the g factor: if you are good at one
| thing you'll also be good at other things. I'm not making
| this up; there is loads of evidence for this. The most
| rigorously tested idea in the field of psychology/
| psychometrics is intelligence. If you had to choose one
| single concept out of the whole field it would be
| intelligence. You can read about the g-factor and how they
| measure it, all the theory and stats, and then tell me what
| you think.
| oh_sigh wrote:
| I'm familiar with g-factor.
|
| But why is it that the people who always score the
| highest on IQ tests contribute the most to math, and only
| a little bit to all the other subjects under the sun?
|
| Why aren't they also creating the greatest art, growing
| the greatest vegetables, and, I don't know, designing the
| greatest structural engineering designs in the world?
| chx wrote:
| Tao got his PhD at the age of 21 and a tenured professorship
| at the age of 24. To compare, at the age of 21 ordinary people
| only get their bachelor's, and typically it's at 23-24 that
| they get their master's degree. A PhD takes several more years,
| and of course very few become tenured professors even more
| years later.
| pjmlp wrote:
| Depends on the country. Before Bologna, Portuguese degrees
| took 5 years, so 23-24 would be the age of finishing the
| degree; given that a PhD without an MSc wasn't possible back
| then, having everything done meant being at least 30.
| n4r9 wrote:
| In the other direction, in the UK it's quite possible to
| have your bachelors by 21, masters by 22 and PhD by 25. I
| had my mathematics PhD by 26 and am not a remarkable
| mathematician.
| alliao wrote:
| I'm sure you're remarkable enough congratulations for
| your hard work
| n4r9 wrote:
| Not remarkable enough to land a post doc and stay in
| academia. But thank you :)
| poulpy123 wrote:
| In France, if you follow the traditional school schedule,
| the age to have a PhD is 26.
| NotYourLawyer wrote:
| Yeah but since he did all this in the USA, I'm not sure why
| that's relevant.
| pjmlp wrote:
| It isn't, I was making the point university degrees
| aren't the same everywhere.
| malwrar wrote:
| Dumb question: how do people skip years like this in college?
| Like suppose you do know the material well already, does that
| just get recognized by the faculty at some point (you are
| noticed pre-college and get a faculty member interested in
| you? In college?) or would he have needed to specifically
| apply for that in some way?
|
| I ask in jealousy, I felt like my college put as many
| administrative barriers as possible to ensure they got at
| least 4 years of tuition out of me. I've always wondered how
| people somehow managed to speedrun to the top like this,
| rather than just being bored by their classes (esp gen-ed
| classes ugh) until they reach top rank the normal way.
| waythenewsgoes wrote:
| He had a lot of acclaim as a young mathematician, according
| to Wikipedia he was taking college level math at 9 years of
| age. At this level of talent I believe you begin to be
| courted by top universities, similar to D1 athletes. Not
| only that, but you are likely genius-level smart, so
| sticking you in a classroom of people your age would be
| doing you an intellectual disservice, similar to sticking a
| college freshman in a 2nd grade classroom. At this level,
| you are probably seen as the next Newton or Gauss, and
| professors will want to attach themselves to you and work
| alongside you. At this point you bring a ton of value to
| the school just for attending and they won't mind if you
| are testing out of course material since they are just
| happy for you to be there for any length of time, and it
| just becomes more of a formality anyway.
| Scaless wrote:
| For non-geniuses like myself, you can just ask to test out
| of some of the lower tier introduction courses. I got a
| typical 4-year degree in 3 years this way. It's called
| credit-by-examination.
|
| I'm sure for the prodigy-level students there is an even
| higher streamlined process for keeping them engaged.
| dllthomas wrote:
| It wasn't "like this" but I just (with parental support)
| enrolled in the local community college after junior high,
| took a light load of college work, and got a homeschool
| program to sign off until I could test out at 16. Did 2
| years worth of college work over the 4 years that would
| have been highschool, then transferred to a "proper"
| university as a junior at 18.
| chx wrote:
| In the 90s I sort of attended two universities at the same
| time in Hungary (given the time and the place it's not as
| big a deal as it sounds) and I said to the second one, hey
| I know the basic stuff, let me attend the more advanced one
| and they said, sure, no skin off my back, do whatever you
| want, if you fail, that's a you problem but if you pass the
| more advanced tests we will credit you the basic stuff.
| pas wrote:
| Kevin Buzzard (a professional mathematician) has a few good
| talks about Lean
|
| https://youtu.be/Dp-mQ3HxgDE from 2019 at MS Research
|
| https://www.youtube.com/watch?v=SEID4XYFN7o&t=4m35s (2022 at
| ICM international math congress)
| JonChesterfield wrote:
| Youtube has forgotten what "audio" is for me so noting the
| first talk is also at https://www.microsoft.com/en-
| us/research/video/the-future-of... - which is using youtube -
| but manages to make noises. Which I suppose is a pointwise
| instance arguing that software does not work as well as it
| might do.
| j7ake wrote:
| Saying Terence Tao is a math prof at UCLA is like saying John
| Carmack is a programmer at Keen Technologies.
| jvm___ wrote:
| If you're going for a PhD or researching some obscure,
| furthest-reach study of some mathematical principle and you
| get stuck, one of the ways to move forward is to hope that
| Terence Tao finds your topic interesting and can give you a
| few minutes of his time - because he'll probably have some
| ideas for moving your research forward.
| daynthelife wrote:
| Conversely, there's a running joke that you're in big
| trouble if Terry Tao takes too much of an interest in your
| topic of study, since he will inevitably solve all but the
| hardest problems in the field.
| btreecat wrote:
| So that's an excuse for such a low context post?
| hiddencost wrote:
| Yes. Not knowing who Terry Tao is, if you're on HN, makes
| you an outlier.
| LoganDark wrote:
| Hi, it's me, I'm the outlier, it's me
| omginternets wrote:
| Welcome! You now know who Terence Tao is :)
| wholinator2 wrote:
| Not any more
| thornewolf wrote:
| For the HN crowd, explaining who Terence Tao is would be
| like explaining who Obama is
| fifilura wrote:
| I am sure 100% of thornewolf and 99% of people reading
| the same news as you know who Terence Tao is, but
| extrapolating from your own perspective is treacherous.
|
| https://xkcd.com/1053/
| LoganDark wrote:
| https://xkcd.com/2501
| swatcoder wrote:
| This may be hard to believe, but a lot of us here in "the
| HN crowd" have zero interest in celebrity and do not keep
| a catalog of them all in our heads. Even intellectual
| ones. There's nothing wrong with that, but it's also just
| its own interest one may or may not have.
|
| People approach things in a lot of different ways and it
| would be nice if we can just respect that instead of
| digging on each other or making unfounded assumptions.
| ndsipa_pomu wrote:
| I don't have much interest in celebrity either, but I
| feel that you're short-changing Terence Tao. He's not
| really famous due to being celebrated, but he's
| celebrated due to his level of skill and breadth of
| mathematical knowledge (sometimes known as the Mozart of
| Mathematics).
|
| I think your sentiment is misplaced as I would expect HN
| commenters to have heard of Tao as he often features in
| posts about mathematics. I'm sure I recall seeing some
| comments from him too.
|
| There's an approachable Numberphile video featuring him
| here: https://www.numberphile.com/videos/the-worlds-best-
| mathemati...
| VirusNewbie wrote:
| I started to agree with you, but let's keep in mind that
| there are new generations of tech geeks out here who are
| just learning about the things we live and breathe.
| andrewflnr wrote:
| You can't expect the entire Internet to always provide
| enough context for you personally, with your exact level of
| knowledge, to understand a social media post with no
| additional effort. How do you expect that to work at scale?
| Meanwhile, Google is just a tab away.
| omginternets wrote:
| Lack of culture on your part?
| opan wrote:
| I'm relieved that Lean4 doesn't seem to be some AI thing.
| zozbot234 wrote:
| Automated reasoning is usually included in AI (Lean4 has
| tactics so it qualifies as such) but it's also quite
| different from the ML stuff that's the more usual kind of AI
| these days.
| alexvitkov wrote:
| Things usually stop being considered as AI when they start
| working.
| rprenger wrote:
| You can have both! LeanDojo: Theorem Proving with
| Retrieval-Augmented Language Models
| https://arxiv.org/abs/2306.15626 (shameless plug)
| julianeon wrote:
| Now that you mention it, Tao did say he used ChatGPT to help
| teach himself Lean.
|
| https://x.com/8teapi/status/1713867160886599920?s=46&t=4jd61...
| svat wrote:
| Instead of a twitter post of screenshots of it, the direct
| link to Tao's Mathstodon is: https://mathstodon.xyz/@tao
|
| He's been posting about it since Oct 9:
| https://mathstodon.xyz/@tao/111206761117553482
|
| > _I have decided to finally get acquainted with the #Lean4
| interactive proof system (using AI assistance as necessary
| to help me use it), as I now have a sample result (in the
| theory of inequalities of finitely many real variables)
| which I recently completed (and which will be on the arXiv
| shortly), which should hopefully be fairly straightforward
| to formalize. I plan to journal here my learning process,
| starting as someone who has not written a single line of
| Lean code before._
|
| There are several posts mentioning how GPT has been useful
| (though not always) at suggesting things, including one
| linking to this transcript with ChatGPT:
| https://chat.openai.com/share/857353ad-a05b-4f9e-bb10-55f15a...
| sebzim4500 wrote:
| There is a project to connect GPT-4 and Lean, but I doubt
| they will make a lot of progress until they can finetune
| GPT-4.
|
| https://www.youtube.com/watch?v=CEwRMT0GpKo
| jagrsw wrote:
| > where he holds the James and Carol Collins chair
|
| What is the academic impact of holding an endowed chair like
| the 'James and Carol Collins chair'? Is it related to any
| specific advantages or responsibilities? Those titles seem like
| they serve as a form of recognition for donors, is there a
| deeper significance behind them?
| tnecniv wrote:
| Often, no, but some are particularly prestigious due to the
| lineage of individuals who had held the chair. Others have
| some special requirements like any endowment. Mostly, it is
| not really meaningful though.
| chongli wrote:
| The advantage is that the chair usually comes with a bunch of
| money which you can use to fund grad students and pay for
| research. The latter is of course much more important in
| experimental sciences and engineering than it is in math.
| Still, mathematicians often do hire grad students to do grunt
| work for them.
| GEBBL wrote:
| Is it possible that small bugs or assumptions in a root paper
| could cascade through referencing papers leading to wildly
| inaccurate outcomes 5 or 6 papers down the line?
| gridentio wrote:
| https://proofwiki.org/wiki/False_Statement_implies_Every_Sta...
| thfuran wrote:
| I work in medical software and a few years ago fixed a bug that
| was an incorrect parameter value in a model used for computing
| diagnostic criteria that had proliferated throughout the
| literature after seemingly being incorrectly transcribed in one
| influential paper. The difference was relatively small, but it
| did make results somewhat worse.
| isaacfrond wrote:
| More likely, wildly inaccurate outcomes will cause a re-
| examination of the cited theorems, which will probably flush
| out the bug.
|
| By the way, it is not clear to me if the theorem was false or
| if only the proof was wrong.
| eigenket wrote:
| The theorem was mostly correct. As stated it was false, but
| it was true for n >= 8. If you change some not very
| interesting constants it becomes true for all n. All you need
| to change is the constants for n < 8.
| lmm wrote:
| In theory yes. In practice mathematicians tend to have a good
| instinct for which things are true (although not always - some
| false theorems stood for decades if not centuries) and will
| avoid looking into ideas that don't pass the sniff test. Plus
| if you keep building on the consequences of a false result then
| you'll likely eventually reach a contradiction, which might
| inspire you to spot the bug in the original result.
| pbhjpbhj wrote:
| If mathematicians have "a good instinct" does that suggest
| that their brains can somehow apply an informal proof?
|
| I wonder if they are good, or if it's
| selection/confirmation/other biases that lead us to think
| so.
|
| Also, aren't surprising results the most interesting!?
| eigenket wrote:
| In an informal sense disproving things is easier than
| proving things. For example theorems often say things like
| "all objects of type Y have property X"; it's very difficult
| to work out even where to start proving such a statement
| unless you're an expert in Y and X, but to disprove it all
| you have to do is find some example Y which doesn't have
| property X. If you've worked with Y things for a while you
| probably have a bunch of "pet" Y objects you can think of
| and if someone proves something you kinda automatically
| check to see what their proof says about the ones you're
| familiar with.
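|
| In Lean 4 terms, a hedged sketch of "one pet object is
| enough to disprove":
|
|     -- Claim: every natural number is less than its square.
|     -- n = 1 is a counterexample, and `decide` checks it.
|     example : ¬ (∀ n : Nat, n < n * n) := fun h =>
|       absurd (h 1) (by decide)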
| Ar-Curunir wrote:
| > In an informal sense disproving things is easier than
| proving things.
|
| Note that this is not true in general, and depends on the
| type of theorem. The idea is that while it's easy to show
| why a _particular_ proof is incorrect, it's much more
| difficult to show that _every_ proof is incorrect.
|
| Formally, this idea is captured by CoNP, which is
| believed to be different from NP and hence a strict
| superset of P.
| drdeca wrote:
| > which is believed to be different from NP and hence a
| strict superset of P
|
| "a strict superset of P" doesn't really follow from that.
| P is also believed to be different from NP, and P is
| certainly not a strict superset of P.
|
| Of course, I assume you just misspoke slightly, and that
| whatever it is that you actually meant to say, is
| correct.
|
| E.g. maybe you meant to say "coNP is believed to both be
| strict superset of P (as is also believed about NP), and
| distinct from NP."
|
| I think it is believed that the intersection of NP and
| coNP is also a strict superset of P, but that this is
| believed with less confidence than that NP and coNP are
| distinct?
|
| I imagine I'm not telling you anything you don't already
| know, but for some reason I wrote the following
| parenthetical, and I'd rather leave it in this comment
| than delete it.
|
| (If P=NP, then, as P=coP, then NP=P=coP=coNP , but this
| is considered unlikely.
|
| It is also possible (in the sense of "no-one has yet
| found a way to prove otherwise" that coNP = NP without
| them being equal to P.
|
| As a separate alternative, it is also possible (in the
| same sense) that they are distinct, but that their
| intersection is equal to P.
|
| So, the possibilities: P = NP = coNP; P ≠ (NP∩coNP) = NP =
| coNP; P = (NP∩coNP) ≠ NP, coNP; all 4 are different
| (NP∩coNP being the intersection of NP and coNP).
|
| Where the last of these is I think considered most
| likely?)
| eigenket wrote:
| This is a beautifully and slightly ridiculously formal
| comment to read two comments deep in response to my
| comment beginning "In an informal sense".
| eigenket wrote:
| Yes! I work in quantum information theory and this recently
| happened in a subfield called resource theories.
|
| The generalised quantum Stein's lemma [1][2] is (or was) a very
| powerful result that was used for over 10 years to prove things
| in this subfield. However last year it was noticed that there
| was an error in the proof of this lemma, and a whole bunch of
| results based on it weren't valid [3][4]. One of the authors of
| the paper where they wrote about the error gave a talk at the
| conference QIP 2023 this year, and there is a video of that
| talk available here [5]. Bartosz is a good speaker and I
| recommend watching the talk if you're interested, if you go to
| about 10 minutes, 30 seconds in the talk he discusses the
| consequences of this result now being not known to be true.
|
| [1] Published paper
| https://link.springer.com/article/10.1007/s00220-010-1005-z
|
| [2] Arxiv version: https://arxiv.org/abs/0904.0281
|
| [3] Published version: https://quantum-
| journal.org/papers/q-2023-09-07-1103/
|
| [4] Arxiv version: https://arxiv.org/abs/2205.02813
|
| [5] Youtube link:
| https://www.youtube.com/watch?app=desktop&v=2Xyodvh6DSY
| zmgsabst wrote:
| This is arguably what caused the push for formalism in
| mathematics:
|
| Multiple papers on calculus claimed results about continuity
| and derivatives, but were using subtly different definitions.
|
| The conflict between those results, and the counter-examples to
| demonstrate the difference, led to mathematicians building the
| modern machinery around proofs.
|
| > The Weierstrass function has historically served the role of
| a pathological function, being the first published example
| (1872) specifically concocted to challenge the notion that
| every continuous function is differentiable except on a set of
| isolated points. Weierstrass's demonstration that continuity
| did not imply almost-everywhere differentiability upended
| mathematics, overturning several proofs that relied on
| geometric intuition and vague definitions of smoothness.
|
| https://en.wikipedia.org/wiki/Weierstrass_function
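|
| For reference, the function from the linked article is
|
|     W(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x)
|
| with 0 < a < 1 and b a positive odd integer such that
| ab > 1 + 3\pi/2. The a^n factor forces uniform convergence,
| hence continuity, while the wildly oscillating cos(b^n \pi x)
| terms destroy differentiability at every point.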
| Dudester230602 wrote:
| I was worried that Lean4 is yet another LLM, but it's actually
| some hard and reliable stuff.
| woolion wrote:
| "Terry Tao finds ChatGPT very helpful to prove new theorems"
| would actually be bigger news than this one, IMO.
| throwalean wrote:
| "Terry Tao finds ChatGPT very helpful to formally verify his
| new theorems" seems to be a true statement. See some of his
| other recent mathstodon toots.
| woolion wrote:
| The point is that LLMs and similar tools tend to be very
| good at automating the trivial but not very useful at what
| would be considered really "interesting" work.
|
| So while your point is somewhat true [0], as he mentions
| that these tools could become good enough to do the formal
| verification part, it's precisely not the interesting part.
| See [1] and [2]; in particular some things that are very
| easy to do in real maths can be very challenging in an
| automated theorem prover, quoting from [2]:
|
| >In the analyst's dialect of Mathematical English, this is
| a one-line proof, namely "by the standard limiting
| argument". Unpacking this into a formal proof required me
| to go through a fair bit of the Mathlib documentation [...]
|
| It's impressive to be able to do such mathematics in
| Lean/Coq/etc. at all, but it is very tedious mechanical work
| [3].
|
| >It was more tedious than I expected, with each line of
| proof taking about an hour to formalize
|
| So I think that rather proves the point of what LLMs are
| currently good for, and what tools can help for really
| difficult tasks, rather than invalidate it.
|
| [0] https://mathstodon.xyz/@tao/111305365372766606
|
| [1] https://mathstodon.xyz/@tao/111305336701455719
|
| [2] https://mathstodon.xyz/@tao/111259986983504485
|
| [3] https://mathstodon.xyz/@tao/111305336701455719
| jamesrom wrote:
| Anything that helps Terence Tao find bugs in his papers is hard
| and reliable in my books.
| OscarCunningham wrote:
| He is finding Copilot useful to help with the formalisation:
| https://mathstodon.xyz/@tao/111271244206606941.
| ykonstant wrote:
| The Lean 4 community is quite optimistic about combining LLMs
| and theorem provers for proof assistance and formalization.
| deterministic wrote:
| Lean4 is brilliant. Worth digging into as a programmer. Coq,
| Lean4, Agda etc. made my brain explode in a good way, making
| me a better software developer.
| gridentio wrote:
| I'm kind of interested in how useful Lean4 is as a programming
| language, and if it's easy to prove things about a program
| written in Lean. I should probably look into that when I have a
| minute.
| ykonstant wrote:
| Regarding usefulness: Lean is very nice to program in, if you
| care about pure functional languages; its FFI allows you to
| incorporate fast C routines very easily if pure Lean is not
| performant enough or lacks features. However, in some
| domains, Lean is within a decimal order of magnitude of (not
| hand-optimized) C; some benchmarks I wrote by hand recently
| impressed me.
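|
| For flavor, a minimal standalone Lean 4 program (my sketch,
| not one of the benchmarks mentioned):
|
|     def fib : Nat → Nat
|       | 0 => 0
|       | 1 => 1
|       | n + 2 => fib n + fib (n + 1)
|
|     def main : IO Unit := do
|       for i in [0:10] do
|         IO.println s!"fib {i} = {fib i}"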
|
| Regarding proving things about programs, no, it is not easy,
| and the developers do not seem to consider it a core goal of
| Lean.
| staunton wrote:
| > the developers do not seem to consider it a core goal of
| Lean
|
| I guess it depends on who you ask. The original devs of
| Lean wanted to do "everything" (because that's how you
| start projects, I guess). Since then it has attracted a lot
| of mathematicians (especially those working on Mathlib, a
| library that aspires to formalize "all known math") who are
| happy to have "algorithm objects" and prove things about
| them without being able to actually run an algorithm on any
| input.
|
| This goes together with mostly embracing classical logic
| (which breaks the original and most powerful version of
| Curry-Howard, which allowed you to extract programs from
| proofs). However, in practical situations, algorithms
| extracted in this way tend to be too slow to be useful, so
| maybe that's not actually a downside for programming
| purposes.
|
| Finally, Lean4 "compiles" to C-code, so at least it is (or
| can reasonably easily be made) portable. People have been
| trying to use it for real applications, like the AWS stuff
| others have linked in this thread.
| kmill wrote:
| The core Lean 4 developers do want proving properties about
| programs to be easy. In the short term maybe priorities
| have been elsewhere due to limited resources, but that
| doesn't mean they do not consider this to be a core goal.
| My understanding is that there are still some research-
| level problems here that need to be worked out. (Proving
| things about do notation is currently a real pain for
| example.)
| tux3 wrote:
| For people looking for an easy introduction to Lean4, the Natural
| Number Game is great: https://adam.math.hhu.de/#/g/hhu-adam/NNG4
|
| And if you just want to read without playing a game:
| https://lean-lang.org/theorem_proving_in_lean4/introduction....
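|
| The flavor of the game, in plain Lean 4 (a hedged sketch; the
| game itself rebuilds Nat from the Peano axioms):
|
|     example (n : Nat) : n + 0 = n := rfl
|
|     -- The mirror-image fact needs induction, as in the game.
|     example (n : Nat) : 0 + n = n := by
|       induction n with
|       | zero => rfl
|       | succ k ih => rw [Nat.add_succ, ih]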
| ohduran wrote:
| Noob here, how is Lean4 different from TLA+ or Alloy? Is that
| even a reasonable comparison?
|
| Edit: Wrote Allow the first time, good stuff.
| tux3 wrote:
| I'd say TLA+ is designed more for software people trying to
| design systems, write down specs, and reason about how the
| thing behaves dynamically.
|
| Lean is used mostly for writing down math proofs, and a lot
| less for software (although by the Curry-Howard
| correspondence, math proofs and programs have an equivalence,
| so the line is a little blurry). Lean has "mathlib", which is
| like a standard library of formally verified math that people
| can contribute to and use in new proofs.
|
| A big multi-year effort to start formalizing the proof of
| Fermat's Last Theorem in Lean 4 was approved recently:
| https://www.ma.imperial.ac.uk/~buzzard/xena/pdfs/AITP_2022_F...
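|
| A tiny instance of that correspondence (a sketch): in Lean 4,
| the proof of "A and B implies B and A" is literally a function
| that swaps the two components of a pair.
|
|     theorem and_swap {A B : Prop} : A ∧ B → B ∧ A :=
|       fun ⟨ha, hb⟩ => ⟨hb, ha⟩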
| ohduran wrote:
| Thank you so much for your answer
| kmill wrote:
| Leo de Moura wants Lean 4 to be used for software
| verification too.
|
| A cool thing about Lean 4 is that it's also a programming
| language, using the same syntax as for proofs, making it
| easy to consider proving correctness properties of programs
| you write. Most of Lean 4 and its tactics are written in
| Lean 4 (though at this point almost none of this code has
| any associated proofs).
| dunham wrote:
| Yeah, I believe they said they intend for it to be used
| as a general purpose programming language. I used it to
| complete Advent of Code last year.
|
| There are some really interesting features for general
| purpose programming in there. For example: you can code
| updates to arrays in a functional style (change a value,
| get a new array back), but if the refcount is 1, it
| updates in place. This works for inductive types and
| structures, too. So I was able to efficiently use C-style
| arrays (O(1) update/lookup) while writing functional
| code. (paper: https://arxiv.org/abs/1908.05647 )
|
| Another interesting feature is that the "do" blocks
| include mutable variables and for loops (with continue /
| break / return), that gets compiled down to monad
| operations. (paper:
| https://dl.acm.org/doi/10.1145/3547640 )
|
| And I'm impressed that you can add to the syntax of the
| language, in the same way that the language is
| implemented, and then use that syntax in the next line of
| code. (paper: https://lmcs.episciences.org/9362/pdf ).
| There is an example in the source repository that adds
| and then uses a JSX-like syntax.
| (https://github.com/leanprover/lean4/blob/master/tests/playgr...)
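|
| A hedged sketch of those first two features together (function
| names are mine): `Array.set!` is functional-style but updates
| in place when the reference count is 1, and the do block uses
| a mutable variable with a for loop.
|
|     def bump (xs : Array Nat) (i v : Nat) : Array Nat :=
|       xs.set! i (xs.getD i 0 + v)
|
|     def main : IO Unit := do
|       let mut xs := Array.mkArray 5 0
|       for i in [0:5] do
|         xs := bump xs i i
|       IO.println xs.toList  -- [0, 1, 2, 3, 4]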
| pjmlp wrote:
| Yeah, here is where TLA+ fails to gain adoption, because
| it is only a model, how people actually implement the
| architecture is completely unrelated to how the model was
| validated.
| forward-slashed wrote:
| Also check out Morph Labs, which is working with Lean to create
| an AI proof assistant. Cool startup by ex-OpenAI folks.
|
| Essentially, the strong type system of Lean can help with
| constrained generation. Thus every token would always lead to
| some valid (if not correct) proof in Lean, iiuc. Maybe people @
| Morph can comment.
|
| https://x.com/morph_labs
| cubefox wrote:
| I wonder whether we could combine formal proof checkers (like the
| Lean proof checker) with language models that generate synthetic
| conjecture-proof pairs in a formal language like Lean.
|
| The Lean proof checker could be used to automatically verify
| whether the synthetic proofs written by the language model are
| correct. This information could be used to provide an RL reward
| signal applied to the original language model, which would result
| in it writing better proofs. (Or we train a new model using the
| correct synthetic proofs of the previous round as training data.)
| And then the process repeats. So the model would self-train using
| its synthetic training data, without further human intervention.
|
| We could even make this process more adversarial. First we split
| the generator language model into two: One which generates
| conjectures, and one which tries to prove/disprove them in Lean.
| Then add a predictor model which tries to predict whether a
| synthetic proof is verified by the Lean proof checker. The lower
| the predicted probability that the proof will be correct, the
| more reward gets the proof-generator model if it did indeed
| provide a correct proof.
|
| Finally, we add another model which tries to predict the reward
| the proof-generator model will get for a given synthetic
| conjecture. Then the conjecture-generator model is rewarded for
| conjectures that are predicted to yield a high reward in the
| proof-generator model. So conjectures that are neither too hard
| nor too easy for the proof-generator model.
|
| So we would expect that the whole system would progressively
| create harder and harder synthetic proofs, which in turn allows
| for better and better self-training of the proof-generator.
|
| It seems this could in principle scale to superhuman ability in
| generating proofs. The process would be somewhat similar to
| GANs or to self-play in AlphaGo Zero.
|
| I think the hard part is the initial bootstrapping part, to get
| the whole process off the ground. Because the initial training of
| the generator models has to be done with human provided training
| data (Lean proofs). But once the synthetic proofs are good
| enough, the system would self-train itself automatically.
| ykonstant wrote:
| This workflow is an explicit goal of the Lean 4 devs. The
| official Zulip chat has a channel dedicated to the interface
| between the two:
|
| https://leanprover.zulipchat.com/#streams/219941/Machine%20L...
| cubefox wrote:
| That would be interesting, though this link requires an
| account.
| staunton wrote:
| There are people doing this already. For Lean, an example is
| https://morph.so/blog/the-personal-ai-proof-engineer/
| cubefox wrote:
| If I understand this correctly, they want to use an AI model
| for "autoformalization", which sounds like "using a language
| model to translate natural language proofs into Lean proofs".
| Which is cool but much less ambitious than the self-training
| system I described above. I guess AI technology isn't yet far
| enough to make such a proposal work.
| aSanchezStern wrote:
| Sounds like you've stumbled into the wonderful world of
| machine-learning guided proof synthesis! While I don't think
| the _full_ system you 're describing has been built yet, many
| similar systems and pieces have. In terms of the initial phase
| of supervised learning on existing proofs to prove new ones,
| there's TacticToe (https://arxiv.org/abs/1804.00596), Tactician
| (https://arxiv.org/pdf/2008.00120.pdf), CoqGym/ASTactic
| (https://arxiv.org/abs/1905.09381), Proverbot9001
| (https://arxiv.org/abs/1907.07794), and Diva
| (https://dl.acm.org/doi/10.1145/3510003.3510138#sec-terms),
| among others. Most of these have some sort of language model
| within them, but if you're specifically looking for the LLM's
| that have been big recently, there's GPT-f
| (https://arxiv.org/abs/2009.03393), Baldur
| (https://arxiv.org/abs/2303.04910), and COPRA
| (https://arxiv.org/abs/2310.04353), though currently these
| models don't seem as effective as the specialized non-LLM
| tools. In terms of using reinforcement learning to learn beyond
| human written proofs, there's TacticZero
| (https://openreview.net/forum?id=edmYVRkYZv), this paper from
| OpenAI (https://arxiv.org/pdf/2202.01344.pdf), rlCoP
| (https://arxiv.org/abs/1805.07563), the HOList line of work
| (https://arxiv.org/pdf/1905.10006.pdf), and HyperTree Proof
| Search (https://arxiv.org/abs/2205.11491), as well as some in
| progress work I'm working on with a team at the University of
| Massachusetts.
| omneity wrote:
| That one of the brightest minds of our generation is able to
| increase his bandwidth with the combination of LLMs and automated
| proofs makes me super bullish on this tech combo in the future!
|
| It starts with bug-fixing, then supports verification, until it
| starts propelling new discoveries and pushes the envelope.
|
| We need a term for when a dynamic like Moore's Law "infects"
| a field that had no such compounding properties before.
|
| EDIT:
|
| There's additional context that Terence Tao is using Copilot to
| help him learn Lean. As shared by adbachman:
| https://mathstodon.xyz/@tao/111271244206606941
|
| Could Terence have done it without Copilot? Sure, but like many
| of us he might not have initiated it due to the friction of
| adopting a new tool. I think LLM tech has great potential for
| this "bicycle for the mind" kind of scenarios.
| aylmao wrote:
| Lean 4 is a programming language and theorem prover, and has
| nothing to do with LLMs as far as I know though.
| adbachman wrote:
| the missing context from the previous comment is that Tao
| used GitHub Copilot to help with learning Lean.
|
| He's been writing about it as he goes, most recently:
| https://mathstodon.xyz/@tao/111271244206606941
| omneity wrote:
| Thanks for adding the clarification. I thought this was
| more common knowledge. I will update my comment
| accordingly.
| passion__desire wrote:
| Not to beat the point: LLMs are compilers for English
| (natural) language.
| ocfnash wrote:
| You can even follow his progress on GitHub here:
|
| https://github.com/teorth/symmetric_project/
| user3939382 wrote:
| I'm really excited about dependent types. I'm expecting we won't
| get them for a while though. Dependent Haskell is progressing but
| apparently it's hard to retrofit. Idris' own creator has said he
| expects it to be a model for other languages; I don't think it
| will ever have mainstream adoption. Coq, Agda, and F* aren't
| really designed to be general purpose.
|
| Although the implementation for the compiler is complex, and the
| syntax can get complex and verbose, my requirement is
| simple: I want to encode everything about input and output that I
| know.
|
| Right now in mainstream languages I often know more about my
| arguments or output than the type system will allow me to
| specify.
| mettamage wrote:
| Eli5 dependent types?
|
| Chatgpt:
|
| Dependent types are a concept in computer science and
| programming languages where the type of a variable can depend
| on the value of another variable. In simpler terms, imagine you
| have a list of numbers and you also have information about how
| long that list is. With dependent types, you could create a
| type for the list that explicitly includes its length, ensuring
| at compile time that operations respect this length. This makes
| it possible to catch certain kinds of errors before the code
| even runs.
|
| For example, in a language with dependent types, you could
| specify a function to take a list of length 3 and no other
| sizes. If you tried to pass a list of length 4, the program
| wouldn't compile, preventing this kind of mistake early on.
|
| It's a bit like having an extra layer of safety checks, where
| the types are more expressive and can encode more intricate
| relationships between variables.
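|
| Rendering the length-3 example in Lean 4 (my sketch, not
| ChatGPT's; `Vec` is defined here rather than taken from a
| library):
|
|     inductive Vec (α : Type) : Nat → Type
|       | nil  : Vec α 0
|       | cons : α → Vec α n → Vec α (n + 1)
|
|     -- Only accepts vectors whose type says "length 3";
|     -- passing a length-4 vector is a compile-time error.
|     def sum3 : Vec Nat 3 → Nat
|       | .cons a (.cons b (.cons c .nil)) => a + b + c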
| icen wrote:
| Another classic use case is that you can have expressions in
| the types of your function, for example this won't compile:
|     stringOrInt : (x : boolean) -> int -> (if x then String else int)
|     stringOrInt true x = toString x
|     stringOrInt false x = x + 1
|
|     1 + stringOrInt true 37
|     # this will error, because it knows you've not returned an int
|
| The other example that you can do in depedently typed
| languages, but is too involved to write out here, is make a
| type-safe printf, where the format string produces the types
| for the other arguments.
| howenterprisey wrote:
| I never care much for chatgpt answers. I don't know why
| people post them on here.
|
| In the first sentence, "another" is wrong because you don't
| need two variables, you just need one. Final paragraph's
| wrong for the same reason.
|
| The example given is poor given that I can write [i8; 3] or
| int[3] in Rust or C and those very much do not have
| "dependent types" in the popular sense (Rust's const generics
| excepted). To be fair, those examples are technically
| dependent types, but it would be better to give an example
| that's impossible in those languages, such as "array with
| length at most 3" or "even integer" (sketched below).
|
| Finally, to nitpick, "a bit like" is unneeded hedging.
|
| Stack Overflow did a much better job:
| https://stackoverflow.com/questions/9338709/what-is-
| dependen.... Wikipedia's article is pretty bad and maybe I'll
| get around to fixing it at some point.
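|
| For instance, the "even integer" type suggested above is a
| one-liner as a Lean 4 subtype (a sketch; the names are mine):
|
|     def EvenInt := { n : Int // n % 2 = 0 }
|
|     def four : EvenInt := ⟨4, by decide⟩  -- ⟨3, by decide⟩ won't compile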
| xeonmc wrote:
| ChatGPT = Cunningham bait
| howenterprisey wrote:
| Worked for me :<
| mettamage wrote:
| What does that mean?
| Jtsummers wrote:
| https://meta.wikimedia.org/wiki/Cunningham%27s_Law
|
| Cunningham's Law: Post something wrong online so you can
| get a correct answer.
|
| Half the ChatGPT answers on here seem to be wrong in
| obvious ways or, worse, subtle but critical ways. When
| people post them they get downvoted, and other people
| chime in with "Why are you trusting a fabulist like
| ChatGPT instead of going to actual resources with
| definitions and explanations that aren't garbage? Here's
| what it actually is..."
| gosub100 wrote:
| I wonder if posting the GPT answer to "what is cunningham's
| law" here would create a singularity in spacetime?
| gosub100 wrote:
| I don't see anything wrong with GPT's answer other than
| pedantic quibbling. I just don't like it because I don't
| want to share this space with non-humans.
| ChadNauseam wrote:
| I like ChatGPT's answer better than yours. Rust's const
| generics and C++'s non-type template parameters aren't
| dependent types in any sense. And saying "the type of one
| variable can depend on the value of another" is vague but a
| good intuition pump.
|
| Since I guess we're doing dependent types explanations, I'll
| give it a go. Dependent types extend a type system with two
| new type thingies: 2-tuples where the type of the second
| element depends on the value of the first (dependent pairs,
| a.k.a. Sigma types), and functions where the type of the
| function's return value depends on the value that was passed
| to it (dependent functions, a.k.a. Pi types).
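|
| (A minimal Lean 4 illustration of those two thingies, with
| the obvious caveat that real code needs more plumbing:)
|
|       -- Dependent function (Pi): the return *type* is
|       -- computed from the argument's value.
|       def defaultOf : (b : Bool) → (if b then String else Nat)
|         | true  => "hello"
|         | false => 0
|
|       -- Dependent pair (Sigma): the second component's type
|       -- mentions the first component's value.
|       def sized : (n : Nat) × (Fin n → Nat) :=
|         ⟨3, fun i => 2 * i.val⟩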
| mettamage wrote:
| Because I had the question and I figured this was quicker.
| I didn't know what dependent types were.
|
| So now you know why I do it. Also, I believe this is my
| first time doing it. I might be wrong.
|
| Is it better to ask and wait for an answer instead?
|
| There is nothing in the guidelines on HN about it. I don't
| know what's reasonable and I haven't seen strong cultural
| norms from HN yet. At least I labeled the text as coming
| from chatgpt, so as not to pass it off as my own.
| kccqzy wrote:
| Did you forget about search engines that enable you to do
| your own research rather than asking a human or an LLM?
| mettamage wrote:
| I could use google, but chatgpt wins on speed and asking
| on HN wins on vibe. When someone asks me a question, I am
| usually quite enthusiastic about explaining it. You can't
| get that as easily from a search engine.
|
| It all has trade-offs.
| kccqzy wrote:
| I don't fault you if you explicitly emphasize speed
| rather than accuracy. Just don't post these on HN.
| howenterprisey wrote:
| It's of course fine to ask chatgpt. I also appreciate
| that you wrote that it was from chatgpt.
|
| However, I wouldn't recommend posting the result here if
| you don't know if it's correct. Moreover, anyone can ask
| chatgpt themselves. It's better to wait for someone here
| to post an answer.
|
| Yes, there's nothing in the guidelines, but they're
| (deliberately) not all-encompassing. Besides, I would
| hope it's part of basic Internet etiquette; just because
| we now have access to tools that generate plausible-
| sounding but not necessarily correct answers to questions
| doesn't mean we need to post what they create in the
| place of genuine answers.
| gosub100 wrote:
| > Is it better to ask and wait for an answer instead?
|
| No, around here it's better to say "So dependent types
| are pretty much $SOMETHING_COMPLETELY_WRONG?" and wait
| for all the "corrections" (a.k.a. Cunningham's law, which
| someone linked to nearby).
| bjornasm wrote:
| That is really interesting. Wish there were a better name
| for it, as I feel like "dependent types" isn't that
| descriptive. However, their benefit seems really obvious:
| saying that something is an array, or even an array of
| integers, is barely half the picture if in reality it is an
| array of length 5 of even integers, to borrow from your
| example. I guess you would try to implement it in other
| languages by creating objects?
| turndown wrote:
| The simplest explanation is that dependent types are types
| whose exact definition uses some kind of parameterized value.
| So a dependently typed array would use a parameterized length
| value as part of its definition (this is usually the easiest
| example to understand). An array of length 4 (of whatever
| element type) could then be identified as different from an
| array of length 5, etc.
| valyagolev wrote:
| I totally share your excitement about dependent types, but it
| seems that, unlike with the type systems we're used to,
| theorems about dependently typed code are much harder to
| prove, which makes them not very comfortable to use for the
| whole program.
|
| If only there were some kind of gradual, perhaps TypeScript-
| like approach to adding arbitrary type-level value-limiting
| information in random places, without having to have
| everything proven everywhere...
| skulk wrote:
| Every non-dependent typing relation is also a dependently
| typed relation so I think things are already the way you
| want, unless you have a certain example in mind.
| valyagolev wrote:
| Sure, in the same sense that every "untyped" program is
| already typed with some kind of a universal type, but
| what's the point?
|
| What I want is to be able to specify arbitrary assertions
| anywhere in the code and have them (when possible) proven
| automatically, without necessarily making sure that every
| possible presupposition is proven from the ground up. Just
| like in TypeScript, I can add a type at any point where
| there was only "any", and have a small portion of the
| program typed without typing the whole thing.
| skulk wrote:
| > What I want is to be able to specify and have (when
| possible) automatically proven arbitrary assertions
| anywhere in the code
|
| This doesn't even exist in TypeScript. If I change
|
|       function foo(a: any) { return baz(a); }
|
| to
|
|       function foo(a: number) { return baz(a); }
|
| whoever calls foo has to still prove (or assert) that the
| argument is a number.
|
| Is that what you're after, asserting a dependent type? For
| example, being able to change:
|
|       function divByTwo(a: number): number { return a/2; }
|
| to
|
|       function divByTwo(a: even_number): number { return a/2; }
|
| You want every place divByTwo is called to also
| automatically supply a proof that the argument is even?
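|
| (That hypothetical even_number is directly expressible in a
| dependently typed language; a minimal Lean 4 sketch, using a
| subtype:)
|
|       -- A Nat bundled with a proof that it is even.
|       abbrev EvenNat := { n : Nat // n % 2 = 0 }
|
|       def divByTwo (a : EvenNat) : Nat := a.val / 2
|
|       -- Callers supply the proof; `decide` computes it for
|       -- literals.
|       #eval divByTwo ⟨10, by decide⟩  -- 5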
| valyagolev wrote:
| i don't even need it to go beyond the insides of a
| function. something like a proof of an invariant that's
| only relevant inside the function's body would be fine.
| e.g. in Rust, almost every place where you see something
| like ".unwrap()" with "// unwrap safety:" comment, this
| comment could be an assertion easily proven from the few
| lines above
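|
| (Lean 4 already works roughly this way for array bounds; a
| minimal sketch where the branch condition itself discharges
| the proof obligation, no safety comment needed:)
|
|       def firstOrZero (xs : Array Nat) : Nat :=
|         -- `if h : ...` names the proof of the condition;
|         -- `xs[0]` then picks up `h` automatically.
|         if h : 0 < xs.size then xs[0] else 0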
| dunham wrote:
| Not sure if it's what you're asking for, but in lean, you
| can put `sorry` to skip a proof. It's `believe_me` in
| Idris2, and I think the Agda version is `trustMe`.
|
| You can also use stuff like ! when you don't want to
| prove your array indices (it might crash or substitute a
| dummy value if you're wrong).
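|
| (Minimal Lean 4 illustrations of those two escape hatches:)
|
|       -- `sorry` closes any goal, with a compiler warning:
|       example : 2 + 2 = 4 := by sorry
|
|       -- `xs[0]` demands a bounds proof; `xs[0]!` skips it and
|       -- panics at runtime if the index is out of range:
|       def first (xs : Array Nat) (h : 0 < xs.size) : Nat := xs[0]
|       def firstBang (xs : Array Nat) : Nat := xs[0]!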
| gtirloni wrote:
| Useless tangent: while checking this Mastodon instance, I
| noticed a specific user is spamming the global feed by
| replying to his own posts every other day. Mostly I saw only
| his posts, and wondered if this was a personal instance of
| sorts.
| olddustytrail wrote:
| Did you notice that the domain is _maths_todon.xyz? So it's
| probably a pretty small user base.
| gtirloni wrote:
| Yes, I noticed the domain and the small instance, but I
| missed your point.
| pmarreck wrote:
| If you have a denominator of the form "n - k - 1", I hardly
| find it surprising that n=3 and k=2 gives you a slight
| problem: the denominator is zero right there...
|
| Anyway, check out Idris 2, it's cool for this sort of thing
| tgv wrote:
| Does anyone understand what the proof is about? An improved
| bound on the mean of geometric series? And why is it only a
| problem for n=3, k=2, and not for all k=n-1?
| pa7x1 wrote:
| From a very quick glance at page 6 of
| https://arxiv.org/pdf/2310.05328.pdf
|
| You can see he is treating the case of k=1,2 with that formula
| and uses induction to extend it to 1 <= k <= n - 2.
|
| For k = n - 1 he uses a different bound, defined in equation
| 2.2, so he bypasses the issue. Presumably n = 3 slipped
| through because that is exactly when the k = 2 base case and
| the k = n - 1 case coincide.
| 0xpgm wrote:
| Is there a way to do lightweight incremental proof checking in a
| typical say Python or Javascript codebase?
|
| Maybe specifying some conditions/assertions in comments and have
| it verified using some static analysis tool? Though I recognize
| it could be quite a challenge in dynamically typed languages.
| logicchains wrote:
| >Is there a way to do lightweight incremental proof checking in
| a typical say Python or Javascript codebase
|
| Running a JavaScript codebase through the TypeScript compiler
| is a lightweight way to do incremental proof checking, albeit
| it can only check the kinds of properties expressible in
| TypeScript's type system.
| ianandrich wrote:
| Python's Deal library provides contracts, giving you a small,
| portable, formally checked area of the codebase.
|
| Additionally, Deal integrates with CrossHair which does
| concolic execution of the tests and functions annotated with
| contracts. It's integrated with Z3, and most of the Python
| primitives are covered.
|
| It just works surprisingly well for incrementally building up
| provable code properties.
| pama wrote:
| Here is earlier context for how Tao used LLM tools, including
| GPT-4, to help him in this journey.
|
| https://mathstodon.xyz/@tao/111233986893287137
| syngrog66 wrote:
| I found a "bug" in one of Terence Tao's math blog posts too,
| years ago. I told him about it, he fixed it, and thanked me.
|
| I didn't make the front page of Hacker News, of course. lol
| syngrog66 wrote:
| If anyone's interested, I went into more detail about it over
| in a post on my blog:
|
| https://news.ycombinator.com/item?id=38040982
| X6S1x6Okd1st wrote:
| He started learning lean4 with the help of GPT4 just at the start
| of the month:
|
| https://mathstodon.xyz/@tao/111208692505811257
|
| Many of his mastodon posts this month have been about his
| learning progress.
|
| Certainly an interesting case study of how LLMs can accelerate
| the work of even the most successful people.
| mensetmanusman wrote:
| I have found that good communicators who don't code can
| quickly build functional automation.
|
| Interestingly, LLMs may end up contributing to more inequality
| if only the highly skilled can leverage them effectively.
| marshray wrote:
| My friend had never written anything more than an Excel
| formula a few months ago and now he's using GPT-4 to write
| very nontrivial Python applications and automate large parts
| of his job.
|
| I (having 30 years experience as a professional Software
| Developer^TM) am begging him to teach me his techniques.
|
| Now that you mention it, I met him and we became friends in
| large part due to his communication abilities.
| strikelaserclaw wrote:
| gpt4 is amazing, i rarely use google as a starting point for
| my programming related queries these days.
| danenania wrote:
| Especially now that the training cutoff is being moved up to
| April 2023. Questions requiring more recent results were the
| main ones I've been going back to google for.
| RealityVoid wrote:
| Huh, I can't seem to get in the groove of using it, maybe I'm
| old or something, but all the subtle ways it's wrong annoy
| me, and I feel I have a much better grasp if I think it
| through myself, supported by Google.
| popularonion wrote:
| I get a lot of mileage out of ChatGPT just treating it like
| an intern who turns around work instantly. You don't expect
| interns to write perfect code, but they can save you a ton
| of time if you set them loose on the right problems.
|
| For any relatively simple task I can say "Write a Python
| script to do X" and it will almost always spit out working
| code, even if it has subtle mistakes. Fixing mistakes is
| _fine_ and part of the process. I don't have to read
| StackOverflow posts saying "Do you really want to do X?",
| or sift through documentation that follows the author's
| approach of how they want to introduce the material but
| doesn't directly address my question.
| gsuuon wrote:
| Is there value in learning verified programming for application
| developers? I've always been curious but they mostly seem like
| tools for academics.
| deepsun wrote:
| I believe the most value comes when there's an asynchronous
| system and you need to verify its correctness. Backend dev,
| in other words.
|
| For app dev, I'd say the main problem is always "what do we
| even need to build?", and then polishing the user experience.
| neeleshs wrote:
| Lean4 looks like a great language. Has anyone used it in
| production capacity for "normal" products/applications?
| clircle wrote:
| Never heard of a mathematical error called a bug before
| hanche wrote:
| Me neither, but it makes sense to me in the case of an error
| that is relatively easily corrected. If a proof is fatally
| flawed, however, I would not use that term.
___________________________________________________________________
(page generated 2023-10-27 23:00 UTC)