[HN Gopher] Why Is SQLite Coded in C (2017)
___________________________________________________________________
Why Is SQLite Coded in C (2017)
Author : piyushsthr
Score : 99 points
Date : 2021-08-23 17:04 UTC (5 hours ago)
(HTM) web link (www.sqlite.org)
(TXT) w3m dump (www.sqlite.org)
| mikece wrote:
| While I know it's a joke it's still funny on so many levels
| because it _could_ be true: an interview with Stroustrup claiming
| that he invented C++ to be purposely hard to preserve the
| mystique that programming was hard and keep programming salaries
| up:
|
| https://webhome.phy.duke.edu/~rgb/Beowulf/c++_interview/c++_...
| pjmlp wrote:
| The real reason why Bjarne invented C++ was not having to deal
| with BCPL or C.
|
| After his experience getting his thesis ready in Simula, only
| to rewrite it in BCPL, he swore never to put himself through
| similar pain again.
|
| So after working for a while at AT&T, C with Classes was his
| way to avoid dealing with raw C, while having some of that
| Simula productivity.
| ChrisArchitect wrote:
| (2017)
|
| Previous discussion:
| https://news.ycombinator.com/item?id=16585120
| butterisgood wrote:
| "Go hates assert" explain?
|
| What is panic but a reaction to an assertion failure?
| steveklabnik wrote:
| https://golang.org/doc/faq#assertions
| xiphias2 wrote:
| "Rust needs a mechanism to recover gracefully from OOM errors."
|
| It seems that Linux has the same requirements of Rust as SQLite
| does before Rust can compete with C as a systems-level language.
| It's going in the right direction.
| faustocarva wrote:
| "old and boring" made my day more fun. Thanks!
| juice_bus wrote:
| How does Rust fare against SQLite's requirements now?
| tptacek wrote:
| Pretty much the same. Arch support in particular would be a
| clear dealbreaker, I think.
| geofft wrote:
| Won't rustc_codegen_gcc effectively solve arch support? It's
| not done and shipped, but it exists and is usable and is
| landing into rustc, which is a fair amount of progress.
|
| (Or is SQLite portable to architectures that GCC does not
| support?)
| marcosdumay wrote:
| > architectures that GCC does not support
|
| Do those exist?
|
| (Yeah, there are some microcontrollers that can only be
| programmed by the manufacturer's C compiler, which doesn't
| even support the entire language and is full of bugs. But I
| would be very surprised if SQLite ran on a PIC.)
| masklinn wrote:
| > Do those exist?
|
| IIRC a possible issue is the use of forks of gcc (or
| something else), I think it used to be common for console
| toolchains though maybe less so these days.
| tptacek wrote:
| Yes, I imagine it will.
| tyingq wrote:
| For some more detail, sqlite has makefiles for things like
| Windows CE and VxWorks, and the generic configure process
| builds on almost anything else sufficiently POSIXY, like QNX.
| estebank wrote:
| When I see projects like this, with such an impressive range
| of supported platforms, I am always a bit wary of _how
| well tested_ those platforms effectively are. I know that
| some bugs have been caught in OpenBSD due to some of the
| more esoteric platforms making evident some incorrect
| assumptions, but I also remember Debian boasting a huge
| number of packages for, let's say, ARM that would build,
| but any attempt to use them would show they had never been
| tried out and were wholly unsupported.
| tyingq wrote:
| True, though sqlite is pretty universally lauded for
| their approach to testing. Perhaps not all platforms are
| tested by the Sqlite team, but if you run their tests on
| your platform, the coverage is pretty good.
| steveklabnik wrote:
| rustc has seven supported vxworks targets, incidentally.
| masklinn wrote:
| > Rust needs to mature a little more, stop changing so fast,
| and move further toward being old and boring.
|
| Didn't make any sense at any point post 1.0. Rust 1.0 code
| still works today modulo BC breaks required to fix
| unsoundnesses.
|
| > Rust needs to demonstrate that it can be used to create
| general-purpose libraries that are callable from all other
| programming languages.
|
| Also didn't make much sense at any point as the story there was
| always straightforward. But efforts like librsvg have pretty
| much demonstrated that (interestingly the librsvg conversion
| effort started around the time that page was created, circa
| 2017).
|
| > Rust needs to demonstrate that it can produce object code
| that works on obscure embedded devices, including devices that
| lack an operating system.
|
| Rust is used in a bunch of embedded contexts. Whether Rust can
| produce object code that works on _your_ embedded device is a
| more debatable question; it depends on the existence (and
| quality) of the proper llvm backend.
|
| I think there's also a gcc frontend in the works, but I expect
| it's essentially nowhere yet as it was only just started (few
| months old I think?). Though I believe it has financial support
| and a fair amount of manpower. I believe there's also an even
| more recent effort for a gcc backend in rustc.
|
| So yeah this one I'd say there's limited progress yet but
| things seem to be moving in the right direction _and picking
| up_.
|
| > Rust needs to pick up the necessary tooling that enables one
| to do 100% branch coverage testing of the compiled binaries.
|
| Unclear what the issue is there so no idea.
|
| > Rust needs a mechanism to recover gracefully from OOM errors.
|
| That was always possible by working no_std, though of course
| required reimplementing your own abstractions.
|
| With the linux kernel integration effort, a lot more work is
| going into "fallible allocation" APIs, and thus the ability to
| gracefully recover from allocation failures.
|
| > Rust needs to demonstrate that it can do the kinds of work
| that C does in SQLite without a significant speed penalty.
|
| ¯\_(ツ)_/¯
| tick_tock_tick wrote:
| > depends on the existence (and quality) of the proper llvm
| backend
|
| LLVM's support is lacking enough that this issue alone
| justifies a project like SQLite being written in C. The
| rust-gcc effort you mentioned will hopefully solve this.
|
| Quite a few of the other objections really come down to them
| not wanting to be pioneers, which is quite fair.
| [deleted]
| tptacek wrote:
| The branch coverage thing is a weird, artificial-seeming
| requirement that all the branches in the compiled code ---
| not the code as written, but the code ultimately produced by
| the compiler --- be testable. In other words: if the compiler
| generates a bounds check _anywhere_, it should be possible
| to test what happens when that specific bounds check fails.
| The problem is that sane Rust code doesn't give you all the
| tools you'd need to deliberately trip all the checks the
| compiler generates, because that is part of the point of
| being a safe language.
| infogulch wrote:
| > The branch coverage thing is a weird, artificial-seeming
| requirement
|
| Yes that's because it's a requirement designed by a
| standards body. I found a paper "Is 100% Test Coverage a
| Reasonable Requirement? Lessons Learned from a Space
| Software Project" (2017) that mentions that 100% branch
| coverage is a requirement in European Cooperation for Space
| Standardization (ECSS) for Class A software (where failure
| could result in loss of life, etc). The paper concluded:
|
| > Our findings include that there seems to be a break-even
| point between 80% and 95%, and everything beyond this
| points is increasingly costly and could introduce new
| project risks--which confirms findings reported so far in
| literature (Section 5). However, the interview revealed
| that, still, 100% coverage can be a reasonable quality
| requirement; even though a 100% requirement is not a good
| indicator for the software quality as such.
|
| https://www.researchgate.net/publication/319141355_Is_100_Te...
|
| It doesn't follow that _because_ Rust is a safe language
| it _cannot_ expose test harnesses that would enable
| 100% branch coverage. Personally I'm ambivalent on this,
| maybe it's not useful, but it doesn't seem bad either. But
| your reaction to this requirement seems weird, like the
| "fox and the grapes" fable... You can't get 100%-branch
| coverage, so you give up and claim that 100% branch
| coverage is dumb, why would anyone want that anyway? Do you
| really think that 100% branch coverage testing should be
| unavailable to rust programmers if that's what they need
| (for whatever reason, including meeting some admittedly
| arbitrary standard)?
| tptacek wrote:
| A couple of people have brought this up here, and it's an
| argument that makes sense. I'll just note that the sqlite
| page, which is what I'm critiquing, isn't written this
| way; the project doesn't say "we use C because ECSS
| requires us to build software in a 100%-branch-coverage
| language", but rather speaks to the important benefit of
| literal 100% branch coverage. It's that important benefit
| I question, not the logistics problems they face, which I
| concede.
| infogulch wrote:
| That might just be me twice, I added a new reference this
| time at least ^_^ I agree that 100% branch coverage as a
| goal in and of itself is generally dubious.
|
| I seem to recall an interview where the SQLite creator,
| Richard Hipp, described the reasons behind the branch
| coverage testing in a bit more detail and mentioned that
| it was a requirement from one of their customers, which
| is where I got that idea. Sorry I don't have a specific
| reference.
| masklinn wrote:
| Right, so really more of an expressivity issue: the compiler
| is not smart enough to remove _all_ branches which cannot
| happen in a given program, so some of those branches will
| be completely untestable despite being in the final object
| code. E.g. take a vec![_;4] and structurally use it such that
| the index can only be in-bounds: the compiler may not be
| able to elide the OOB checks because it might not
| understand they're unnecessary for real-world code.
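|
| A minimal sketch of the situation described above, assuming a
| hypothetical `Table` type (not from SQLite or any real crate)
| whose constructor fixes the length at 4. Whether the optimizer
| actually removes the bounds check here depends on the compiler
| version and flags; the point is only that, if it stays, no test
| going through this API can reach the panic branch.

```rust
/// Hypothetical container; invariant: `slots` always holds exactly 4 items.
pub struct Table {
    slots: Vec<u8>,
}

impl Table {
    /// The only constructor, so the length invariant holds everywhere.
    pub fn new() -> Self {
        Table { slots: vec![0; 4] }
    }

    /// `i & 3` is always in 0..=3, so this index is structurally in bounds.
    /// The indexing expression still carries an implicit bounds check; if the
    /// compiler cannot prove `slots.len() == 4`, the out-of-bounds panic
    /// branch remains in the object code even though it cannot be taken.
    pub fn set(&mut self, i: usize, value: u8) {
        self.slots[i & 3] = value;
    }

    /// Same masking on the read side.
    pub fn get(&self, i: usize) -> u8 {
        self.slots[i & 3]
    }

    /// Alternative: surface the out-of-range case as an explicit branch
    /// that a test can exercise from the outside.
    pub fn try_get(&self, i: usize) -> Option<u8> {
        self.slots.get(i).copied()
    }
}
```

| With `try_get` the failing side of the check is ordinary,
| testable code; with `get` it is at most a dead branch.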
| tptacek wrote:
| Frankly, I think it's a pretty silly concern.
| cormacrelf wrote:
| It's substantially less silly than a small-time
| JavaScript component library adding 100% branch coverage
| testing requirements as a blocker to accepting a PR. But
| they do it for the same reason, to be able to advertise
| it and demonstrate reliability. This is how the sqlite
| project makes money. I guess someone's got to build the
| instrumentation tools that let them keep doing this,
| sounds like it won't be them. (Edit: not sure who else
| would have the motivation, to be honest. If they had to
| pioneer one thing, it should be that.)
| opheliate wrote:
| With the discussion of getting Rust into the Linux kernel, I
| think there's more interest in graceful recovery from OOM
| errors. The new Allocator API will (maybe?) help this.
| masklinn wrote:
| > The new Allocator API will (maybe?) help this.
|
| The Allocator API is about the ability to mix allocators and
| provide "precise" (per-object) allocation strategies. That's
| orthogonal to fallible allocation, which is mostly about
| adding fallible versions of possibly-allocating APIs, and
| being able to statically remove access to the non-failing ones
| (and being able to implicitly reject any dependency relying
| on those APIs). An alternative is providing alternate
| implementations which expose a fallible API, which is what
| fallible_collections does, but the stdlib seems to have gone
| with adding fallible APIs and probably adding a compiler
| feature / flag to be able to disable the non-failing ones.
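|
| As a rough illustration of what a "fallible version of a
| possibly-allocating API" looks like, here is a sketch using
| `Vec::try_reserve` / `Vec::try_reserve_exact`, which were
| stabilized in a later Rust release than the ones current when
| this thread was written. The helper names `copy_fallibly` and
| `reserve_too_much` are made up for the example.

```rust
use std::collections::TryReserveError;

/// Copy `data` into a fresh buffer, returning an error instead of
/// aborting the process if the allocation cannot be satisfied.
pub fn copy_fallibly(data: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // The fallible step: reserve the space up front and propagate failure.
    buf.try_reserve_exact(data.len())?;
    // Capacity is already reserved, so this does not allocate again.
    buf.extend_from_slice(data);
    Ok(buf)
}

/// Request more bytes than any allocation may hold; this is reported
/// as an error value rather than a panic or abort.
pub fn reserve_too_much() -> Result<Vec<u8>, TryReserveError> {
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve(usize::MAX)?;
    Ok(buf)
}
```

| The caller decides what "graceful" means: retry with a smaller
| size, free caches, or return an out-of-memory error upward.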
| petters wrote:
| There is a C backend for LLVM. If/when that works, that should
| solve most of the portability issues.
| masklinn wrote:
| It's been dead for years, though the julia folks are
| apparently trying to resurrect it.
| zelphirkalt wrote:
| > 2. Safe programming languages solve the easy problems: memory
| leaks, use-after-free errors, array overruns, etc. Safe languages
| provide no help beyond ordinary C code in solving the rather more
| difficult problem of computing a correct answer to an SQL
| statement.
|
| Uhm ... If those were project wide "easy problems", then how come
| vulnerabilities in a project like Chromium are 70% caused by
| these "easy problems"?
|
| I'd say with that category of bugs on board, you cannot mistrust
| yourself enough and that there is nothing easy about preventing
| these bugs all the time over the life-time of a project.
| [deleted]
| nmstoker wrote:
| My interpretation is that they mean "easy" within the context
| of the problems raised within that question. See the final
| sentence of your quote.
| Scarbutt wrote:
| _Safe programming languages solve the easy problems: memory
| leaks, use-after-free errors, array overruns, etc._
|
| At least with visual studio I agree these are trivial to check
| for and solve.
| junon wrote:
| They're even easier to solve on Linux with appropriate tools
| (e.g. Valgrind).
| tptacek wrote:
| Less than a year after this was published, Tencent released the
| Magellan series of sqlite RCEs.
|
| I think this is a fine page and it is eminently reasonable that
| sqlite remains a C codebase. In particular, I think he's right
| that rewriting sqlite in a memory-safe language would introduce a
| bunch of bugs and likely result in a couple of years of
| instability.
|
| But the "security" paragraphs in this page do the rest of the
| argument a disservice. The fact is, C is a demonstrable security
| liability for sqlite. The real position of the project is that
| memory safety security vulnerabilities are an acceptable tradeoff
| for an otherwise reliable database engine; in practice, people
| will deal with the exposure either by treating it as an
| externality (ie: baking sqlite into products where it is directly
| exposed as part of attack surface, and then throwing up their
| hands and issuing patches when RCEs are discovered) or by
| carefully positioning sqlite so it isn't a meaningful part of the
| attack surface.
|
| Both of these approaches are suboptimal --- that's why we call
| them "tradeoffs" --- and it is the case that if you held
| everything else equal (and you can't, but bear with me), sqlite
| would be a better piece of software written in, I guess, Rust;
| memory corruption wouldn't be one of the problems you need to
| consider (or blow off uncomfortably).
|
| Again: the argument as a whole, and this page --- fine! I use and
| like sqlite.
| eloff wrote:
| Presumably people would want to rewrite sqlite into Rust. But
| it's still a database, a low level, high performance software
| system. Even if you write it in rust, some percentage of the
| code will be unsafe rust or rust that calls a library written
| in C. It will be safer, but still not a panacea. It would be
| more viable in my opinion to improve the safety of the C code
| that currently makes up sqlite.
| tptacek wrote:
| I do not agree with this at all, for what it's worth.
| eloff wrote:
| Can you clarify why?
| masklinn wrote:
| The amount of code which would have to be `unsafe` in a
| library like sqlite would be absolutely minuscule if it
| would exist at all, and much easier to check for than
| _literally all of the codebase_?
| tptacek wrote:
| I don't think we understand enough about computer science
| or computer engineering (or something in the middle of
| those two things) to deliver large-scale C projects with
| rich, flexible interfaces safely. The last 20 years have
| just been a sequence of events where we've been surprised
| by new oversights, from buffer overflows to integer
| mishandling to uninitialized variables to UB. These
| problems compound; they're never solved, but rather
| beaten into development teams (at least the conscientious
| ones) and so even the decades-old problems recur, because
| it's not enough to know about the general pattern of a
| problem, you also have to viscerally understand all the
| combinatorics of those problems _and all the new code
| that you write_, all of which has the potential to
| create some new scenario that allows a well-known
| vulnerability to re-emerge. And even if you manage to
| clean the Augean Stables this way, you're still S.O.L.
| when the next new memory corruption bug class is
| discovered. A mug's game, a bad bet.
|
| You can use formal methods to sidestep this! But my
| argument would be that at that point, you're really
| writing C In Name Only. By all means, if the aesthetics
| of Rust are that painful to you (I get it!), write C
| against a formal verifier. :)
| wk_end wrote:
| That's not a useful or constructive comment. At best it is,
| indeed, worthless; at worst, you're using your stature on
| HN to bully another commenter by implying that their
| opinion can be dismissed out of hand, because you, tptacek,
| say so.
|
| I don't really agree with it either - it overstates the
| issues with occasionally using escape hatches in Rust by
| implying that they're at all comparable with the problems
| inherent with using C. But yours was still an unnecessary
| reply.
| vlovich123 wrote:
| Let's be generous and assume that 5% of the overall code
| remains unsafe. That's 95% of the code that doesn't need to
| be checked extra. Additionally, that 5% is likely to remain
| static. With C any "securing" effort is a snapshot effort
| that bitrots quickly as more code is added.
| tptacek wrote:
| sqlite is renowned as a project that tests meticulously;
| they claim 100% branch test coverage, to the point where a
| reason they reject current Rust is that they can't achieve
| similar coverage against the branches the compiler itself
| inserts as safety checks (that is: they can't use Rust
| because they can't test safety checks that don't even exist
| in the language they use today).
|
| And the track record of that approach is, well, right there
| for you to see.
|
| The idea that what's needed for sqlite's C to be safe is
| more of the approach sqlite already uses seems pre-
| falsified, doesn't it?
| vlovich123 wrote:
| > sqlite is renowned as a project that tests
| meticulously; they claim 100% branch test coverage
|
| You realize that there are lots of test coverage metrics
| and none of them tell you anything about the correctness
| or safety of the code or about how thoroughly the edge
| cases have been tested, right? 100% branch coverage is
| neat and may indicate that the testing is thorough, or it
| might just indicate the project is chasing it as a
| metric. I'll bet on compile time lifetime and ownership
| analysis guaranteed by the language every time over
| metrics of how good the test suite is (a test suite is
| important but classes of bugs are just impossible in Rust
| and don't need testing/coverage in the first place).
|
| > that is: they can't use Rust because they can't test
| safety checks that don't even exist in the language they
| use today
|
| Can you back up this claim? This feels like a very FUD
| statement as any high level language (including C) has a
| risk that the compiler inserts branches that aren't
| present in the code. Regardless, AFAIK, coverage
| instrumentation happens at a level below where the
| distinction between C and Rust matters, so any coverage
| should be the same. I buy the argument that replicating
| the test suite may be a bit time consuming, but the claim
| that there's something inherent about Rust preventing
| branch coverage feels extraordinary to me. Also, I'm not
| even sure where the extra branches are being inserted.
| Are you referring to drop statements the compiler injects
| for lifetimes?
|
| > And the track record of that approach is, well, right
| there for you to see.
|
| Constant CVEs and a fundamental inability to handle
| maliciously crafted files? I'm not shitting on the SQLite
| team. The project is amazing and a marvel. I'm just
| saying that its C heritage has some inescapable
| realities. Also remember that studies tend to show that
| the bug rate is pretty constant across languages in terms
| of LOC. Thus you want the language and stdlib doing as
| much as possible for you if you're prioritizing
| correctness and security.
| tptacek wrote:
| I'm simply citing the article we're commenting on.
| ansible wrote:
| > _sqlite is renowned as a project that tests
| meticulously..._
|
| Any language that can produce a C library interface
| identical to the existing one for sqlite3 would be a good
| candidate for a reimplementation.
|
| If you show me a Rust library that passes all the sqlite3
| tests flawlessly on my platform (typically x86-64) then
| I'd include that in my project, and sleep soundly at
| night afterwards. There _might_ be problems, but the
| chances are very low.
|
| Most other libraries don't have nearly as good coverage,
| and it is much riskier to switch over an implementation.
| infogulch wrote:
| I think 100% machine code branch test coverage is
| _legally required_ in some of the environments that
| SQLite targets. That is, there must be a test that
| exercises both sides of every machine code branch
| instruction emitted in the final binary. Basically, every
| branch must have a justification for why it exists. I
| feel like that's not such a bad target for rust to aim
| for.
| eloff wrote:
| It is better, definitely safer. But it's not perfect. I
| write rust on a daily basis. I've written a lot of C++ in
| the past. It's safer, but I have memory safety issues in
| both. It's more a matter of degree.
|
| For a large existing code base that is well written I
| think it's easier to improve the safety of the existing
| code than rewrite it completely in a safer language.
| tptacek wrote:
| People talk about having memory safety issues in Rust and
| Go, and my general reaction is that these claims tend to
| be pretty artificial (for instance: they've managed to
| introduce concurrency bugs that abort their program). If
| you've got a war story about a security-relevant memory
| corruption vulnerability you managed to introduce into a
| Rust codebase (that wasn't in straight-up `unsafe` code),
| I'd be interested in hearing more about it.
|
| My position right now, before hearing that war story, is
| that Rust vs. C is more than a difference of degree. It's
| a difference of degree in _security writ large_, to be
| sure. But for memory corruption? I'd say the distinction
| is close to categorical.
| pjmlp wrote:
| It was already like that in the Pascal/Modula/Ada vs C
| days.
|
| Just because it isn't 100% bulletproof, just 95% better,
| the C team always advocates it isn't worth the effort.
| eloff wrote:
| > If you've got a war story about a security-relevant
| memory corruption vulnerability you managed to introduce
| into a Rust codebase (that wasn't in straight-up `unsafe`
| code), I'd be interested in hearing more about it.
|
| No, they're in unsafe code. Except Go where you don't get
| memory safety with some kinds of race conditions. I'm
| debugging a segfault in Rust today as it turns out. I
| guarantee it's in unsafe code. But I have some unsafe
| code. A database would also definitely have quite a bit
| of unsafe code.
|
| It's a lot better than the situation in C. But rewrites
| always introduce bugs. And sqlite is so well written and
| tested, it's not a low quality code base. I don't think a
| rewrite makes sense here, it'd be better to improve the
| existing code to make it safer. That's my opinion anyway.
| CraigJPerry wrote:
| >> some percentage of the code will be unsafe rust
|
| What would be the significance of that? Unsafe blocks in rust
| must still satisfy the borrow checker and all the compiler's
| safety rules. The only things unsafe gives you are:
| * Dereference a raw pointer
| * Call an unsafe function or method
| * Access or modify a mutable static variable
| * Implement an unsafe trait
| * Access fields of unions
|
| From https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html
|
| The use of unsafe blocks does not automatically mean unsafe
| code?
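|
| A small sketch of the first item on that list, with a made-up
| helper name. Creating the raw pointer is allowed in safe code;
| only the dereference needs `unsafe`, and it is the programmer,
| not the compiler, who has to justify that the pointer is valid.

```rust
/// Read an `i32` through a raw pointer derived from a live reference.
pub fn read_via_pointer(x: &i32) -> i32 {
    // Coercing a reference to a raw pointer is safe.
    let p: *const i32 = x;
    // SAFETY: `p` comes from the reference `x`, which is valid and aligned
    // for the whole call. The compiler does not verify this claim.
    unsafe { *p }
}

/// The same read in fully safe code; borrow checking applies as usual.
pub fn read_via_reference(x: &i32) -> i32 {
    *x
}
```

| References used inside an `unsafe` block are still borrow
| checked; the block only unlocks the extra operations above.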
| eloff wrote:
| Those guarantees are checked by the programmer, not the
| compiler. Mistakes and bugs can happen, just like in C.
| ksec wrote:
| >But the "security" paragraphs in this page do the rest of the
| argument a disservice.
|
| I mean the whole security and safe language paragraphs were
| _added_ after people literally started either harassing sqlite or
| claiming SQLite is unsafe.
|
| The paragraphs were an extremely polite way of saying: No, we
| don't want to switch to another language.
| cormacrelf wrote:
| > in practice, people will deal with the exposure either by
| treating it as an externality ... or by carefully positioning
| sqlite so it isn't a meaningful part of the attack surface.
|
| I think you're missing one of the ways. The one where people
| make no deliberate attempt to engage with the risk at all. No
| big throwing up of the hands. Sqlite is deeply embedded
| software, it takes years for a whole generation of smart TVs
| and security cameras and cars to find themselves in landfill
| and put a decade of vulnerabilities to bed.
| nine_k wrote:
| I suppose that a version of SQLite written in a safe(r)
| language will eventually appear, and hopefully become popular.
|
| But it will take a long time to mature. SQLite the project
| cannot _switch_ languages, sadly. The only way to migrate is
| that another project would grow beside in the meantime, mature,
| and become a viable and compatible alternative.
| tptacek wrote:
| I don't like C (correction: I _love_ writing C; I don't like
| running code written in C), but I'd probably stick with
| sqlite for several years after the introduction of a
| competing Rust sqlike. But I also wouldn't expose sqlite
| directly to untrusted users (for instance, in the hellworld
| where I'm a designer on a major browser, I wouldn't make
| sqlite part of the Javascript interface of that browser).
| nine_k wrote:
| I see; I share much of the sentiment.
|
| But, tangentially, if not SQLite, what would you expose as
| a DB interface in a browser? SQL, while large and hairy, is
| both powerful and logical (in the relational part, not
| syntax). A plain KV store like dbm is no match to it. A
| Redis-like store is better but still more limited. Kdb's
| approach is sort of wonderful but geared towards time
| series and not much general-purpose. Is there an existing
| interface / language you would reuse to give the browser a
| _rich_ database interface?
| masklinn wrote:
| > SQLite the project cannot switch languages, sadly.
|
| I'm sensitive to the issues of tradeoffs (e.g. platform
| support, which is a completely fair issue) and manpower
| requirements and whatnot, but I don't see why it _can not_
| switch language. Converting libraries "inside out" _has been
| done_. Adding Rust support inside sqlite's build system and
| migrating modules is technically feasible (again, not opining
| on whether it would be _worth it_).
| nine_k wrote:
| It cannot "switch" languages in a way like "we stop
| development in C and switch to development in XYZ only;
| C-based code will only get security updates".
|
| It can switch the officially endorsed "primary"
| implementation, but only after an alternative
| implementation has been around for a long time and was
| battle-tested, all the while the original C implementation
| continued to exist and develop.
| masklinn wrote:
| > It cannot "switch" languages in a way like "we stop
| development in C and switch to development in XYZ only;
| C-based code will only get security updates".
|
| At a purely technical level (ignoring issues of platform
| support and all) it pretty much can do that, actually:
| integrate XYZ into the build system, build new features
| in XYZ, start converting old features to XYZ, end up with
| an sqlite in XYZ.
| Jtsummers wrote:
| Adding Rust support would reduce (presently) the number of
| architectures and platforms that SQLite can target. That
| would greatly reduce its utility for a _lot_ of customers.
| Once Rust supports the same variety of architectures that C
| presently supports this will become a non-issue, but that's
| unlikely to happen in the near term.
| masklinn wrote:
| > Adding Rust support would reduce (presently) the number
| of architectures and platforms that SQLite can target.
|
| I mentioned that but I don't think that's the subject as
| it does not warrant such a drastic assertion that sqlite
| _can not_ switch language.
| Jtsummers wrote:
| Ok, yes. SQLite _can_ switch languages _if_ the objective
| is to cut off a large number of customers. Until Rust is
| well-supported on the variety of architectures that C is
| supported on, then that will be the result.
| justin66 wrote:
| > Until Rust is well-supported on the variety of
| architectures that C is supported on
|
| There's no particular reason to think that this will ever
| happen. SQLite runs on a lot of crazy stuff that Rust
| won't want to bother with.
| CraigJPerry wrote:
| I think https://gitlab.com/cznic/sqlite has become pretty
| popular. It's a pure Go implementation.
| masklinn wrote:
| It is popular not because it's a good idea but because go:
| using cgo is a huge imposition, so recoding everything in
| go so it can be used from go is basically the norm.
|
| That is useless to sqlite itself, this exists only because
| of go's issues and is essentially unusable outside of the
| go ecosystem.
| pjmlp wrote:
| Dislike for cgo is similar to the old JNI displeasure;
| nowadays plenty of Java libraries don't have any issues
| making use of JNI.
|
| The only issue I see with cgo is that they decided to
| follow such path instead of a proper FFI declaration
| support like most languages.
|
| Even Java is now getting such a capability thanks to Project
| Panama.
| CraigJPerry wrote:
| I've noticed cgo dependencies massively slow down
| compilation.
|
| E.g. the hugo project built with and without extended
| features. Build without and it's all go, it compiles in
| the blink of an eye. The tooling in go, from a devops
| point of view, is surprisingly good. It's not just
| compile speed.
|
| Build with all the c deps and well you're compiling more
| code so of course it's going to be slower but it's
| disproportionately so.
| dathinab wrote:
| I don't think a rust version of sqlite would make sense,
| rewriting a very mature and well tested library rarely makes
| sense.
|
| Only if either:
|
| - there are some _major_ changes coming anyway
|
| - or major problems/improvements in maintainability and
| better external dev commitment/support
|
| could rewriting it make sense IMHO.
|
| Let's be honest, many (most?) of the "recent" (~3years)
| security bugs sqlite had would most likely not have been
| prevented by using rust, go or similar.
|
| EDIT:
|
| I guess the biggest drawback of C is that it keeps
| contributors away, but it also might keep away some of the
| contributors that projects like sqlite might not want to
| have. So depending on the maintainer it might be seen as a
| benefit, not a drawback.
| PaulDavisThe1st wrote:
| >I guess the biggest drawback of C is that it keeps
| contributors away
|
| There's this idea floating around in some circles that if
| you could just adopt technology X, more people would
| contribute to a project.
|
| I myself was guilty of this ... believing that if we added
| a web frontend to the DAW I've worked on for 21 years, all
| those JS/webtech developers would show up.
|
| There's no platform or language you could choose for SQLite
| that would increase the number of contributors. C is
| already a wildly popular programming language, with far,
| far more practicing users than Rust. Rust may be The Cool
| New Thing at present, and in time it may possibly grow to
| something much bigger, bigger even than the C/C++ universe.
| But that's not true right now, and it also wouldn't be true
| even if SQLite was implemented using some impossibly
| performant JS or its cousin.
| tptacek wrote:
| The problem with the first sentence of this comment is that
| it also justifies keeping stuff like ImageMagick around.
| woodruffw wrote:
| > Let's be honest, many (most?) of the "recent" (~3 years)
| security bugs sqlite had would most likely not have been
| prevented by using rust, go or similar.
|
| Using SQLite's CVE list from 2020[1], we see 12
| vulnerabilities:
|
| * Two NULL pointer dereferences; it's impossible to produce
| a NULL reference in safe Rust.
|
| * Two integer overflows; Rust makes these harder, but not
| impossible.
|
| * Three UAFs; these are impossible in safe Rust.
|
| * One uncategorized segfault; these are impossible in safe
| Rust barring environmental constraints (like a stack
| overflow from unchecked recursion).
|
| * One segmentation fault from incorrect object
| initialization; this is impossible in safe Rust.
|
| * 3 SQL and table-level bugs; Rust is unlikely to have
| helped with these.
|
| By my count, that's 6/12 bugs that would be impossible in
| plain old safe Rust, and another 3/12 that would _likely_
| be prevented by normal best practices in Rust. I still
| don't think this _necessarily_ means that we should drop
| everything and rewrite SQLite in Rust, but the raw numbers
| don't back up the claim that doing so _wouldn't_
| eliminate the actual security bugs that SQLite is seeing.
|
| [1]: https://www.cvedetails.com/vulnerability-
| list/vendor_id-9237...
| PaulDavisThe1st wrote:
| The 3/12 that safe Rust would have prevented would likely
| have been prevented by enforcing the use of current tools
| for C. Since neither Rust safety nor lint-ish safety is
| enforced, I think this is a reasonable comparison.
| [deleted]
| woodruffw wrote:
| Show me a project that claims to consistently use static
| analysis tools for C and _doesn't_ ignore them, and I'll
| show you a liar!
|
| But more seriously: Rust's toolchain comes with linting
| built in, and the community as a whole is _much_ better
| about applying and _responding_ to static analysis
| results than the C ecosystem is. And that's even before
| we get to false positives, which (subjectively) C static
| analysis tools seem to spit out a great deal more often.
|
| And I say all of that as someone who's currently doing
| whole-program static analysis of C and C++! It's not that
| you _can't_ do it, it's degrees of ease and a culture of
| stringency that's lacking.
| dathinab wrote:
| Impossible in safe rust.
|
| But would a sqlite port limit itself to safe rust?
|
| (As a side note wrt integer overflows, rust only makes it
| easier to detect them during tests and provides neat
| methods for integer overflow aware code, but just that.)
|
| Anyway I'm surprised that there were more "preventable"
| bugs than I expected.
| woodruffw wrote:
| > But would a sqlite port limit itself to safe rust?
|
| I suppose they wouldn't have to, but why would they
| bother with unsafe? They proudly announce how few system
| APIs and syscalls they depend on, so they have no need
| for that (assuming they chose to not use any number of
| safe wrappers). Complicated self-referential data
| structures, perhaps, but that again is the kind of thing
| that could be exhaustively tested and tucked into a safe
| interface.
|
| And yes, you're absolutely right about integers: Rust
| _itself_ is not going to save you in release builds. But
| it _does_ avoid a major source of overflows in C
| (implicit conversions and promotions), and has explicit,
| fallible APIs that are easy to enforce as a lint.
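|
| A minimal sketch of what those explicit, fallible APIs look
| like in practice. The function names (total_rows, narrow_len)
| are made up for illustration; checked_add and try_from are
| standard library calls.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Adds two row counts, returning None instead of wrapping on overflow.
/// (`total_rows` is a hypothetical helper, not anything from SQLite.)
fn total_rows(a: u32, b: u32) -> Option<u32> {
    a.checked_add(b)
}

/// Narrows a 64-bit length to 16 bits. The conversion has to be spelled
/// out and can fail; there is no implicit narrowing as in C.
fn narrow_len(len: u64) -> Result<u16, TryFromIntError> {
    u16::try_from(len)
}

fn main() {
    assert_eq!(total_rows(1, 2), Some(3));
    assert_eq!(total_rows(u32::MAX, 1), None);
    assert!(narrow_len(70_000).is_err());
    // let x: u16 = 70_000u64; // would not compile: mismatched types
}
```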
| geofft wrote:
| > _Safe programming languages solve the easy problems: memory
| leaks, use-after-free errors, array overruns, etc. Safe languages
| provide no help beyond ordinary C code in solving the rather more
| difficult problem of computing a correct answer to an SQL
| statement._
|
| I see where the author is coming from, but I don't think this is
| quite true. The way that safe programming languages work is that
| they have a richer type system that knows about the semantic
| context of variables, which in turn is a tool that helps a lot
| with the "more difficult problems".
|
| For instance, one of the tools Rust uses for enforcing memory
| safety (data races and use-after-free, in particular) is that
| there's a distinction between "mutable" and "constant"
| references. But this is, really, a distinction between unique and
| shared references. If I am statically guaranteed to be the only
| holder of a reference to X, I can modify it; if some other part
| of the code might have a reference to X, I cannot.
|
| This is essentially a readers/writer lock enforced at compile
| time, and it therefore is a pattern that makes it much easier to
| use actual readers/writer locks: the lock-for-read function gets
| you a shared reference and the lock-for-write function gets you
| a mutable one. And Rust makes it easy to say, you cannot
| unlock the lock (in either variant) until you return the
| reference, and you cannot accidentally leak the reference out of
| the scope of the lock.
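|
| As a rough sketch with the standard library's RwLock (the
| Schema type and both helper functions are invented for this
| example): read() hands back a guard that only gives shared
| access, write() hands back one that gives mutable access, and
| the lock is released when the guard goes out of scope.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for a table schema.
struct Schema {
    columns: Vec<String>,
}

/// Many readers may hold the read guard at the same time.
fn column_count(schema: &RwLock<Schema>) -> usize {
    let guard = schema.read().unwrap(); // derefs to &Schema
    let n = guard.columns.len();
    // guard.columns.push(String::new()); // would not compile: shared access only
    n
} // guard dropped here, releasing the read lock

/// Only one writer at a time; it gets mutable access.
fn add_column(schema: &RwLock<Schema>, name: &str) {
    let mut guard = schema.write().unwrap(); // derefs to &mut Schema
    guard.columns.push(name.to_string());
} // guard dropped here, releasing the write lock

fn main() {
    let schema = RwLock::new(Schema { columns: vec!["id".to_string()] });
    add_column(&schema, "name");
    assert_eq!(column_count(&schema), 2);
}
```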
|
| If you're using a readers-writer lock on, say, the schema of a
| table (many simultaneous readers can use a table, but only one
| task can alter the table and nothing else can touch the table
| while it's being altered), having the tools to meaningfully
| distinguish the cases and enforce that your mutable references
| don't get copied does actually make it easier to compute the
| correct answer to a SQL statement that's running at roughly the
| same time as an ALTER TABLE.
|
| Another of the tools Rust uses is tagged unions: a C construct
| like union U {char *x; int y;} would be memory-unsafe, so Rust is
| obligated to forbid it in safe code. Instead, you get an enum
| type that C would describe something like struct U {int tag;
| union {char *x; int y;};}. If tag == 0, then you're on the first
| variant; if tag == 1, then you're on the second variant, and the
| compiler ensures that (in safe code) tag is never equal to
| anything else, never uninitialized, etc. And there are a bunch of
| language constructs to assist with this - for instance, the
| 'match' keyword lets you write cases for each variant, allowing
| access to x only if tag == 0 and y only if tag == 1. You're not
| allowed to access x or y directly at all outside of a match
| keyword or equivalent (like 'if let'), because that would defeat
| memory safety.
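|
| Roughly, and with String standing in for the char pointer (an
| assumption made to keep the example in safe code), the enum and
| its match look like this:

```rust
/// Safe counterpart of the C union above; the tag is managed by the compiler.
enum U {
    X(String), // stands in for the `char *x` variant
    Y(i32),    // the `int y` variant
}

/// The payload of each variant is only reachable inside its match arm.
fn describe(u: &U) -> String {
    match u {
        U::X(s) => format!("x = {}", s),
        U::Y(n) => format!("y = {}", n),
    }
}

fn main() {
    println!("{}", describe(&U::X("hello".to_string())));
    println!("{}", describe(&U::Y(7)));
}
```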
|
| But because you've got good syntax and compiler-checked support
| for handling tagged unions, you may as well use it for problems
| even if you don't care about memory safety / security. Take this
| union from the SQLite source code, for instance:
| https://www.sqlite.org/cgi/src/file?ci=trunk&name=src/sqlite...
|
| There are three types of tables (eTabType), normal, virtual, or
| view. There are three union variants with different data. Even
| ignoring the fact that some of the variables are pointers and
| others are non-pointers, the data _doesn't make semantic sense_
| when interpreted as the wrong variant. If you have code that
| handles just a normal table, and you extend it to handle virtual
| tables or views too, you will need to make sure you're not
| unconditionally accessing addColOffset, pFKey, or pDfltList,
| because the information you get will be wrong. Rust's enforcement
| of memory safety means it also prevents you from making this
| logical mistake.
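|
| A hypothetical sketch of how that could be modeled. TableKind
| and the Virtual/View fields are invented for illustration, and
| the Rust types for add_col_offset, fkeys and defaults are guesses
| at stand-ins for addColOffset, pFKey and pDfltList; this is not
| how SQLite is written.

```rust
/// Hypothetical per-kind table data. Only the Normal fields correspond to
/// names mentioned above; everything else is made up for the example.
#[allow(dead_code)]
enum TableKind {
    Normal {
        add_col_offset: i32,
        fkeys: Vec<String>,
        defaults: Vec<String>,
    },
    Virtual { module_args: Vec<String> },
    View { select_sql: String },
}

/// Code written for normal tables has to say what happens for the other
/// kinds; reading `fkeys` from a view is not expressible.
fn foreign_key_count(kind: &TableKind) -> usize {
    match kind {
        TableKind::Normal { fkeys, .. } => fkeys.len(),
        TableKind::Virtual { .. } | TableKind::View { .. } => 0,
    }
}

fn main() {
    let view = TableKind::View { select_sql: "SELECT 1".to_string() };
    assert_eq!(foreign_key_count(&view), 0);
}
```

| Leaving out the second match arm is a compile error, so
| extending the code to a new table kind forces a decision at
| every place that looks at the per-kind data.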
|
| I think we've been selling memory-safe languages as a tool for
| security, and I don't mean to detract from that argument at all -
| but there's also the fact that Rust and Go (and D and Vala and
| even modern C++) are newer languages that have been able to
| implement more things than C can, which in turn makes it easier
| to write correct programs in general.
| pjmlp wrote:
| We already had languages like that 10 years before C was
| invented.
|
| https://en.m.wikipedia.org/wiki/Burroughs_large_systems
|
| https://en.m.wikipedia.org/wiki/JOVIAL
|
| There are plenty of other examples when one looks for what was
| happening outside AT&T.
|
| C apologists like to pretend C is some kind of special gift to
| mankind in systems programming languages, and nothing was
| happening before C and UNIX came onto the scene.
| greenyoda wrote:
| Note: Article is from 2017.
|
| Original HN discussion (from 2018) for those who are interested:
| https://news.ycombinator.com/item?id=16585120
| lukeschlather wrote:
| Worth noting it looks like the section on Rust was added in
| reply to some of the offhand suggestions of rust in that
| thread.
|
| https://web.archive.org/web/20180317194408/https://sqlite.or...
| wudangmonk wrote:
| It's a well-known cliche at this point for pretty much every
| program to be rewritten in the "safe" language of Rust. But it
| makes sense for C++ to be in there too; before it was the Rust
| people wanting to rewrite everything, it was the C++ people
| wanting to rewrite everything.
| pjmlp wrote:
| It has been like that since the 80's as UNIX was gaining market
| share, but as Rust is gaining ground on that effort, it is easy
| to shit on the effort.
|
| Yes, even with its C underpinnings, C++ is better than plain
| old C, provided the safer types for arrays, strings and RAII
| resource management are used.
|
| Hoare was already complaining about C in his 1980 Turing Award
| speech.
| dathinab wrote:
| Wrt. what rust needs:
|
| A) I don't think this is needed; you could just pin an older
| rust version. But if it's needed we are not quite there yet.
|
| B) I think that point has been met.
|
| C) Depending on the definition of "obscure" this has been met,
| but given that sqlite only requires _very_ little to make it run
| I guess that for the appropriate definition of "obscure" it is
| not fulfilled quite yet.
|
| D) This still needs a lot of work; even just for line coverage
| the tooling isn't quite up to my standards tbh. While my
| standards are pretty high, I fear the sqlite standards might be
| even higher.
|
| E) It's kinda met but needs a bit more time to mature.
|
| F) I think this was more or less already met in 2017; if not,
| then it is by now.
| zinekeller wrote:
| > A) I don't think this is needed; you could just pin an older
| rust version. But if it's needed we are not quite there yet.
|
| The problem I think is that SQLite is also used in embedded
| systems, and they need to be predictable (for example, for all
| of its flaws C89 is still used in SQLite). So unless there is
| a subset that is that stable, they won't move yet.
| dathinab wrote:
| While there are a lot of good points there are some which are
| strange IMHO (wrt. the safe language section):
|
| > import complete binary SQLite database files from untrusted
| sources
|
| Standard sqlite3 contains features _which make opening databases
| from untrusted sources quite dangerous_ and as far as I know
| there is no "untrusted" open mode, though you might be able to
| compile a hardened/restricted sqlite. Either way it's true that
| using a "safe" language would not have helped here.
|
| > Safe languages insert additional machine branches to do things
| like verify that array accesses are in-bounds. In correct code,
| those branches are never taken. That means that the machine code
| cannot be 100% branch tested, which is an important component of
| SQLite's quality strategy.
|
| But sqlite favors the use of asserts, and what these languages
| insert are basically asserts...
|
| > Safe languages usually want to abort if they encounter an out-
| of-memory (OOM) situation.
|
| This might be true about go (idk.) but isn't this generalizing a
| bit too much?
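|
| For instance, a sketch assuming a Rust toolchain where
| Vec::try_reserve is available on stable (copy_fallible is a
| made-up helper): the allocation failure comes back as an error
| value the caller can handle, instead of aborting.

```rust
use std::collections::TryReserveError;

/// Copies `data` into a new buffer, reporting allocation failure as an
/// error instead of aborting the process.
fn copy_fallible(data: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve(data.len())?; // Err on allocation failure
    buf.extend_from_slice(data); // capacity was reserved above
    Ok(buf)
}

fn main() {
    assert_eq!(copy_fallible(b"abc").unwrap(), b"abc".to_vec());

    // An impossible request is reported as an error, not an abort.
    let mut huge: Vec<u64> = Vec::new();
    assert!(huge.try_reserve(usize::MAX).is_err());
}
```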
|
| Just to be clear: that sqlite is and stays in C makes total
| sense, I just feel the author tries a bit too hard to find
| arguments beyond the necessary ones.
|
| I mean:
|
| - When it was written there were no "safe" language
| alternatives to choose from.
|
| - Sqlite is _extremely_ well tested; many (most?) of the recent
| bugs were not of the kind any safe language would have
| prevented.
|
| - Safe languages don't necessarily help you with avoiding logic
| bugs. And while many do have additional abstractions/tooling to
| help with preventing logic bugs, rewriting sqlite is more likely
| to introduce more logic bugs than it would prevent in the long
| run. Rewrites are always a good chance to fix bugs but also
| to introduce new bugs.
|
| - Sqlite is extremely portable; none of the safe languages have
| quite the level of portability sqlite wants to provide.
___________________________________________________________________
(page generated 2021-08-23 23:02 UTC)