[HN Gopher] I Went to SQL Injection Court
       ___________________________________________________________________
        
       I Went to SQL Injection Court
        
       Author : mrkurt
       Score  : 1149 points
       Date   : 2025-02-25 18:39 UTC (1 days ago)
        
 (HTM) web link (sockpuppet.org)
 (TXT) w3m dump (sockpuppet.org)
        
       | tptacek wrote:
       | Kurt posted this to troll me. Just know my audience here was,
       | mostly, non-technical people involved in politics in my local
       | Chicagoland municipality.
       | 
       | Permit me a PSA about local politics: engaging in national
       | politics is bleak and dispiriting, like being a gnat bouncing off
       | the glass plate window of a skyscraper. Local politics is, by
       | contrast, extremely responsive. I've gotten things done ---
       | including a law passed --- in my spare time and at practically no
       | expense ( _drastically_ unlike national politics).
       | 
       | An amazing thing about local politics, at least in a lot of
       | places, is that they revolve around message boards. The boards
       | won't be in places you want to be (in particular: a lot of them
       | are Facebook Groups) and you just have to suck it up. But if you
       | enjoy participating in a community like HN, you can participate
       | in politics, too, and message-board your way towards making
       | things happen.
        
         | copypasterepeat wrote:
         | Would you care to elaborate which law you helped to pass?
         | 
         | Also, can you link to some good resources for someone who wants
         | to get off the sidelines and get more involved in Chicago
         | politics, whether the resources are on FB or elsewhere? I've
         | previously tried Googling for some but with very limited
         | success.
         | 
         | Thanks.
        
           | tptacek wrote:
           | We're the first municipality in Illinois to draft and adopt
           | an instance of ACLU's CCOPS model legislation, which requires
           | board approval at a recorded public board meeting before any
           | agency (most especially our police force) can adopt any form
           | of surveillance technology, given a broad (ACLU-supplied)
           | definition of "surveillance". Previous to that, our police
           | force could acquire arbitrary surveillance products so long
           | as they kept under a discretionary budget threshold; they
           | used that latitude to acquire a pilot deployment of Flock
           | ALPR cameras, and CCOPS was a response to that.
           | 
           | My real goal is zoning.
           | 
           | In Chicago itself, I have less clarity, but am optimistic
           | that somewhere on Facebook is a message board where the staff
           | at your alderman's office reads posts, and the most
           | politically engaged people in your neighborhood argue with
           | each other. That's your starting point (and maybe your ending
           | point). Just go, listen, and chime in with high-effort
           | comments. If you're used to clearing the bar for HN comments,
           | you're _way_ past the threshold of coding like a super-
           | thoughtful person in local politics.
        
             | pchristensen wrote:
             | > My real goal is zoning.
             | 
             | God speed to you sir! What is your goal wrt zoning?
        
               | tptacek wrote:
               | The categorical elimination of single-family zoning along
               | with any building envelope restrictions that would make
               | as-of-right 3-flats uneconomical.
        
               | pchristensen wrote:
               | That would be an outstanding outcome! Is this just for
               | Oak Park, or beyond?
        
               | tptacek wrote:
               | You'd hope that Oak Park, Evanston, Wilmette, and then
               | Berwyn and Schaumburg could get this done, and then your
               | next step would be either Chicago (tough because of
               | aldermanic structure) or statewide, the way California
               | did. Either way: you start in one municipality and work
               | from there.
               | 
               | It helps that zoning _matters_ more in Oak Park (and
               | Evanston) than almost anywhere else in Chicagoland.
        
               | pchristensen wrote:
               | Why does zoning matter more in Oak Park and Evanston?
               | High demand from being on the El and close to Chicago?
        
               | tptacek wrote:
               | Yep. Historically both of these places basically exist to
               | concentrate the interests of the upper middle class and
               | to reinforce segregation. They're both basically Chicago
               | but with a better funded school system (because lawyers
               | and doctors get to funnel all their property taxes into
               | the school down the street from them), which makes them
               | highly desirable.
        
               | ChrisBland wrote:
               | There is no way you get Wilmette to change zoning.
               | They've fought with Small Cheval about the size of their
               | sign for like 9 months. I doubt you'd get any village in
               | the NT district to rezone - the Optima project was
               | pulling teeth, everyone is worried about overcrowding NT,
               | which as a single HS is pretty packed now
        
               | tptacek wrote:
               | The whole project is going to take many years. Even if we
               | fix Oak Park zoning in the coming year, it'll still be
               | years before anything significant gets built, and years
               | past that for us to serve as a test case.
        
               | HDThoreaun wrote:
               | New Trier can just build another campus like they did for
               | freshmen.
        
               | Spivak wrote:
               | It might actually be easier to win the economics battle
               | by chipping away at restrictions on taller buildings. The
               | builders in my area are copy/pasting a 3-flat design all
               | over the place but it requires bargain-basement land
               | prices (literally building on former toxic waste dumps)
               | or money from the township because 3-flats make you have
               | to build wide.
        
               | tptacek wrote:
               | The muni I live in is very constrained (we're just 4
               | square miles, right on the border of the west side of
               | Chicago) and our land is overwhelmingly SFZ, so most of
               | the ballgame is getting SFZ lots opened up. The emerging
               | consensus is towards "missing middle" housing, which is
               | 2-40 units (but really, a medium term sweet spot in the
               | teens), where you're talking about buildings spanning
               | multiple lots.
               | 
               | That very little can economically be built on existing
               | SFZ lots even with relaxed zoning is actually a feature,
               | not a bug, for getting this done. People want change to
               | be slow. At least to begin with, it's better
               | strategically if it takes a couple years and gradual
               | tweaking to make lots of building happen.
        
               | cozzyd wrote:
               | Kam Buckner is trying to get something passed at the
               | state level (but wouldn't apply to Oak Park):
               | https://ilga.gov/legislation/BillStatus.asp?DocNum=3288&GAID...
        
               | btucker wrote:
               | A step in the right direction last week for the largest
               | upzoning effort in the city! https://archive.is/QuOcJ
               | 
               | Of course a vocal minority is fuming about higher
               | density.
        
               | fc417fc802 wrote:
               | Rather than the complete elimination of single family
               | (and by extension even larger lots) I feel like it ought
               | to follow something resembling an iterated 80/20 rule out
               | to huge rural lots at the far end. Notice that this would
               | imply a plurality of the land being zoned for the highest
               | density at any given time.
               | 
               | The thing that really kills density in most cases is the
               | height restrictions. A lot of the upzoning in my area has
               | resulted in ugly, wall-to-wall low-single-digit floor
               | count buildings with near zero setback. It's better than
               | single family but it isn't particularly dense and it's a
               | huge step backwards aesthetically.
        
         | hinkley wrote:
         | "Never doubt that a small group of thoughtful, committed
         | citizens can change the world: indeed, it's the only thing that
         | ever has." - Margaret Mead
        
           | Y_Y wrote:
           | Like a hedge fund? Or are we including those committed to
           | violence?
        
             | Terr_ wrote:
             | Probably not the intent of the attributed author [0] but
             | literally speaking the statement doesn't specify "ethical"
             | or "peaceful", no.
             | 
             | [0] https://quoteinvestigator.com/2017/11/12/change-world/
        
             | Muromec wrote:
             | Why would you ever _exclude_ ones committed to violence?
             | Violence consistently works.
        
             | 0x457 wrote:
             | The point is that it's a small, dedicated group that brings
             | change, not a government or private institution. If it's
             | still hard to grasp, then think about how national
             | movements started.
        
             | hinkley wrote:
             | Snipers, patient 0's, drunk drivers...
        
         | chaps wrote:
         | Aaaaaaa! I need to finish my post! :(
        
         | zahlman wrote:
         | >The boards won't be in places you want to be (in particular: a
         | lot of them are Facebook Groups) and you just have to suck it
         | up. But if you enjoy participating in a community like HN, you
         | can participate in politics, too, and message-board your way
         | towards making things happen.
         | 
         | How do you figure out where to go?
        
           | tptacek wrote:
           | The way you'd expect: I bumbled through a bunch of different
           | Facebook Groups, starting with the one simply labeled for my
           | neighborhood, and followed cross-posts. Eventually I found
           | the two really important ones in my area (one is an
           | organizing group for local progressives --- I live in a very
           | blue muni, and the other is the main high-signal political
           | group for the area, in which all the village electeds
           | participate).
        
         | skissane wrote:
         | > Local politics is, by contrast, extremely responsive. I've
         | gotten things done --- including a law passed
         | 
         | You live in a country where local governments have the power to
         | make laws... in a lot of other countries they don't - or, to be
         | more precise, their lawmaking power is extremely limited.
         | 
         | Actually, even in the US, that's often true too - only local
         | governments with "home rule" can enact laws on any topic
         | (provided it doesn't contradict state or federal law), those
         | without it can only enact laws on specific topics authorised by
         | the state legislature. Some states grant home rule to all
         | counties and municipalities, others none, others to some but
         | not others (e.g. in Texas a municipality can give itself home
         | rule powers, with approval of its voters, but only once it
         | reaches a population of 5000).
        
           | bobthepanda wrote:
           | Even state legislators are, by their nature, pretty much
           | locally driven given the relatively small size of their
           | constituencies and thus the margin of victory.
           | 
           | Voters significantly underestimate their power even up to the
           | House level; AOC's first campaign was very scrappy and
           | resulted in a bartender unseating the chair of the
           | House Democratic Caucus and likely successor to Nancy
           | Pelosi, and that was the first campaign in which anyone
           | bothered to primary him.
        
       | duxup wrote:
       | Very interesting read.
       | 
       | It does seem absurd to think of divulging schema as protected, as
       | described it allows for a magical sort of outcome where: "well
       | it's in a database you can't know anything about, and if you
       | can't tell me how to find it you're sol".
       | 
       | Working at a small company with lots of clients I wouldn't want
       | to hand out DB schema outright, but I also go out of my way to
       | search / get the client the data they want ... not reject them.
        
         | rectang wrote:
         | A private company wouldn't want to divulge their DB schemas
         | because it's advantageous for competitors to see how you're
         | doing things. That doesn't apply to government databases.
        
           | bornfreddy wrote:
           | Maybe. But now I'm _really_ curious how bad that schema must
           | be for them to hide it so viciously.
        
             | jrochkind1 wrote:
             | I think it's just an excuse to avoid making it feasible for
             | the public to get the data.
        
             | duxup wrote:
             | Your imagination can't cover how bad you might think it is
             | (and yet it isn't that bad).
             | 
             | Or at least I don't want to explain to "20 years later
             | Monday Morning Quarterback".
        
             | michaelmrose wrote:
             | Used to be relevant data was in a document but much is now
             | stored in specialized web apps whose data in turn is stored
             | in a db.
        
             | hot_gril wrote:
             | Maybe their schema has triggers and stuff
        
           | hinkley wrote:
           | Part of the reason I'm so... enthusiastic... about tech debt
           | is that I've worked a few times where we had a competitor
           | whose lunch we were stealing or who was stealing ours and the
           | ability or inability to copy features cheaply was
           | substantially the difference between us.
           | 
           | That quad graph of value versus difficulty that everyone
           | loves? It's not quadrants it's a gradient and the difficulty
           | dimension depends quite a bit on context. What's a 4
           | difficulty for me might be a 6 for someone else. Accidental
           | versus intrinsic complexity plus similarity to or
           | distinctions from things we have already done.
        
           | bob1029 wrote:
           | The schema on the last project I worked on was probably our
           | most important IP. Specifically, the ways in which we solved
           | certain circular dependency issues.
           | 
           | I wouldn't take the ability to design a schema for granted. I
           | don't think many people are any good at it. Do not
           | underestimate the value of your work products.
        
             | Xelynega wrote:
             | Is that not exactly what the person you're replying to is
             | saying?
             | 
             | Private companies don't divulge schema because it's
             | valuable IP.
             | 
             | Public entities IP belongs to the public, so there is
             | nothing to protect
        
           | chaps wrote:
           | Not quite, and the details get hairier the closer you look.
           | The database in question here is an IBM system. The database
           | itself is used for government functions, making it FOIA'able,
           | despite it being managed by a third party company. IBM even
           | tried to argue that the schema was trade secret, but the
           | statute isn't straightforward. Here's my (successful)
           | response when they tried:
           | 
           | You mentioned on Thursday over the phone that IBM is not too
           | keen on having its database schema released, and, between IBM
           | and Chicago, is seeking an exemption under 5 ILCS 140/7(1)(g)
           | - an exemption that is only valid if the release of records
           | would cause competitive harm. This email preemptively seeks
           | to address that exemption within the context of this request
           | in the hopes of a speedier release of records. It is FOI's
           | belief that there is little room for the case for the valid
           | use of 5 ILCS 140/7(1)(g) when considering the insignificance
           | of the records in conjunction with the release of past
           | documents:
           | 
           | 1. Chicago released CANVAS's technical specification [1]
           | seven years ago. To the extent that the specification's
           | continued publication does not cause competitive harm, it is
           | very unlikely that the release of CANVAS's database schema
           | would cause any harm.
           | 
           | 2. The claim that the release of a database schema would
           | cause competitive harm is not unlike suggesting that the
           | release of filing cabinets' labels can cause competitive
           | harm.
           | 
           | Furthermore, in your response, please be mindful that the
           | burden of proving competitive harm rests on the public body
           | [2].
           | 
           | [1] https://www.cityofchicago.org/content/dam/city/depts/dps/Con...
           | [2] http://foia.ilattorneygeneral.net/pdf/opinions/2018/18-004.p...
        
       | bobsmooth wrote:
       | What stands out to me about this article is the time between
       | court appearances. Seems like if you want to accomplish anything
       | in court you need to be prepared to spend years of your life on
       | it.
        
         | rectang wrote:
         | And of course, people and entities (private or as in this case
         | public) who have a lot of resources take advantage of that, a
         | state of affairs which often serves to perpetuate injustice
         | indefinitely.
        
         | barbazoo wrote:
         | I thought the same thing. Sure it's async but still you have to
         | keep this in your mind for a very long time.
        
         | lucb1e wrote:
         | Can confirm this is the case everywhere. Even before taking
         | anything to trial, one can spend months on trying to come up
         | with a mutually agreeable solution, in my case getting
         | seemingly one step further each time [1]. I'm not sure I'd not
         | just give up and move on with my life if this dragged on for
         | years and wasn't about something that majorly impacts my life
         | or that of a loved one.
         | 
         | [1] Details: it was a warranty case, so first they agreed to
         | repair it, then they didn't do that (but maintained that they
         | were going to, whenever I asked about the status), then they
         | agreed to refund, then they didn't do that, then I set a
         | deadline, they iirc agreed, then they didn't pay, then I
         | included specifics of what my next steps would be (lots of
         | research here, seeing what even my options are and what I can
         | truthfully claim that won't get shot down by a judge later) if
         | they didn't pay before some other deadline (so I showed I was
         | serious now), then the deadline crept up and they finally
         | refunded the day before it would expire and I was frankly
         | disappointed because, by now, I was prepared and ready, and all
         | I got was the original sum that I had paid them. I checked the
         | legal interest rate and changing my demand to include that
         | simply wasn't worth wasting more time on this, and I didn't
         | find any sort of precedent that I could bill any time I
         | provably spent, not even to the value of minimum wage, so any
         | time you invest is just lost free time (which I didn't have
         | much of during that particular year). Protip: scroll down the
         | reviews before buying something worth more than a few tenners
         | from a small store. I wasn't the first person who had to
         | threaten litigation...
        
       | wswope wrote:
       | Anyone with a legal background willing to opine about potential
       | workarounds to this ruling?
       | 
       | Specifically, would a request for "data field labels" (i.e. a
       | column list without any table structure info) likely circumvent
       | the exemption?
        
         | gpm wrote:
         | I think that would run afoul of
         | 
         | > The one big limitation of Illinois FOIA (with FOIA laws
         | everywhere, really) is that you can't use them to compel public
         | bodies to create new records.
         | 
         | Unless for some reason they already had a list of columns
         | without table structure.
         | 
         | (Not that I claim to have a legal background)
        
           | duxup wrote:
           | Yes but what if we come up with a directive that every FOIA
           | request must be logged into a DB. Therefore every request is
           | automatically invalid as it requires we create a record!
           | 
           | /s
        
           | wswope wrote:
           | I had that thought too, but my naive rebuttal would be that
           | the column data already exists by default in any standard
           | RDBMS as information_schema.columns. No new record creation
           | required.
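A rough sketch of wswope's point, assuming SQLite as a stand-in. SQLite has no information_schema, so the equivalent catalog here is sqlite_master plus PRAGMA table_info. The table and column names are hypothetical; nothing here reflects the real CANVAS schema. It suggests that a bare list of column labels can be read out of the catalog the engine already maintains, although, as the reply notes, someone still has to run the query:

```python
import sqlite3

# Hypothetical stand-in schema; the real CANVAS tables are not public.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, plate TEXT, issued_on TEXT);
    CREATE TABLE officers (badge INTEGER PRIMARY KEY, unit TEXT);
""")

def column_labels(conn):
    """Return a sorted, de-duplicated list of column names across all tables,
    with no indication of which table each one belongs to."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    labels = set()
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            labels.add(row[1])
    return sorted(labels)

labels = column_labels(conn)
print(labels)
```

On an engine that does expose information_schema.columns, the loop would presumably collapse to a single SELECT of column_name; whether running it counts as "creating a new record" is the legal question, not a technical one.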
        
             | 0x457 wrote:
             | Yes, but that requires someone to execute a query on a
             | database and package it as a report?
        
         | Andys wrote:
         | Not a lawyer, but why not use open source as an example? Many
         | successful public e-commerce websites have public schemas and
         | aren't all hacked.
        
       | pavon wrote:
       | Great read. Frustrating that the court ruled that a schema was a
       | file layout, since I don't think it is, but at the same time if
       | it didn't fall under that exception, there is a strong argument
       | that it would be considered "documentation pertaining to all logical
       | ... design of computerized systems". A schema is, literally, the
       | logical design of the database, and the database is a part of the
       | computerized system. Once it was ruled that those examples are
       | "per se" exempt it was a long shot to argue that schema wasn't
       | covered by any of the examples.
        
         | paulddraper wrote:
         | How is a database schema not a file layout?
        
           | kasey_junk wrote:
           | The article describes why. 2 different db engines (or even
           | instances) can use different file layouts for the same
           | schema.
           | 
           | In many ways SQL is all about divorcing the schema from the
           | files.
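A minimal sketch of the claim above, assuming SQLite and a made-up one-table schema. The same CREATE TABLE statement and the same rows are written into two database files that differ only in page size, an engine-level storage setting. The stored schema text comes back identical while the bytes on disk do not:

```python
import os
import sqlite3
import tempfile

# Hypothetical schema, used only for illustration.
SCHEMA = "CREATE TABLE tickets (id INTEGER PRIMARY KEY, plate TEXT)"

def build(path, page_size):
    """Create a database with the given page size, same schema, same rows.
    Returns the schema text SQLite stored and the raw file bytes."""
    conn = sqlite3.connect(path)
    # page_size only takes effect if set before the first write to a new file.
    conn.execute(f"PRAGMA page_size = {page_size}")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO tickets (plate) VALUES (?)",
                     [("ABC123",), ("XYZ789",)])
    conn.commit()
    stored_schema = conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'tickets'").fetchone()[0]
    conn.close()
    with open(path, "rb") as f:
        return stored_schema, f.read()

with tempfile.TemporaryDirectory() as tmp:
    schema_a, bytes_a = build(os.path.join(tmp, "a.db"), 1024)
    schema_b, bytes_b = build(os.path.join(tmp, "b.db"), 8192)

same_schema = schema_a == schema_b
same_file = bytes_a == bytes_b
print(same_schema, same_file, len(bytes_a), len(bytes_b))
```

This is one engine with one knob turned; across different engines the divergence would be larger still, which is the distinction the comment is drawing.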
        
             | tptacek wrote:
             | Another way to think about it is that if a SQL schema is a
             | file, so is an Excel spreadsheet template.
        
               | hot_gril wrote:
               | File or file layout? Cause both of these are probably
               | stored as files, .sql and .xltx respectively.
        
               | paulddraper wrote:
               | An Excel spreadsheet template is an arrangement of
               | rows/columns/cells which is encoded in a XML document
               | which is encoded in a ZIP file archive.
        
               | tptacek wrote:
               | I don't follow your point.
        
               | paulddraper wrote:
               | Yes, it's a file format.
               | 
               | (Kinda a file format inside a file format inside a file
               | format.)
        
               | tptacek wrote:
               | "Excel" is a file format, but my point is that if a
               | schema is a file format, so are _the contents of_ an
               | Excel spreadsheet.
        
               | atkulp wrote:
               | It's interesting that the opening analogy in the post
               | uses an Excel spreadsheet as a great way to explain a
               | database. It's such an easy next step to say the way an
               | xls/ods file is saved is a file format but the column
               | layout in the tabs/tables are the schemas. The court (and
               | the city) playing these games is so scary since it is so
               | biased toward all modern government data being covered by
               | FOIA exemptions.
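The spreadsheet analogy can be sketched in a few lines, using CSV and JSON as stand-ins for xls/ods and hypothetical column names and rows. The same column layout survives a round trip through two different file formats, which is roughly the distinction between "schema" and "file format" being argued here:

```python
import csv
import io
import json

# Hypothetical column layout and rows, for illustration only.
COLUMNS = ["ticket_id", "plate", "issued_on"]
ROWS = [[1, "ABC123", "2018-04-01"], [2, "XYZ789", "2018-04-02"]]

def as_csv(columns, rows):
    """Serialize the table as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()

def as_json(columns, rows):
    """Serialize the table as a JSON array of objects keyed by column."""
    return json.dumps([dict(zip(columns, row)) for row in rows])

def columns_of_csv(text):
    """Recover the column layout from the CSV header row."""
    return next(csv.reader(io.StringIO(text)))

def columns_of_json(text):
    """Recover the column layout from the first JSON object's keys."""
    return list(json.loads(text)[0].keys())

csv_text = as_csv(COLUMNS, ROWS)
json_text = as_json(COLUMNS, ROWS)
print(columns_of_csv(csv_text) == columns_of_json(json_text))
print(csv_text == json_text)
```

The column list is the same in both cases; only the encoding around it changes.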
        
             | ludston wrote:
             | But on the other hand, in all database systems the schema
             | is used to determine how the files are laid out. Although I
             | suppose the same thing could be argued for any data that is
             | stored in a file, excepting that a schema is metadata that
             | determines the organisation of data so it's a bit of a
             | special case.
        
               | tptacek wrote:
               | In a Microsoft Word document, the section headings also
               | tell Word how to lay out the Word document file.
        
               | hot_gril wrote:
               | Do you mean that section headings aren't a file layout?
               | That's their entire purpose.
               | 
               | Edit: If you're talking about the byte representation
               | only, I don't think section headings indicate the
               | placement of the body's bytes.
        
               | tptacek wrote:
               | You have found an argument that proves too much.
        
               | Xelynega wrote:
               | Yeah, coupled with the court's arguments, the interpretation
               | of sections in a document as a "file format" means no
               | files with sections can be released via FOIA requests.
        
               | Xelynega wrote:
               | Does your interpretation not mean that (coupled with the
               | court ruling that file formats can't be FOIA'd) any
               | document with sections cannot be requested via FOIA?
        
               | hot_gril wrote:
               | If this format is reused across many files, they might
               | want to give the contents of those docs in a different
               | format from the original.
        
               | ludston wrote:
               | Arguably, all requests for files could be returned with
               | all of the letters in the document but scrambled in a
               | random order so as to obfuscate the file layout.
        
             | hot_gril wrote:
             | There's a solid chance that the schema gives away what DBMS
             | is being used. But even if it didn't, I'd still call it a
             | file layout in this context.
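To make the "schema gives away the DBMS" claim concrete, here is a toy heuristic, not a real fingerprinting tool. It assumes the schema arrives as DDL text and looks for a handful of dialect-specific keywords. The keyword list is illustrative and incomplete, and a schema written in plain standard SQL would give nothing away:

```python
import re

# Toy heuristics: dialect-specific keywords that tend to appear in DDL.
# Illustrative only; the list is not exhaustive and can misfire.
DIALECT_HINTS = {
    "PostgreSQL": [r"\bSERIAL\b", r"\bJSONB\b", r"\bTIMESTAMPTZ\b"],
    "MySQL": [r"\bAUTO_INCREMENT\b", r"\bENGINE\s*=\s*\w+"],
    "SQLite": [r"\bAUTOINCREMENT\b", r"\bWITHOUT\s+ROWID\b"],
    "SQL Server": [r"\bIDENTITY\s*\(", r"\bNVARCHAR\s*\(\s*MAX\s*\)"],
}

def guess_dbms(ddl):
    """Return the dialects whose hint patterns appear in the DDL text."""
    return sorted(
        name for name, patterns in DIALECT_HINTS.items()
        if any(re.search(p, ddl, re.IGNORECASE) for p in patterns)
    )

print(guess_dbms("CREATE TABLE t (id SERIAL PRIMARY KEY, doc JSONB)"))
print(guess_dbms("CREATE TABLE t (id INT AUTO_INCREMENT) ENGINE=InnoDB"))
print(guess_dbms("CREATE TABLE t (id INTEGER, name VARCHAR(20))"))
```

Even when a guess lands, knowing the engine is a separate matter from the schema being a file layout, which is what the replies below go on to dispute.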
        
               | tptacek wrote:
               | So?
        
               | hot_gril wrote:
               | So if you have the schema and the DBMS, you probably know
               | how data is arranged in the files ("files" in the
               | filesystem sense).
        
               | hyperpape wrote:
               | The parent asks "how is it not a file layout" not "can
               | you guess the file layout?" given it.
               | 
               | I am a human, you know I have a kidney, but I am not a
               | kidney.
        
               | hot_gril wrote:
               | If you send a copy of the code, is that sending the code?
               | If it is, what about sending a copy of the code with a
               | Caesar Shift?
        
               | chaps wrote:
               | Is your argument that government agencies should also
               | withhold the names of filing cabinet manufacturers? :)
        
               | hot_gril wrote:
               | Just that it's a file layout. Or even if you strictly
               | define a file layout as say an ext4, NTFS, or FAT file
               | tree, that revealing the schema is revealing the file
               | layout.
               | 
               | I don't know why they don't want to reveal file layouts,
               | but for whatever reason, they decided it was "per se"
               | exempt regardless of the security implications.
        
               | tptacek wrote:
               | It's obviously not a file format. The same SQL schema can
               | generate N different files, with N different layouts, for
               | N different databases. By the logic you're using
               | ("schema" + "database vendor" = "file format"), a Word
               | document outline is also a file format.
        
               | chaps wrote:
               | The DBMS is almost definitely going to be mentioned in
               | RFP or specification documentation. As it was in this
               | lawsuit.
        
               | darkarmani wrote:
               | The gov't releasing the hardware and software licencing
               | used in CANVAS already gives that away.
        
           | michaelmrose wrote:
           | Because it doesn't describe how data is laid out on disk.
        
             | hot_gril wrote:
             | Neither does a file layout. FS will decide that... even
             | then, not physically.
        
               | kelnos wrote:
               | We're talking about "file layout" at the application
               | level, not the filesystem level.
               | 
               | But your comment illustrates just how difficult it is to
               | nail these things down, based on inherently imprecise
               | language.
        
               | hot_gril wrote:
               | So you mean the filetree and file contents, as seen by
               | a userspace program?
               | 
               | It's meant to be imprecise, because they didn't want some
               | "gotcha." If they say we won't reveal the disk layout,
               | technically you can't tell that from the filetree. If
               | they won't reveal the filetree, but this is SQLite, it's
               | always a single file. If it's file tree + contents, well
               | the CPU byte endianness might matter for some DBMSes,
               | even though you could just try both.
        
               | 0x457 wrote:
               | We can't FOIA details about how an xls file is laid out
               | internally, despite that xls file being FOIA'ble itself.
               | That's the file-format we're talking about.
        
           | dools wrote:
           | The schema describes the database layout. The file layout (if
           | you were going to call it that) in a modern RDBMS would
           | describe how the RDBMS implemented a particular database
           | layout as described by the schema.
        
           | hyperpape wrote:
           | It literally does not describe a file, and does not literally
           | describe the data layout of anything on disk (though with
           | enough knowledge, you may be able to infer facts about
           | probable layouts).
        
             | paulddraper wrote:
             | > does not literally describe the data layout of anything
             | on disk
             | 
             | Huh? Depends on the DBMS, but each InnoDB table is a file.
             | 
             | And the schema determines the file structure.
        
               | hyperpape wrote:
               | > but each InnoDB table is a file.
               | 
               | A table isn't a schema, it is a component of a schema,
               | and most databases don't use InnoDB.
        
               | paulddraper wrote:
               | > it is a component of a schema
               | 
               | So if you have the schema, you have the tables.
        
               | kelnos wrote:
               | Schema is an abstraction over the file structure.
               | Different RDBMSes will use different file layouts for a
               | given schema. The same RDBMS may even have different
               | engines that use different file layouts, or may change
               | file layout between major versions.
               | 
               | "Determines" is too weak: it must be "is". If "schema is
               | file layout" is true, then sure, a schema is a file
               | layout. But if it is merely "schema determines file
               | layout", then no, a schema is not a file layout.
        
               | hot_gril wrote:
               | Abstractions are notoriously leaky in DBMSes. First off,
               | they don't even use the same SQL spec. Give me a schema
               | that uses anything Postgres-specific, and I can tell you
               | what the bytes on disk look like for a given row or
               | index.
               | 
               | I think it's a moot point anyway because the language is
               | broader than just files in the filesystem sense, which is
               | basically what the court said too.
        
         | hot_gril wrote:
         | Schema is definitely software, an operating protocol, source
         | code, and file layout. Maybe also documentation.
        
           | tptacek wrote:
           | A schema isn't software in the sense imagined by the ILGA. If
           | it was, every Excel spreadsheet would be too, and Excel
           | spreadsheets are the basic currency of FOIA.
           | 
           | An "operating protocol" is a step-by-step list of things to
           | accomplish some action. It's a finite state machine for
           | humans. Obviously, a schema isn't that; a schema is
           | declarative, and an operating protocol is imperative.
           | 
           | The court definitively established that SQL schemas aren't
           | source code in the sense imagined by the ILGA. SQL queries
           | can be. Schemas are not.
           | 
           | See downthread for why a schema isn't a file format. In fact,
           | a schema is almost the opposite of a file format.
           | 
           | A court will look at the term "documentation" in the ordinary
           | sense of the word; as in, "a prose description and set of
           | instructions".
           | 
           | "Associated with automated data processing operations" isn't
           | an element in the statute; it's a description of all of the
           | elements.
        
             | hot_gril wrote:
             | If the Excel spreadsheet has formulas in it, it's software.
             | If you're just talking about the data in the sheet, i.e.
             | what you'd get exporting it as a CSV, then it's not.
             | 
             | Col types, unique/FK/PK constraints, default values, and
             | computed cols define the steps for handling row
             | inserts/updates/deletes. Even adding a uniqueness
             | constraint to an already-unique col will change how the
             | code interacts with it, specifically how it deals with
             | concurrency/locking. If they said it has to be an
             | imperative programming language, then it's not that.
             | 
             | If they said the schema isn't source code then ok, but I
             | still think it is.
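
The point about constraints and defaults shaping how inserts are handled can be tried out directly. The following is only a minimal sketch using Python's built-in sqlite3 module; the `tickets` table and its columns are hypothetical and are not taken from CANVAS or any real system.

```python
import sqlite3


def demo_schema_behaviour():
    """Show that a declarative schema changes what happens on INSERT."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the DEFAULT and UNIQUE clauses are the only "logic".
    conn.execute(
        """
        CREATE TABLE tickets (
            id     INTEGER PRIMARY KEY,
            plate  TEXT NOT NULL UNIQUE,
            status TEXT NOT NULL DEFAULT 'open'
        )
        """
    )
    # No status supplied: the schema's DEFAULT fills it in.
    conn.execute("INSERT INTO tickets (plate) VALUES ('ABC123')")
    default_status = conn.execute("SELECT status FROM tickets").fetchone()[0]

    # Same plate again: the UNIQUE constraint makes the DBMS reject the row.
    try:
        conn.execute("INSERT INTO tickets (plate) VALUES ('ABC123')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    return default_status, duplicate_rejected
```

Whether that behaviour makes a schema "software" in the statutory sense is the legal question argued in this subthread; the sketch only shows the technical behaviour being described.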
        
               | tptacek wrote:
               | I assure you that Excel spreadsheets with formulas in
               | them are FOIA-able in Illinois. Since we can take that as
               | axiomatic, I think we can put "schemas are software" to
               | bed.
        
               | hot_gril wrote:
               | SQL schemas aren't Excel spreadsheets.
        
               | tptacek wrote:
               | That's fascinating, but you just claimed Excel
               | spreadsheets were "software" in the sense of the Illinois
               | FOIA statute definition, and they are not. QED.
        
               | hot_gril wrote:
               | You said that SQL schemas aren't software, and that's
               | what this lawsuit was about. If they explicitly say that
               | Excel docs (even w/ formulas) aren't software, I think
               | they're wrong, but that doesn't matter because Excel docs
               | aren't SQL schema.
               | 
               | Now if you want to go by Illinois definitions, SQL
               | schemas are file layouts, that's why the plaintiff lost.
        
               | tptacek wrote:
               | Again: the post explains why the court determined schemas
               | to be file layouts, and none of it involves any of the
               | logic you've supplied here. Even Chicago didn't try to
               | claim that a schema was "software".
        
               | hot_gril wrote:
               | They didn't need to. In the first appeal, it didn't
               | matter because it didn't jeopardize security. In the
               | second appeal, they said it's a file layout.
               | 
               | You also said SQL schemas are declarative, not
               | imperative. Those are types of programming languages, so
               | software.
        
             | n_plus_1_acc wrote:
             | An Excel formula should be considered a kind of software,
             | because you can do code golf in it.
        
           | pavon wrote:
           | I think a schema will definitely be part of the source
           | listing, either in the main programming language source code
           | or in some other file used to define or initialize the
           | database. But I don't think it _is_ software, any more than a
           | protocol is software. Software does something.
           | 
           | One tricky aspect of this is that even if the schema itself
           | as a higher level concept doesn't fit into any of those
           | definitions, all existing _instances_ of the schema are
           | likely considered either source listings or documentation. So
           | the instances are barred from release per se, and you can't
           | ask the government to create new documents.
        
             | hot_gril wrote:
             | The schema defines how the DBMS sets up its tables and
             | such, so it does quite a bit imo. And if the schema isn't
             | stored in any doc because someone just manually punched in
             | CREATE TABLE once, then yeah, what you said about creating
             | new docs applies.
        
         | gregw2 wrote:
         | I completely agree with you that (unlike/despite the Supreme
         | Court ruling), database table/column schema design (and other
         | system designs) should fall under the Illinois statute as
         | "documentation pertaining to all logical and physical design of
         | computerized systems". It's interesting that the law did pick
         | up on that distinction between logical and physical design but
         | none of the parties described in this article did.
         | Logical/physical designs are not just about servers and
         | integrations, they are also about data.
         | 
         | I'm not sure why that wasn't argued by the state and the state
         | argued the database schema was a "file format". Per my
         | reasoning, the state still would have won, but for different
         | reasons.
         | 
         | I disagree with you slightly however and would say that the
         | schema table/column names should be considered not logical but
         | "physical design" while the business naming/meaning of tables
         | would be a "logical design" (or conceptual design). See
         | Wikipedia: https://en.wikipedia.org/wiki/Logical_schema
         | 
         | SQL injection is really about physical schema designs, not
         | logical ones (I do get that every bit of information including
         | business naming of tables/columns helps in an attack, but it
         | does change the degree of threat and thus the balancing tests
         | of the risk which are relevant per the definitions and case law
         | described in the original article.)
         | 
         | So in terms of what the law /SHOULD/ be, the law should _not_
         | include logical design as a security exception, only physical
         | design. It /SHOULD/ be possible for citizens to do FOIA
         | requests and get a logical understanding of all the database
         | fields without giving them the SQL names that can accelerate
         | SQL injection attacks. In that way citizens could ask for the
         | data by a logical/business-named handle rather than a physical
         | one.
         | 
         | And the state should create logical models or provide data
         | dictionaries with business (not technical) terms on request as
         | part of their FOIAable obligations to their citizens for the
         | data they are maintaining.
         | 
         | My 2 cents as someone designing database schemas for 25+ years.
        
       | hnthrow90348765 wrote:
       | >just self-important message-board hedging
       | 
       | I can confidently say it does not stop at message boards for many
       | people, self included
        
         | tptacek wrote:
         | It's a real issue when writing an affidavit or testifying. Lots
         | of ingrained bad habits.
        
       | gowld wrote:
       | This is part of what discouraged me from going to law school. So
       | much of litigation is Kabuki theater, grand rhetoric not in any
       | way intended at achieving a just or logical outcome, but
       | designed only to give the person in power an excuse to decide
       | however they had already wanted to decide before the case was
       | tried.
        
         | lucb1e wrote:
         | > So much of litigation is Kabuki theater, grand rhetoric not
         | in any way intended at achieving a just or logical outcome
         | 
         | Agreed, that is what this sounds like. What stood out to me is
         | the remark >>"only marginal value" is just self-important
         | message-board hedging<<: it's also simply correct, but the
         | author concluded that they shouldn't have said it because
         | "marginal" plus a bunch of explanation didn't have the
         | rhetorical value that "no" would have had
         | 
         | Someone could legitimately configure a WAF-like system to scan
         | for various ways of querying the database schema coming in as
         | HTTP requests (keywords like "information_schema", encodings
         | thereof, etc.), which will always be hacking attempts and can
         | be blocked. If you already have the schema, you can craft a
         | query without needing to bypass that restriction first. Is this
         | likely to be a serious barrier at all? No. Is it anything to do
         | with self-importance? I don't see how that's the case, either.
         | It seems simply correct that this is marginal (situated in the
         | margins, not the point, not important to discuss), but by
         | saying nothing but the truth, now the other side blows that up
         | to something much bigger and tries to get the court to agree
         | that, "see, their own expert says it has value!" And so this
         | expert concludes that they shouldn't have said it, that they
         | should have just said "no value" which I would say is wrong,
         | but _so marginally_ wrong that it's hard to prove for the
         | opposing side that it is not fully correct, and thus being less
         | correct helps you in (this) court... so it's about rhetoric as
         | much as being an expert...
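
The WAF-like check described above could look roughly like the sketch below. It is an illustration under stated assumptions, not a real product's rule set: the keyword list and the function name `looks_like_schema_probe` are hypothetical, and a real filter would need to handle many more encodings.

```python
from urllib.parse import unquote_plus

# Hypothetical keyword list; a real filter would be broader and tuned per DBMS.
SCHEMA_PROBE_KEYWORDS = ("information_schema", "sqlite_master", "pg_catalog")


def looks_like_schema_probe(raw_query_string: str, max_decode_rounds: int = 3) -> bool:
    """Return True if the request text mentions a schema-introspection table.

    The text is URL-decoded repeatedly (up to max_decode_rounds times) so that
    single- and double-encoded payloads are both checked.
    """
    text = raw_query_string
    for _ in range(max_decode_rounds + 1):
        lowered = text.lower()
        if any(keyword in lowered for keyword in SCHEMA_PROBE_KEYWORDS):
            return True
        decoded = unquote_plus(text)
        if decoded == text:  # nothing left to decode
            break
        text = decoded
    return False
```

An attacker who already holds the schema never has to send one of these keywords, so the filter never fires. That is the small hurdle the comment calls "marginal".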
        
       | chaps wrote:
       | Hi everyone, I'm the plaintiff in this lawsuit. I'm still working
       | on my companion post for tptacek's post! I'll have it ready
       | Soon(TM), but feel free to ask me any questions in the meantime here.
       | 
       | While you're waiting, check out this older post:
       | https://mchap.io/that-time-the-city-of-seattle-accidentally-...
        
         | doctorpangloss wrote:
         | What are the administrators of CANVAS hiding?
        
           | chaps wrote:
           | Hard to say. One of my personal drivers for this lawsuit is a
           | tip I received that said that Chicago has a list of vendors
           | whose tickets are dropped in the back-end. When I requested
           | that info, the city said they had no such list. I trust my
           | source, so having schema information could help figure out
           | the extent and if they were lying.
        
             | noboostforyou wrote:
             | Considering how much they fought to not release the schema,
             | there's probably a column named "exempt_from_penalty" or
             | something equally obvious.
        
               | thaumasiotes wrote:
               | If they lose in court they have to pay court-determined
               | attorneys' fees. That might be sufficient to get them to
               | appeal automatically.
               | 
               | This is a tension you sometimes see discussed in the
               | context of wrongful imprisonment, where one faction says
               | that if you get tossed in jail for 30 years over
               | something there was never any evidence that you did, the
               | state should have to pay a penalty, and another faction
               | says that if you penalize the state for randomly
               | imprisoning innocent people, those people will never be
               | allowed out of jail.
        
             | MBCook wrote:
             | Well that certainly sounds suspicious. But it could also
             | provide more damning evidence of targeting groups, people
             | skimming the till, bribes to make tickets go away, all sorts
             | of fun shenanigans.
             | 
             | And boy they're fighting suspiciously hard.
             | 
             | Good luck.
        
               | Muromec wrote:
               | Bribes are most certainly not logged in the system under
               | the "bribes" column or codified in any way. The data
               | discovered through FOI could show some patterns which are
               | suggestive of bribes, but the actual thing is negotiated
               | "off chain".
        
               | MBCook wrote:
               | That's what I meant. For example, people who have a
               | suspicious number of tickets dismissed. Or perhaps
               | certain employees that dismiss a suspicious number.
        
             | 9dev wrote:
             | Earnest question: If you suspect them of lying on the
             | issue, why would you trust them to release the full schema
             | in response to the FOIA request, and not just omit any
             | possibly incriminating columns?
        
               | cyanydeez wrote:
               | Many times the people answering the requests aren't part
               | of the conspiracy to commit random acts of malice.
               | Sometimes they're roped into it under threat of
               | termination.
               | 
               | And often times, the denials eventually lead to
               | significant reorg once judges and Congress can revise
               | laws to fix the ambiguities.
        
               | jrockway wrote:
               | It's always a possibility that some low level official
               | not in on the scam sees the FOIA request before
               | management tells them not to work on it. The more you ask
               | for, the less filtering there is going to be, simply
               | because of how people work.
               | 
               | If you're running the scam, you don't want to tell low
               | level employees about it, because they have no incentive
               | not to blow the whistle.
        
               | Muromec wrote:
               | Because this is not how government works. Most of the
               | time it's not a heavily entrenched conspiracy. Once the
               | request is approved to go through by the legal
               | department, some technician will happily give you
               | everything you want and it won't be censored or tampered
               | with in process.
        
               | tptacek wrote:
               | How is this different from literally any other FOIA
               | transaction, computer-y or otherwise?
        
               | doctorpangloss wrote:
               | What is the theory then for why they do not want to
               | release this schema? Don't misunderstand me I appreciate
               | how important it is that people push the boundaries of
               | FOIA.
        
               | tptacek wrote:
               | The statute says they're not required to. For a couple
               | years, the statute did say that they had to, as we won
               | multiple cases in lower courts, but Chicago appealed to
               | the Illinois Supreme Court, and the outcome was that now
               | the statute exempts schemas.
        
               | lmm wrote:
               | By that logic there's no point investigating any crime or
               | doing any kind of audit. You increase the costs of
               | covering up, and put them in a dilemma - remember this is
               | exactly what brought down Nixon.
        
               | MBCook wrote:
               | The other answers here are great, but let's say you're
               | right.
               | 
               | If you release a whole DB of data you're going to have a
               | hard time covering something you removed up in such a way
               | that it's not noticeable. Gaps in keys, suspiciously
               | missing data for certain queries, etc.
               | 
               | Even if you do that perfectly, there are other data
               | sources to compare to. If the city said it issued 2500
               | parking tickets and made $7500 in January on some
               | financial report and the DB disagrees you have proof
               | something is going on.
               | 
               | Or you could crowd source people's parking tickets to
               | compare to the DB to see if everything matches. What
               | happens if one doesn't? If one's missing but the person
               | had the proof they paid it?
               | 
               | It could still prove useful.
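
Two of the checks mentioned above, gaps in keys and reconciling against a separately published total, can be sketched in a few lines. This is a minimal illustration; the field name `amount_paid` and both function names are hypothetical, and real primary keys can have innocent gaps (rolled-back inserts, deleted test rows), so a gap is a reason to ask questions rather than proof of tampering.

```python
def missing_ids(ids):
    """Return the ids absent from an otherwise consecutive integer sequence."""
    present = set(ids)
    if not present:
        return []
    return [i for i in range(min(present), max(present) + 1) if i not in present]


def totals_match(rows, reported_count, reported_revenue):
    """Compare released rows against figures from a separate financial report.

    Each row is a dict with a hypothetical integer 'amount_paid' field.
    """
    count = len(rows)
    revenue = sum(row["amount_paid"] for row in rows)
    return count == reported_count and revenue == reported_revenue
```

For example, `missing_ids([1, 2, 4, 5, 8])` reports 3, 6 and 7 as absent, and `totals_match` fails as soon as the released rows and the published figures disagree.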
        
           | butlike wrote:
           | 'ethnicity' header, 'net_income' header... wouldn't doubt
           | Chicago could be cave man enough to do this
        
         | hathawsh wrote:
         | Kudos to you for enduring through this fight! We can only
         | achieve transparency when people choose not to be complacent.
         | Thank you.
         | 
         | What do you think are the next steps?
        
           | chaps wrote:
           | My first step is to actually finish my post :)
           | 
           | But after that, getting a reasonable law passed to fix this
           | now-broken nonsense.
        
         | mmaunder wrote:
         | Thanks for fighting the good fight for us all!
        
         | hn_user82179 wrote:
         | This older post was such a fantastic read, thanks for sharing
         | your story!
        
           | layoric wrote:
           | It's dated from ~2 weeks ago... is there other date
           | information I am missing?
        
             | hn_user82179 wrote:
             | ah no, I just said "older" since OP said it was older and I
             | wanted to distinguish from the SQL post that this post is
             | about
        
             | zettabomb wrote:
             | The HN post [0] is from February 9th, 2025, but the post
             | the person you replied to was referencing [1] is from
             | October 19th, 2018.
             | 
             | [0] https://sockpuppet.org/blog/2025/02/09/fixing-illinois-foia/
             | [1] https://mchap.io/that-time-the-city-of-seattle-accidentally-...
        
         | notjulianjaynes wrote:
         | Damn, this is impressive. I've been fighting with a state
         | agency since December for 17,000 emails. I don't think I've
         | ever tried to request emails and received zero push-back, but a
         | $33 million estimate just, _chef's kiss_
        
         | maCDzP wrote:
         | Have you tried looking for information from the developer about
         | CANVAS? With any luck the developer has support documentation
         | online that describes CANVAS and maybe you'll be able to narrow
         | down your FOIA request.
        
           | manquer wrote:
           | I think the point of the lawsuit is less about CANVAS schema
           | itself and more about the ability of the government to hide
           | this kind of information from FOIA requests.
        
         | foota wrote:
         | > Normally, a flustered public records officer would just
         | reject a giant request for being "unduly burdensome"... but
         | this sort of estimate is practically unheard of. So much so
         | that other FOIA nerds have told me that this is the second
         | biggest request they've ever seen. _The passive aggression is
         | thick_. Needless to say, it's not something I'm willing to pay
         | for!
         | 
         | Welcome to Seattle :-)
        
           | geoduck14 wrote:
           | > that's the second biggest FOIA request I've ever seen!
           | 
           | -Guybrush, from The Secret of Monkey Island
        
         | foota wrote:
         | Out of curiosity, could you ask for something like "one row of
         | data from every table in the CANVAS database"?
        
           | mbreese wrote:
           | This is a technical solution to a people problem. My reading
           | is that the city doesn't want to give up this information. If
           | that's the case, a technical solution wouldn't work, no
           | matter how easy it is. And given that this has already gone
           | to the Illinois Supreme Court (and lost), the only solution
           | is what is discussed at the end: updating the law.
        
             | foota wrote:
             | I agree this is something of a technical solution, but the
             | court wasn't interpreting whether you could ask for rows
             | from a database, but whether you could ask for the schema
             | directly. I don't think the court had the option of saying
             | "you can't ask for the schema, but asking for a sample row
             | is ok".
        
               | chaps wrote:
               | The short answer is yes, you can do this. I've seen this
               | work for emails, where the request is basically, "Give me
               | the most recent email of blah@gov.com".
               | 
               | And yeah, the plan was to eventually submit a batch of
               | requests using the table names, similar to `SELECT * FROM
               | {table_name_from_schema_request} LIMIT 1`, but one FOIA
               | request per-table.
        
               | cyanydeez wrote:
               | Seems like you could ask for a verbally masked
               | description? Like an enigma coda specific to the FOIA.
               | 
               | "Describe to me the columns, in simple non-programmatic
               | english, and what the purpose of the table is for, for
               | each table related to parking tickets"
               | 
               | Essentially a human to schema DSL that is only
               | technically decipherable by the admin of the database.
               | Then you're not having actual code and only the admin
               | could decipher.
               | 
               | But yah, as you said, if the humans don't want to
               | disclose their foibles, how the request is filled is
               | technically meaningless.
        
               | chaps wrote:
               | I wish it were that easy. I'll go more into this
               | specific question in my post, but the short answer is
               | that FOIA does not statutorily require the creation of
               | new records in response to a request. The gov agency
               | creating a description of the data in response to the
               | FOIA request would be creating new records. It's silly.
        
               | cyanydeez wrote:
               | Yeah I can see that, seems like masking isn't creating a
               | new record, but obviously that's not how it's
               | interpreted, because you're using the human filling out
               | the form to interpret then return the data. FOIA
               | typically allows for redactions and that seemingly creates
               | new records because they have to redact things and
               | knowing what to redact is providing masked information
               | and that's a new record.
               | 
               | As such, they could claim all FOIAs that require
               | redactions shouldn't be fulfilled because a redacted
               | record is a new record.
        
               | Muromec wrote:
               | They don't do descriptions, as that creates a new document,
               | which is a blind spot of FOI
        
               | Muromec wrote:
               | I once wrote a script that translated SQL queries into
               | proper Ukrainian legalese invoking the equivalent of FOI
               | to query citizenship statistics from the agency. It
               | worked, but they were not very happy when I had to get to
               | them on the phone.
        
               | numpad0 wrote:
               | No offense, but how can you be 1) insisting it's safe to
               | give up the information to you and 2) openly planning to
               | use the information obtained for further exploitation, at
               | the same time? You can't have the cake and eat it too,
               | unless the information available in 2) technically does not
               | depend on 1) but doing it this way would only save them
               | massive time or something.
        
             | berkes wrote:
             | > the only solution is what is discussed at the end:
             | updating the law.
             | 
             | That, and actually penetrating the data system and
             | subsequently "leaking" parts of it. Which is nearly always
             | illegal, but could be considered a form of "Civil
             | Disobedience" especially if done ethically - e.g. removing
             | sensitive data or leaking only aggregates of the data.
             | Either from outside, or by a whistle-blower.
             | 
             | I'm not saying "hack the government!". But I am arguing
             | that the pressure of "getting hacked" is like the pressure
             | of protests, blockades, occupying facilities etc, all of
             | which are civil disobedience, and often simply illegal too. All
             | are tools in the belts of civilians to keep a government in
             | check. Extracting information that a government is not
             | willing to give but that would benefit the governed, should
             | IMO often be considered such a tool as well.
        
         | dataflow wrote:
         | I don't understand the argument that knowing the column names
         | doesn't help an attacker? Especially in a database that doesn't
         | allow wildcards, doesn't it make things much easier if you know
         | you can do '); SELECT col FROM logins, as opposed to having to
         | guess the column name?
         | 
         | And I don't think I disagree with the court on schema vs. file
         | layouts either. It's not the file layout, but it's analogous:
         | it tells you how the "files" (records) are laid out on the
         | "file system" (database tables). For example, denormalization
         | is very analogous to inlining of data in a file record. The
         | notion that filesystems are effectively databases itself is a
         | well known one too. How do you argue they aren't analogous?
        
           | ic4l wrote:
           | I agree with you. Knowing the exact column names can speed up
           | an attack and, in some cases, make it more feasible.
           | 
           | Why don't they just request disclosure of what's actually
           | stored and allow renaming of the columns? It seems odd that
           | knowing the exact column names would be necessary if the goal
           | is simply to understand what data is being stored and its
           | intended purpose.
        
             | lIl-IIIl wrote:
             | I wonder if that would be considered a "new report", which
             | they don't have to provide.
        
               | philipov wrote:
               | They can either have their cake or eat it. If they don't
               | want to obfuscate the column names, they have to provide
               | the data with the original ones.
        
             | thaumasiotes wrote:
             | > Knowing the exact column names can speed up an attack
             | and, in some cases, make it more feasible.
             | 
             | If I'm looking at a database, I like knowing column names,
             | but I like knowing table names more.
        
           | IshKebab wrote:
           | '); SELECT * FROM logins --
        
             | ic4l wrote:
             | `Especially in a database that doesn't allow wildcards`
        
               | IshKebab wrote:
               | Such as...
        
             | dataflow wrote:
             | This fails if either the UI sanitizes wildcards, or if the
             | database prohibits them, or if it produces so much data
             | that you can't ingest it in time, etc.
        
               | wglb wrote:
               | Sanitization almost always fails. This becomes an arms
               | race.
        
               | valenterry wrote:
               | If you do it wrong, yes. Sure, there is no 100% security,
               | but honestly, it's 2025. We already know the techniques
               | for preventing SQL injection of _any_ kind. I wrote about
               | this here: https://valentin.willscher.de/posts/sql-api/
        
               | IshKebab wrote:
               | Right but the case that is being imagined here is a site
               | that perfectly sanitises * but somehow still allows SQL
               | injection? I don't think so.
        
               | dataflow wrote:
               | > Right but the case that is being imagined here is a
               | site that perfectly sanitises * but somehow still allows
               | SQL injection? I don't think so.
               | 
               | It could literally just reject anything with asterisks.
               | 
               | It doesn't even need to do anything perfectly, it just
               | needs to do it enough to produce hurdles for you. Like
               | blowing through the number of attempts you realistically
               | have remaining.
        
               | wglb wrote:
               | The parser isn't shown there, so it isn't clear what
               | would happen with weird input.
               | 
               | Have you had anyone do a penetration test on it?
        
               | IshKebab wrote:
               | There are trivial ways around all of those. `LIMIT 1`,
               | `SELECT .. FROM information_schema...`, etc.
        
               | dataflow wrote:
               | > There are trivial ways around all of those. `LIMIT 1`
               | 
               | LIMIT 1 limits row count. The issue here was columns.
               | Like a giant blob someone might've stored in there.
               | 
               | > `SELECT .. FROM information_schema...`
               | 
               |  _no such table: information_schema.columns_
               | 
               | > etc.
               | 
               | https://news.ycombinator.com/item?id=43181799
        
               | wglb wrote:
               | Injections don't always need '*'. The statements
               | 
               |     1=1
               | 
               | and
               | 
               |     1=0
               | 
               | if injected into a query will give different answers if
               | SQLI exists.
               | 
               | There are MANY other tricks that don't involve '*'.
               | 
               | Besides, consider the number of valid queries done by the
               | application that involve '*'. You are not going to turn
               | that off.
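               | 
               | A minimal sketch of that 1=1 / 1=0 check, assuming an
               | in-memory SQLite database and a deliberately vulnerable
               | lookup. The `tickets` table and `count_tickets_unsafe`
               | are invented for illustration, not taken from any real
               | system:

```python
import sqlite3

# Hypothetical demo table; the name and columns are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, plate TEXT)")
conn.executemany("INSERT INTO tickets (plate) VALUES (?)", [("ABC123",), ("XYZ789",)])


def count_tickets_unsafe(plate: str) -> int:
    """Deliberately vulnerable: builds the query by string concatenation."""
    query = "SELECT COUNT(*) FROM tickets WHERE plate = '" + plate + "'"
    return conn.execute(query).fetchone()[0]


# Neither probe contains '*'; only the boolean tail differs.
true_probe = count_tickets_unsafe("nope' OR 1=1 --")
false_probe = count_tickets_unsafe("nope' OR 1=0 --")

# Different answers for the two probes suggest the input reaches the SQL parser.
injectable = true_probe != false_probe
```

               | Note that the application's own query uses COUNT(*), so
               | a filter that rejects every asterisk would have to be
               | applied to user input only, and the probes never need
               | one.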
        
               | mcv wrote:
               | It also fails if the system was written using
               | parameterized queries. I wouldn't expect a system to be
               | sanitizing anything if it fails to take the most basic step
               | for db access. This whole discussion is only relevant for
               | systems developed by amateurs. SQL injection can only
               | work at all if you use string concatenation to create
               | queries, which you should never do.
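               | 
               | To make the difference concrete, here is a small sketch
               | using Python's standard sqlite3 module. The `logins`
               | table and both helper functions are hypothetical; the
               | point is only the contrast between concatenation and a
               | bound parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (username TEXT, password_hash TEXT)")
conn.execute("INSERT INTO logins VALUES ('alice', 'h1'), ('bob', 'h2')")


def find_user_concat(name: str) -> list:
    # Vulnerable: attacker-controlled text becomes part of the SQL itself.
    return conn.execute(
        "SELECT username FROM logins WHERE username = '" + name + "'"
    ).fetchall()


def find_user_param(name: str) -> list:
    # Parameterized: the driver passes `name` as a bound value, never as SQL.
    return conn.execute(
        "SELECT username FROM logins WHERE username = ?", (name,)
    ).fetchall()


payload = "x' OR '1'='1"
leaked = find_user_concat(payload)  # the OR clause matches every row
safe = find_user_param(payload)     # no user is literally named that
```

               | With the bound parameter the payload is compared as an
               | ordinary string, so there is nothing left to sanitize.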
        
             | qingcharles wrote:
             | Look everyone, it's Little Bobby Tables.
        
           | chaps wrote:
           | The Department of Justice disagrees and voluntarily releases
           | column and table names:
           | https://www.justice.gov/afp/media/1186431/dl?inline=
        
           | tczMUFlmoNk wrote:
           | You can always `SELECT table_name, column_name, data_type
           | FROM information_schema.columns`, which is part of the SQL
           | standard. https://www.postgresql.org/docs/current/infoschema-
           | columns.h...
           | 
           | Plus, generally if you have SQL injection, you have multiple
           | tries. You're not going to be locked out after one shot. And
           | there's only so many combinations of `SELECT
           | {id,userid,user_id,uid} FROM
           | {user,users,login,logins,customer,customers}` before you find
           | something useful.
        
             | zachrip wrote:
             | That's a good point, has anyone hardened a database by
             | locking out users who select columns that don't exist? Or
             | run other dubious queries? This would obviously interrupt
             | production but if someone is running queries on your db
             | it's probably worth it?
        
               | Waterluvian wrote:
               | On the surface that's a very attractive idea.
               | 
               | A sort of "you shouldn't be in here, even if we left the
               | door unlocked."
        
               | closeparen wrote:
               | So if you deploy code before you run the associated db
               | migration, or misspell a column name, you magnify the
               | impact from whichever code paths (& application tier
               | nodes) are running the broken SQL, to your entire
               | production environment.
        
               | zachrip wrote:
               | Yeah it's definitely something that could do more harm
               | than good to a company long term. But I'm sure there are
               | instances where this tradeoff is worth it. They would
               | invest more heavily in runbooks or maybe even ci that
               | runs migrations on deploy. Deleting columns would need to
               | be done on your deploy + 1. Probably no rollback at all.
        
               | snickell wrote:
               | Simple variation to a hard shutoff: immediately page
               | "significant risk a successful sql exploit was found",
               | and then slow down attackers:
               | 
               | If an SQL query requests an unknown table, log the error,
               | but have that query time out instead of responding with
               | an error. Or, even better, the offending query appears to
               | succeed, but returns fake table data, turning it into a
               | honeypot built-in to the DB. This could be done at the
               | application layer, or in the DB.
               | 
               | The goal is to buy an hour for defenders to determine how
               | to respond, or if it's a red herring. There are a variety
               | of ways of doing this without significant user impact.
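               | 
               | A rough application-layer sketch of that idea, assuming
               | SQLite and Python. The `guarded_query` wrapper, the
               | `alerts` list (a stand-in for a real pager), and the
               | decoy rows are all made up for illustration:

```python
import logging
import sqlite3
import time

log = logging.getLogger("sqli-tripwire")
alerts = []  # stand-in for a real paging system (assumption)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

DECOY_ROWS = [(0, "decoy")]  # fake data handed to a suspected attacker


def guarded_query(sql: str, delay_seconds: float = 0.0) -> list:
    """Run sql; on an unknown table or column, alert, stall, and return decoys."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        message = str(exc)
        if message.startswith(("no such table", "no such column")):
            log.warning("possible SQL injection: %s", message)
            alerts.append(message)
            time.sleep(delay_seconds)  # slow the attacker down
            return DECOY_ROWS
        raise


real = guarded_query("SELECT id, name FROM users")
fake = guarded_query("SELECT * FROM logins")
```

               | The legitimate query is untouched, while the probe for a
               | nonexistent table raises an alert and gets plausible
               | nonsense back. A production version would need a
               | non-blocking delay and a way to tell attackers apart
               | from a half-finished migration.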
        
               | count wrote:
               | Zane Lackey (with Dan Kaminsky) gave a talk that
               | discussed doing literally that sort of thing, back in
               | 2013. Zane went on to found Signal Sciences (acquired by
               | Fastly), doing this sort of stuff in the 'WAF' space.
               | 
               | https://youtu.be/jQblKuMuS0Y?t=866 (timestamp is when
               | Zane starts talking about it)
        
               | hunter2_ wrote:
               | I guess the main difference is that a WAF attempts to
               | spot things like injection (unbalanced delimiters, SQL
               | keywords in HTTP payloads where SQL shouldn't exist,
               | etc.) typically without knowledge of the schema, whereas
               | GP is talking about the DBMS spotting queries where
               | queries must exist but disagree with the schema. Might as
               | well do both, I suppose.
        
               | count wrote:
               | That's not what the talk is about - it's using dbms query
               | error logs to spot attackers. Stuff like "table doesn't
               | exist" or "invalid syntax" on your production database
               | can be extremely high signal indications that something
               | is wrong, potentially maliciously so.
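               | 
               | As a sketch of what such log scanning might look like
               | (not the approach from the talk itself): the patterns
               | and the sample log lines below are invented, and real
               | DBMS error text varies by product and version.

```python
import re

# Assumed patterns; real DBMS error text varies by product and version.
SUSPICIOUS = [
    re.compile(r"no such table", re.IGNORECASE),
    re.compile(r"does not exist", re.IGNORECASE),
    re.compile(r"syntax error", re.IGNORECASE),
]


def suspicious_lines(log_lines: list) -> list:
    """Return the log lines that match any high-signal error pattern."""
    return [line for line in log_lines if any(p.search(line) for p in SUSPICIOUS)]


# Invented sample lines for illustration only.
sample_log = [
    "2025-02-25 18:40:01 LOG: connection authorized: user=app",
    '2025-02-25 18:40:07 ERROR: relation "logins" does not exist',
    '2025-02-25 18:40:09 ERROR: syntax error at or near "--"',
]
hits = suspicious_lines(sample_log)
```

               | On a healthy production database these errors should be
               | close to nonexistent, which is what makes even a crude
               | matcher like this useful as an alarm.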
        
               | brk wrote:
               | In the very early 2000s I worked at a company building
               | something along those lines. We could analyze SQL and SMB
               | traffic on the fly and spot anomalous access to
               | tables/columns/files, etc. Dynamic firewalling would have
               | been the next progression if the company didn't have
               | other issues.
        
               | wglb wrote:
               | I once did a security assessment for a product such as
               | what you describe. Among other problems with it, the
               | product itself had SQL injection vulnerabilities.
               | 
               | For another example of what defenders are up against, see
               | https://users.ece.cmu.edu/~adrian/731-sp04/readings/Ptace
               | k-N.... This paper all but caused an upheaval in the WAF
               | industry.
        
               | whstl wrote:
               | WAFs help with this, but at the HTTP level. By putting
               | "information_schema", "sys.tables" in the filters.
               | 
               | Not the real solution, IMO, but WAFs are useful for more
               | than SQLi, and are the kind of tech you can ask money for.
        
               | brohee wrote:
               | If you are mature enough to do that, you're mature enough
               | to net SQL injections in the first place. There shouldn't
               | be that many handwritten queries to review in the first
               | place as most mundane DB access is usually through a
               | framework that handles injection properly...
        
               | zachrip wrote:
               | I disagree, if all it took was maturity then we wouldn't
               | see giant data breaches of the largest companies in the
               | world weekly.
        
             | default-kramer wrote:
             | A good DBA would restrict the account so that it can't
             | access the information schema. It's easy to imagine an
             | environment with a vigilant DBA and less vigilant web
             | developers.
        
               | IanGabes wrote:
               | This makes sense, but the vast majority of tooling,
               | including ORMs, autocomplete SQL IDEs, and even suspect
               | application code, relies on table descriptions and
               | listings provided by the information schema.
        
               | acka wrote:
               | That is why we have development and production
               | environments. The production environment is expected to
               | operate in a potentially hostile space and does not need
               | developer conveniences beyond the ability to generate
               | alerts and produce logs, which should be stored in a safe
               | way, everything else should be locked down as much as
               | possible.
        
               | mrgoldenbrown wrote:
               | My IDE logging into my local dev copy of the DB and my
               | public facing prod application should not be using the
               | same SQL login.
        
             | HDThoreaun wrote:
             | Being able to inject doesn't mean you get the output of a
             | select. The inject can be on non-select statements.
        
             | dataflow wrote:
             | > You can always `SELECT table_name, column_name, data_type
             | FROM information_schema.columns`, which is part of the SQL
             | standard.
             | https://www.postgresql.org/docs/current/infoschema-
             | columns.h.
             | 
             | You can "always" do that? Well I just did that. My database
             | said: _no such table: information_schema.columns_
             | 
             | And what if my database had disabled this capability
             | entirely?
             | 
             | Also, is there anything implying SQL here at all? Can't
             | other databases with injection "capability" have schemas?
             | 
             | > Plus, generally if you have SQL injection, you have
             | multiple tries. You're not going to be locked out after one
             | shot.
             | 
             | No, you can't say it with such certainty at all. It really
             | depends on what else you're triggering in the process of
             | that SQL injection. You could easily be triggering
             | something (like a password reset, a payment transaction...)
             | where you're severely limited in your attempts.
             | 
             | > And there's only so many combinations of `SELECT
             | {id,userid,user_id,uid} FROM
             | {user,users,login,logins,customer,customers}` before you
             | find something useful.
             | 
             | account, accounts, password, passwords, profile, profiles,
             | credential, credentials, auth, auths, authentication,
             | authentications, authentication_info, authentication_infos,
             | authorization, authorizations, passwd, passwds, user_info,
             | user_infos, login_info, login_infos, account_info,
             | account_infos... should I keep going?
             | 
             | And these are just the logins/passwords; what if the
             | information of interest was something else, like parking
             | tickets?
        
               | lyu07282 wrote:
               | > You can "always" do that? Well I just did that. My
               | database said: no such table: information_schema.columns
               | 
               | Don't expect attackers to give up after one try. It
               | depends on the database software, not everyone implements
               | this exact ANSI standard for reflection but every
               | database supports reflection. That's why the first step
               | after finding a SQLi is to fingerprint the database
               | software and go from there.
               | 
               | > And what if my database had disabled this capability
               | entirely?
               | 
               | You can't disable it, lots of software, database
               | features, ORMs and clients rely on reflection. If a
               | client can query a table they also can retrieve metadata
               | about that table.
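               | 
               | For the fingerprinting step, a minimal sketch assuming
               | the target turns out to be SQLite, which has no
               | information_schema but exposes its catalog through
               | sqlite_master and PRAGMA table_info. The
               | `parking_tickets` table and `list_tables` helper are
               | invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parking_tickets (id INTEGER, plate TEXT, fine_cents INTEGER)")


def list_tables(conn: sqlite3.Connection) -> list:
    """Try the standard catalog first, then fall back to SQLite's own."""
    try:
        rows = conn.execute(
            "SELECT table_name FROM information_schema.tables"
        ).fetchall()
    except sqlite3.OperationalError:
        # SQLite has no information_schema; its catalog is sqlite_master.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    return [r[0] for r in rows]


tables = list_tables(conn)
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(parking_tickets)")]
```

               | The failed first query is itself the fingerprint: the
               | error tells the caller which dialect to try next.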
        
               | spoaceman7777 wrote:
               | You can definitely disable it, in a variety of ways, for
               | whatever role, user, etc. you wish to.
        
               | spuz wrote:
               | Absolutely, we have very strict lockdowns on the tables
               | and views available to the users that our application
               | uses. The permissions system in Postgres (for example)
               | is very extensive. We even deny delete and update
               | permissions for most tables so they become append only.
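               | 
               | In Postgres this is done with role grants. As a rough
               | analogue only, here is the same append-only idea
               | sketched with SQLite's authorizer hook from Python; the
               | `audit_log` table and the helper names are
               | hypothetical, and this is not our actual setup:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit_log (id INTEGER PRIMARY KEY, event TEXT)")

APPEND_ONLY = {"audit_log"}  # hypothetical table name


def authorizer(action, arg1, arg2, db_name, source):
    # For SQLITE_DELETE and SQLITE_UPDATE, arg1 is the table name.
    if action in (sqlite3.SQLITE_DELETE, sqlite3.SQLITE_UPDATE) and arg1 in APPEND_ONLY:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


# Installed after the schema exists, so table creation is unaffected.
conn.set_authorizer(authorizer)

conn.execute("INSERT INTO audit_log (event) VALUES ('login')")  # allowed


def is_denied(sql: str) -> bool:
    """Return True if the connection refuses to run the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


delete_denied = is_denied("DELETE FROM audit_log")
update_denied = is_denied("UPDATE audit_log SET event = 'x'")
row_count = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

               | Inserts and reads still work, while an injected DELETE
               | or UPDATE against the protected table is refused before
               | it runs.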
        
               | lyu07282 wrote:
               | Never mind, you are right, it's possible, but I still think
               | it breaks so much stuff that at least I've never seen
               | anybody doing it or recommending it. All kinds of ORMs
               | and migration tools would break for example. But I guess
               | it would be a defense-in-depth strategy.
        
               | jajko wrote:
               | Yeah those tools may break if such a change is introduced
               | suddenly, without testing etc. But that's not what normal
               | reality looks like for most companies; such rules have
               | been around for at least two decades. DBs are very old tech
               | without much change in past 20 years and this is DB
               | security 101.
               | 
               | Not even going into reasonability of ORMs, most of the
               | stuff I've seen or implemented added practically zero
               | value, and added hard-to-debug issues down the line as
               | software evolved. Cargo culting at its best, often done
               | on trivial schemas that could handle either direct SQL or
               | some sql-query-to-object mapping easily.
        
               | fifticon wrote:
               | Your reasoning and motivation is reductio ad absurdum. It
               | does not make sense to base your system security on
               | hiding from the public that your 'Users' table is called
               | 'Users'. If you are vulnerable to this attack, the guilt
               | rests on your deplorable application code, not whether or
               | not your schema table names are known. If we should
               | follow your logic, we would have to name our Users table
               | U_ZER_CLEVER_S because naming it something people could
               | guess would be a vulnerability.
        
               | fifticon wrote:
               | There is one further problem with this entire sub-
               | discussion: There are two mitigation strategies
               | discussed:
               | 
               | - A: guaranteed SQL-injection-proof (SQL injection
               |   impossible).
               | - B: Having non-obvious table-names and
               |   'secure-defaults' (e.g. INFORMATION_SCHEMA disabled).
               | 
               | So, the original commenter says, he wants to _hide the
               | schema_, so that B can protect him in case of A. Well,
               | failure of A is Amateur Hour. If you fail on A, I highly
               | doubt you would have delivered correctly on B. To write
               | it out in plain text: If you have set up and manage an
               | application with SQL injection errors, I have a hard time
               | seeing you still taking care to disable/enable obscure
               | security defaults, or take care to avoid obvious and
               | trivial table names.
               | 
               | Just to put icing on the cake: As soon as you have an SQL
               | injection attack, a simple select * from randomTable or
               | DESC randomTable would give you the table COLUMNS, so it
               | utterly makes no sense to want to hide those column names
               | - you have already lost them! (in the case you are
               | arguing you need their protection in). ..Unless you argue
               | that the guy making sql injection applications ALSO has
               | set up a secure default to disallow select *..
               | 
               | In my experience, SQL injection is evidence of the
               | sloppiest and most immature kind of work; it was bad in
               | 2003, and presumably still is.
        
             | aftbit wrote:
             | Ah so what you're saying is that we ought to rename our
             | logins table to "duckwords" because nobody will ever guess
             | that? Also we should probably store passwords in plaintext
             | but name the column "entercod3" because nobody will think
             | of that. Oh and we should use printf with %s to build our
             | queries right?
        
           | dmurray wrote:
           | And this part seems self-defeating:
           | 
           | > Attackers like me use SQL injection attacks to recover SQL
           | schemas. The schema is the product of an attack, not one of
           | its predicates.
           | 
           | If it's the product of an attack, but not the end goal,
           | surely it's of value to the attacker?
           | 
           | It seems clear to me that the statute does, as worded, in
           | principle allow the city not to disclose the database schema
           | - it would compromise the security of the system, or at the
           | very least, it would for some systems, so each request needs
           | to be litigated individually.
           | 
           | The proposed amendment sounds like a good way to fix this -
           | is it likely that will pass?
        
             | lmm wrote:
             | > If it's the product of an attack, but not the end goal,
             | surely it's of value to the attacker?
             | 
             | Well sure, but it doesn't help them attack. That's like
             | arguing that since the bank robber wants dollar bills,
             | dollar bills must be a useful tool for breaking into bank
             | vaults.
        
               | dmurray wrote:
               | If both sides agreed to the analogy of giving the bank
               | robber the blueprints to the vault, I think any lay judge
               | would agree that endangers the bank's security.
        
               | lmm wrote:
               | I'd say it's more like knowing the layout of the drawers
               | inside the cage. If a robber is inside the cage, they've
               | already won. And if an auditor is checking the bank has
               | what it says it does, they've got legitimate grounds to
               | ask which money is in which drawer, and "no, it's a
               | security risk" is not a good answer.
        
             | tptacek wrote:
             | Lots of things are "of value". That's not the bar the
             | statute sets. To the extent something isn't _per se_
             | exempted by the statute (as the outcome of the case
             | established schemas are), the burden is on the public body
             | to demonstrate that disclosure would jeopardize the
             | security of the system.
        
               | hunter2_ wrote:
               | It still seems like a massively gray area: despite the
               | distinction between "would jeopardize" and "could
               | jeopardize" as explained by TFA, the definition of
               | "jeopardize" includes "danger" which means "could lead to
               | harm" not "would lead to harm" at which point it hardly
               | matters whether a thing "could endanger" or "would
               | endanger" the security of the system.
        
               | tptacek wrote:
               | "Would" versus "could" has nothing to do with why your
               | analysis doesn't hold. If something doesn't enable people
               | to attack a system, but is merely one of the valuable
               | things you could get from that system, it does not
               | jeopardize that system under Illinois law. The standard
               | of proof for the jeopardy doesn't enter into it, because
               | no claim of jeopardy has been made.
               | 
               | Again: this part of the case is settled. We didn't lose
               | at the State Supreme Court because the court was worried
               | there was jeopardy, but because they re-read the statute
               | as _per se_ exempting schemas as "file layouts".
        
               | dmurray wrote:
               | > this part of the case is settled.
               | 
               | Maybe for this case, but it sounds like enough hinges on
               | the details of the system that in another database, a
               | court could uphold that there "would" be jeopardy instead
               | of there "could" be. So you won on the more fragile part
               | of the ruling.
               | 
               | On the other hand, interpreting the law as exempting
               | database schemas is something that can be applied to any
               | computer system, and it presumably sets a binding
               | precedent (I'm not familiar with Illinois jurisprudence,
               | but that's how I'd expect something called the State
               | Supreme Court to work) so losing on that point is worse
               | for future cases.
        
               | tptacek wrote:
               | Losing on what point? Everybody agrees it is bad that
               | schemas are _per se_ exempt from FOIA. On the security
               | concerns of releasing schemas, we won in basically every
               | court.
        
               | thaumasiotes wrote:
               | > If something doesn't enable people to attack a system,
               | but is merely one of the valuable things you could get
               | from that system, it does not jeopardize that system
               | under Illinois law.
               | 
               | The problem I have with this is that the schema isn't
               | something an attacker recovers for its own sake. It's
               | something the attacker recovers in order to further their
               | attack. This necessarily means that it does enable people
               | to attack the system. That's the only value an attacker
               | sees in it.
               | 
               | > Again: this part of the case is settled. We didn't lose
               | at the State Supreme Court because the court was worried
               | there was jeopardy
               | 
               | Doesn't matter to the discussion; the court, Supreme or
               | trial, can be wrong as easily as it can be right.
        
               | tptacek wrote:
               | I don't understand your argument. If I have a SQLI, I
               | can, as you acknowledge, fetch the schema. So what does
               | it matter if the schema is published a priori? All that
               | matters is whether I have SQLI.
        
               | thaumasiotes wrote:
               | No, as other comments in the thread have pointed out, you
               | can easily have an SQLI that doesn't send information
               | back to you. You may find value in _changing_ what's in
               | the database even if you can't read from it.
               | 
               | If you _do_ have the ability to retrieve information,
               | then one of the first things you'll do is retrieve the
               | schema.
               | 
               | And the _reason_ you'll retrieve the schema, if you can,
               | is that it facilitates the attacks you actually want to
               | make. It has no value to you other than enabling your
               | attacks. This observation seems sufficient to answer the
               | question "does knowing the schema enable attacks?".
        
               | tptacek wrote:
               | There is a whole sub-field of software security dedicated
               | to retrieving information from SQL injections that don't
               | directly return results. This is not a plausible
               | objection.
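               | 
               | A toy illustration of the simplest such technique,
               | boolean-based blind extraction, run against an
               | in-memory SQLite database. The vulnerable
               | `plate_exists_unsafe` function and the `citations`
               | table are invented; the only thing the "endpoint" ever
               | returns is yes or no:

```python
import sqlite3
import string

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citations (id INTEGER, plate TEXT)")  # invented schema
conn.execute("INSERT INTO citations VALUES (1, 'ABC123')")


def plate_exists_unsafe(plate: str) -> bool:
    """Vulnerable endpoint stand-in: it only ever answers yes or no."""
    query = "SELECT 1 FROM citations WHERE plate = '" + plate + "'"
    return conn.execute(query).fetchone() is not None


def recover_first_table_name(max_len: int = 32) -> str:
    """Rebuild a table name one character at a time from yes/no answers."""
    alphabet = string.ascii_lowercase + string.digits + "_"
    name = ""
    for position in range(1, max_len + 1):
        for ch in alphabet:
            # The application's own trailing quote closes the final literal.
            probe = (
                "ABC123' AND substr((SELECT name FROM sqlite_master "
                "WHERE type='table' LIMIT 1), " + str(position) + ", 1) = '" + ch
            )
            if plate_exists_unsafe(probe):
                name += ch
                break
        else:
            return name  # no character matched: end of the name
    return name


recovered = recover_first_table_name()
```

               | No query result is ever shown to the caller, yet the
               | table name comes back in a few hundred requests. Timing
               | and error-based variants work on the same principle.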
        
               | underdeserver wrote:
               | How is it that this wording stuff isn't already decided
               | globally? I mean, the concept of a dangling modifier has
               | existed for centuries, do the courts really decide this
               | kind of thing on a case-by-case basis by random dice
               | roll?
        
               | Paracompact wrote:
               | Whereas math, science, and engineering use language as a
               | vehicle for attaining truth, the legal profession too
               | often regards it _as_ truth.
               | 
               | The greatest legal scholars of the state of Illinois
               | believe there is more decorum in querying Merriam-Webster
               | than there is in reading tea leaves or consulting a Ouija
               | board, but they are wrong. All too often, jurists make
               | decisions based on unconscious accidents of wording by
               | their predecessors, then compound it with their own
               | fallible powers of interpretation and deduction, further
               | cementing their wrongness as "precedent." Instead of
               | addressing this core ambiguity of the FOIA exemption, or
               | attempting to appeal this nonsense interpretation of an
               | undefined term, or introduce better linguistic standards
               | to the legal profession at large, the path of least
               | resistance for victims of litigious violence is to add
               | _more_ complexity in the form of endless amendments. This
               | is what Matt and friends must now pin their hopes on.
               | 
               | Little wonder how one can spend a lifetime specializing
               | in the (martial) art of litigation.
        
           | AdamJacobMuller wrote:
           | > And I don't think I disagree with the court on schema vs.
           | file layouts either.
           | 
           | I disagree that the law _should_ prohibit disclosing "file
           | layouts" but it's pretty clear that the law does block that,
           | and I fundamentally agree with you that schemas are directly
           | analogous to file layouts and thus restricted.
        
             | tptacek wrote:
             | A SQL schema literally does not indicate the locations of
             | data inside of a file. In fact, the whole reason schemas
             | exist is to decouple the relationships between table rows
             | and the pages and indexes that store that data. We had
             | relational databases before SQL, and there are non-SQL
             | relational (and non-relational) databases today, but you
             | program them, at the query level, with code that is aware
             | of what tables live where.
             | 
             | A schema is the _opposite_ of a file layout. A schema is to
             | a file layout what a Google search is to an IP address.
        
               | HDThoreaun wrote:
               | I don't think "file layout" has to mean the exact location
               | of every byte. An abstract file layout is still a file
               | layout.
        
               | tptacek wrote:
               | It is in literally no sense a layout; the whole point of
               | a schema is that it doesn't tie you down to a layout. SQL
               | schemas make sense _even in the absence of files!_
        
               | maratc wrote:
               | You suggest that we interpret "file formats" as exactly
               | this -- no more, no less. This approach is also called
               | "textualism". The other option is to interpret "file
               | formats" in the context of the law that includes these
               | words. Or: what exactly did the lawmakers have in mind
               | when they said that (a) government needs to provide
               | information; (b) except for several cases, of which one
               | is (c) "file formats". What kind of information did they
               | think it was ok for the government _not_ to provide?
               | 
               | I agree with the Court's argument that "the information
               | about how _the actual information_ is stored and
               | connected one piece to another" is what the lawmakers
               | meant in this case.
               | 
               | - If the actual information is stored in the files, the
               | government _does not_ need to disclose how these files
               | are organized ("file formats").
               | 
               | - If the actual information is stored in the database,
               | the government _does not_ need to disclose how the
               | database is organized (database schema).
               | 
               | - If the actual information is stored in the block memory
               | -- with structs and pointers -- the government _does not_
               | need to disclose the structs and the pointers.
               | 
               | The "textualist" opponent would of course argue, as OP
               | did, that the second and the third example aren't
               | excepted by clause (c) because "when there is no file,
               | there could be no file format". This however is missing
               | the point (in my opinion), as it doesn't see the forest
               | for the trees.
        
               | Xelynega wrote:
               | How can you literally interpret the two words "file
               | layout" without it pertaining to the layout of a file?
        
               | maratc wrote:
               | We can successfully interpret the two words "guinea pig"
               | without it pertaining to either pigs or things coming
               | from Guinea, so I'm sure this is also possible.
        
               | numpad0 wrote:
               | DBs can be files on disk though? Besides they're a bit
               | like easy hand rolling powder mix for filesystems.
               | Filesystem entries have properties like filenames and
               | inode numbers and file contents. Databases have columns
               | like emails and membership IDs and their favorite
               | cookies. I don't think "file layout" is an absurd
               | framing.
        
               | dataflow wrote:
               | Let me put this differently.
               | 
               | If you tell me that you have a closet for your jackets
               | and another closet for your shirts, you're telling me how
               | clothes are laid out in your wardrobe. Specifically,
               | you're telling me that you're laying those out
               | separately, and able to deal with them independently,
               | with little interference between the two. It's not the
               | _entirety_ of the layout information, but it sure is some
               | of it.
               | 
               | If you tell me that you have a column for your first
               | names and another column for your last names, you're
               | telling me how names are laid out in your database('s
               | files). Specifically, you're telling me that you're
               | laying those out separately, and able to deal with them
               | independently, with little interference between the two.
               | It's not the _entirety_ of the layout information, but it
               | sure is some of it.
               | 
               | Sure -- in theory, you could be actually throwing
               | everything together into a dumpster, then paying enough
               | people to search it all in parallel when you want to
               | retrieve that red jacket. If you're _actually_ doing
               | that, maybe you could legitimately claim that you haven't
               | divulged anything about your closet's layout by
               | telling me that shirts and jackets are separate. But
               | chances are pretty darn good you're not actually doing
               | that (and I would know this for a fact if I already
               | somehow knew you were actually using closets built by Joe
               | down the street), and thus actually are exposing layout
               | information by telling me that you're storing them
               | separately. One security implication of which is that,
               | the moment that I get a glimpse of your closet and notice
               | that it contains a shirt, _I know it's not the one with
               | the jackets_, and I can skip it when trying to steal that
               | expensive red jacket.
        
               | tptacek wrote:
               | It's either a file layout or it is not a file layout. If
               | you write an affidavit saying it's "sort of like a file
               | layout", the conclusion will be that it is not one. Now,
               | the Illinois Supreme Court found that it was a file
               | layout (wrongly). But they didn't use any of this kind of
               | message board logic to do it; they pulled up a definition
               | for "file layout" from a technical dictionary (which,
               | ironically, pretty clearly established, even more than
               | this thread does, that schemas aren't file layouts), and
               | then they pulled up a definition of "schema" from
               | Merriam-Webster, and the definition of "schema" was so
               | abstract it could have matched anything.
               | 
               | If anybody on the Illinois Supreme Court had known what a
               | schema actually was, we'd have won the case. Further, if
               | the definition of "file layout" had been more material to
               | the Chancery case, it would have been in the trial record
               | that it wasn't one.
        
               | dataflow wrote:
               | > Now, the Illinois Supreme Court found that it was a
               | file layout (wrongly). But they didn't use any of this
               | kind of message board logic to do it; they pulled up a
               | definition for "file layout" from a technical dictionary
               | (which, ironically, pretty clearly established, even more
               | than this thread does, that schemas aren't file layouts)
               | 
               | "Wrongly" was exactly what I just spent an hour writing a
               | long comment disputing, with a detailed explanation.
               | Specifically, with a real-world analogy between "a
               | description of the arrangement of the data in a file" and
               | "a description of the arrangement of the clothes in your
               | closet."
        
               | fc417fc802 wrote:
               | If I understand correctly, you're saying that you expect
               | items in a column to tend to cluster near one another on
               | disk. Notably though that doesn't give you any sort of
               | relative or absolute offset. Neither does it have
               | anything to say about, for example, blocks of different
               | types which might be interleaved. Or compression. Or
               | indexes. Or copy on write related garbage collection. Or
               | journaling. Or any number of other things.
               | 
               | Now if you wanted to argue that a schema serves the same
               | purpose as a file layout, ie that it's how a programmer
               | interfaces with the data, and that it impacts workload
               | performance, that would be fair enough. And given that
               | laws are all about intent perhaps that would be relevant.
               | (Or perhaps not. I didn't read about the case yet.)
               | 
               | But I think it's fairly reasonable to say that in typical
               | usage an SQL schema is decidedly not a file layout in a
               | literal sense.
        
               | dataflow wrote:
               | > If I understand correctly, you're saying that you
               | expect items in a column to tend to cluster near one
               | another on disk.
               | 
               | That's one thing I'm saying would be sufficient to
               | consider this file layout, yes. I'm not saying it's
               | necessary. Databases can obviously be row-oriented too.
               | Knowing that they _don't_ cluster would also be layout
               | information. As could any number of other things.
               | 
               | > Notably though that doesn't give you any sort of
               | relative or absolute offset. Neither does it have
               | anything to say about, for example, blocks of different
               | types which might be interleaved. Or compression. Or
               | indexes. Or copy on write related garbage collection. Or
               | journaling. Or any number of other things.
               | 
               | It doesn't have to include offsets or any of those other
               | things. File layout information could be as simple as
               | "data should be aligned to a page boundary for
               | performance" or "this field must reserve space for up to
               | 16 characters" or even "data from different records
               | should not be stored in an overlapping manner, to allow
               | fast erasure"... I could go on. And notice the wardrobe
               | layout example doesn't have offsets either, but the
               | decision to separate jackets from shirts is absolutely
               | one about layout nonetheless.
               | 
               | > But I think it's fairly reasonable to say that in
               | typical usage an SQL schema is decidedly not a file
               | layout in a literal sense.
               | 
               | It is not _complete_ file layout information. But it
               | certainly can be _part_ of the file layout information.
               | 
               | Imagine you had a table with columns name1 VARCHAR(64)
               | and name2 VARCHAR(64) in that order. Now imagine you
               | modified a couple of bytes on the disk, such that you
               | swap the 1 and the 2. You can imagine a database where
               | that would be sufficient to confuse it into thinking the
               | two columns had swapped contents, right? Could you really
               | claim the schema didn't contain any file layout
               | information in that scenario, when it certainly affected
               | which bytes are interpreted as belonging to which
               | columns?
        
               | fc417fc802 wrote:
               | Note that "some information related to the file layout"
               | or "some information that has an impact on the file
               | layout" is not "the file layout" in a literal sense. Thus
               | it seems to me to follow that the answer to the question
               | "is this a file layout" should be no.
               | 
               | Symbolically it isn't [ schema -> file layout ] it's [
               | schema, engine version -> file layout ]. Even if you had
               | that additional information, neither item by itself nor
               | even the pair together would be correctly considered a
               | file layout. If I have a function f( foo, bar ) -> baz
               | neither a foo nor a bar is a baz. I can fairly trivially
               | fix a sandwich out of bread, peanut butter, and jam; in
               | no way does that imply that the three ingredients sitting
               | next to each other on the counter are a sandwich.
               | 
               | For that matter, even the [ schema -> file layout ] case
               | isn't _technically_ a file layout any more than a json
               | blob is an xml blob. Being trivially translatable doesn't
               | change the definition.
               | 
               | Compare that with the question (also commonly asked by
               | courts) "is thing equivalent in intent (or use, or ...)
               | to other thing" in which case the answer might feasibly
               | be yes.
               | 
               | > Could you really claim the schema didn't contain any
               | file layout information in that scenario, when it
               | certainly affected which bytes are interpreted as
               | belonging to which columns?
               | 
               | In that example you have made an educated guess about the
               | file layout and then taken advantage of that (guessed)
               | information. "You can imagine a database" tells you
               | everything you need to know here, namely that this is
               | entirely dependent on the implementation. So yes, I would
               | claim that the schema did not on its own contain any file
               | layout information though in conjunction with knowledge
               | of the implementation it could be used to derive such.
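
A minimal sketch of the [ schema, engine -> file layout ] point above, assuming a hypothetical two-column schema and two made-up storage "engines" (neither models a real database). The same schema and the same rows produce different bytes on disk depending on which engine writes them:

```python
import struct

# Hypothetical schema: (name1 VARCHAR(8), name2 VARCHAR(8)), with two rows.
rows = [("Ada", "Lovelace"), ("Alan", "Turing")]

def row_layout(rows):
    # Made-up "engine A": row-oriented, every field padded to a fixed 8 bytes.
    return b"".join(struct.pack("8s8s", a.encode(), b.encode()) for a, b in rows)

def column_layout(rows):
    # Made-up "engine B": column-oriented, each value prefixed by a 1-byte length.
    out = b""
    for column in zip(*rows):
        for value in column:
            data = value.encode()
            out += struct.pack("B", len(data)) + data
    return out

layout_a = row_layout(rows)
layout_b = column_layout(rows)
```

Both functions take the same schema and data, yet `layout_a` and `layout_b` differ in length and byte order. Whether that makes the schema "part of" the file layout or merely an input to it is exactly the disagreement in this subthread; the sketch only illustrates why the engine matters.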
        
               | dataflow wrote:
               | > I can fairly trivially fix a sandwich out of bread,
               | peanut butter, and jam; in no way does that imply that
               | the three ingredients sitting next to each other on the
               | counter are a sandwich.
               | 
               | What is "sandwich" in this analogy? Nobody is claiming
               | the schema is a "database", or a "table". I was saying
               | it's one component of the file layout.
               | 
               | Using your own analogy: if you know you put the jam near
               | the peanut butter, you know part of the ingredient
               | layout. You can't say "it's not ingredient layout if you
               | haven't told me where the bread is."
        
             | WatchDog wrote:
             | It seems like an unnecessarily ambiguous term.
             | 
             | Without additional context, I would interpret the term
             | "file layout" to mean the file and directory structure of
             | an application.
             | 
             | Such an application could potentially store data as plain
             | files, the names of those files may contain personal or
             | sensitive information.
        
               | thaumasiotes wrote:
               | > Without additional context, I would interpret the term
               | "file layout" to mean the file and directory structure of
               | an application.
               | 
               | I would interpret it to mean a description of what the
               | file contains and where. This is information you need if
               | you have a mysterious file and you want to parse it. It's
               | also information you need if you have some data and you
               | want to create a readable file that expresses it. But for
               | the concept to apply to a database schema, (a) the
               | database would have to be a file, and (b) the schema
               | would have to specify where the information in the
               | database is stored. That's difficult to do, since the
               | schema has no knowledge of how much information there is
               | in the database or how it might be written down.
        
               | AdamJacobMuller wrote:
               | > It seems like an unnecessarily ambiguous term.
               | 
               | Agree, and, I don't even understand why it's in there in
               | the first place (it should just not be) but that's a job
               | for the legislature to resolve, not the courts.
        
             | dataflow wrote:
             | >> And I don't think I disagree with the court on schema
             | vs. file layouts either.
             | 
             | > I disagree that the law _should_ prohibit disclosing
             | "file layouts"
             | 
             | Note, the court wasn't ruling what the law should say, only
             | what the law says. At least that's my understanding of it.
             | I certainly wasn't opining on what the law should say.
        
               | AdamJacobMuller wrote:
               | Understood. I mention that distinction only because I
               | find many people (not you) who say that "X law doesn't
               | apply because if it did, it would be bad" vs directing
               | your ire at the actual laws, which are poorly written and
               | the legislators who are negligent in fixing those laws.
               | 
               | Courts should decide based on the law, not based on what
               | is "good".
        
           | mcv wrote:
           | Yeah, I think it's still useful info for an attacker. But
           | only if the system was actually developed by amateurs who
           | never heard of parameterized queries.
           | 
           | I find it a bit bizarre that the city uses "our system was
           | developed with no consideration for security" as a valid
           | defense.
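
For readers unfamiliar with the term, here is a small sketch of what "parameterized queries" means, using Python's built-in `sqlite3` module. The `tickets` table and its columns are hypothetical and have nothing to do with the city's actual system:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (plate TEXT, fine INTEGER)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [("ABC123", 50), ("XYZ789", 75)],
)

def lookup_unsafe(plate):
    # Vulnerable: user input is spliced directly into the SQL text.
    query = "SELECT plate, fine FROM tickets WHERE plate = '" + plate + "'"
    return conn.execute(query).fetchall()

def lookup_safe(plate):
    # Parameterized: the value travels separately from the SQL text,
    # so it is only ever treated as data.
    return conn.execute(
        "SELECT plate, fine FROM tickets WHERE plate = ?", (plate,)
    ).fetchall()

payload = "nope' OR '1'='1"
unsafe_rows = lookup_unsafe(payload)  # injected OR clause matches every row
safe_rows = lookup_safe(payload)      # no plate equals the literal payload
```

With the parameterized version, knowing the table and column names gives the attacker nothing to inject into, which is the point being made above.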
        
           | gwd wrote:
           | > I don't understand the argument that knowing the column
           | names doesn't help an attacker?
           | 
           | So Kevin Mitnick supposedly did most of his hacking using
           | "social engineering". He'd call up some person, pretend to be
           | in some other department within their organization, and ask
           | them for some specific bit of information he needed to
           | further his attack (or ask them to change some specific thing
           | that would allow him to further his attack).
           | 
           | Would knowing the structure of Illinois governmental
           | organizations help someone perform social engineering attacks
           | against them? Yes, absolutely.
           | 
           | Should Illinois therefore keep the internal structures of
           | their organizations -- the department names and the officials
           | who run them -- secret? No, absolutely not.
           | 
           | First of all, if an attacker doesn't know them, they'll just
           | use _other_ social engineering attacks to figure them out;
           | i.e., hiding the structure doesn't stop social engineering
           | attacks, it just slows them down. Secondly, the value to the
           | _public_ of being able to navigate governmental structures
           | far outweighs the cost of potential attacks.
           | 
           | This seems to me to be a direct analog: The "organizational
           | structure" is the "database schema", and the "willingness to
           | help a random person on the phone who seems to know what
           | they're talking about" is the "SQL injection vulnerability".
           | If an attacker knows the schema, their job is faster; but if
           | they don't know the schema, they'll just use attacks to
           | figure out the schema; so keeping it private doesn't stop an
           | attack, only slow it down. And the benefit to the public of
           | being able to issue FOIA requests far outweighs the cost of
           | potential attacks.
        
           | econ wrote:
           | If you have an injection friendly application then that is
           | the security problem.
           | 
           | Say someone hacks the db, is the problem easy to guess table
           | names? The column should never have been called "passwords"?
           | 
           | Perhaps 30 years ago that would sound good.
           | 
           | Obscurity should hardly ever be a line of defense. If it is
           | the only defense the problem isn't that it wasn't obscure
           | enough.
           | 
           | Edit:
           | 
           | I'll do you one better. If you so much as suggest that
           | obscurity is good security you actually openly invite people
           | to fool around with your applications. The odds that holes will
           | be found are much better than elsewhere.
        
             | HDThoreaun wrote:
             | What do you do when you know you've got a pile of poorly
             | written insecure software and no money to improve it?
        
               | econ wrote:
               | I probably delete everything and pretend it never
               | happened. It depends ofc on the worst-case scenario. What
               | can I do/afford to deal with the greatest risk? I might
               | use it on a machine without internet.
        
           | fsckboy wrote:
           | > _It's not the file layout, but it's analogous...How do you
           | argue they aren't analogous?_
           | 
           | laws don't get to be analogous
           | 
           | foia request: "I'd like the report the committee prepared
           | about the costs for the new bridge"
           | 
           | response: "denied. the report contains costs laid out in
           | tables with headings, which while not being schemas are
           | _analogous_, with schemas not being files but being
           | _analogous_ "
        
         | ra wrote:
         | They can produce a report using English-language labels instead
         | of the db column names. Their argument isn't fact; it's
         | vexatious obstinacy.
        
           | hennell wrote:
           | As mentioned in the post FOIA tends to only include existing
           | records/information, it doesn't extend to producing new work.
           | So producing a new report would be considered too much work.
           | (But fighting a lawsuit to not reveal the schema is fine.)
        
         | gwerbret wrote:
         | Very interesting case! Just one question: to what extent do
         | changes in database schemata fall under FOIA in Illinois? That
         | is, if they should change the database schema to conceal
         | whatever it is they're fighting tooth and nail to hide, are
         | they compelled to retain detailed information about that
         | change? Or can they later present you (should the legislation
         | pass) with a cleaned-up, nothing-to-see-here updated version?
        
         | qingcharles wrote:
         | Matt, you do the Lord's work.
         | 
         | Bear in mind that Matt technically lost this, even with the
         | backing of some of the absolute best civil rights lawyers in
         | the country, Loevy and Loevy, fighting on his behalf. This
         | shows you the absurd difficulty in fighting city hall,
         | especially if you're crazy enough to do it without
         | representation.
         | 
         | The one thing working in our favor is what is proposed in TFA:
         | change the law. Once the state Supreme Court has ruled you're
         | hosed unless you can get an amendment. Illinois has a very
         | strong history of amending its FOIA statute, although a
         | proportion of those changes are to further protect information
         | from disclosure, not always on the side of sunshine.
         | 
         | Another change that needs to happen is strong punishment for
         | bodies who lose these fights. In Illinois this is limited to a
         | "$5000 civil penalty" against the body. What is a civil
         | penalty? It's vaguely defined. They used to throw the money to
         | the plaintiff, but in the later cases I fought they simply
         | awarded the money to the county. As one State's Attorney said
         | to me "I don't care if I lose every case, I just write a check
         | out to myself."
         | 
         | (one final note: be careful what you wish for when you
         | litigate, you can end up with an appellate decision like this
         | that solidifies in law the exact thing you were fighting. It's
         | nobody's fault, but it happens. I ended up with one absurd
         | decision that removed prisoners' rights rather than enhanced
         | them.)
        
           | tptacek wrote:
           | A losing public body is also generally on the hook for
           | attorney's fees, which can be considerable. But the general
           | problem here is that the public bodies are all spending
           | someone else's money, so the real deterrent you have is how
           | much of their time you can credibly threaten to eat up with
           | legal actions.
        
             | qingcharles wrote:
             | That's true, as long as you are represented. I knew one
             | lawyer in Illinois who would sit in FOIA court and take all
             | the non-represented persons aside and offer to take their
             | cases and split the attorney fees 50/50. I believe it isn't
             | strictly above-board, but it is a solution to a problem.
             | 
             | People don't like being put under oath, so you can somewhat
             | temper a public body's future refusals by deposing them or
             | sticking as many of them as possible on the stand. Especially
             | with depositions, if you aren't represented then you can't be
             | given any attorney discipline for asking completely
             | outrageous questions to force the deponent to admit crimes
             | etc under oath.
        
               | tptacek wrote:
               | I went up against my muni over their refusal to release
               | their police General Orders (which seems real dumb in
               | retrospect; we got the General Orders from most of
               | Chicagoland with no protest+). I reached out to Matt
               | Topic, who offered to sue for free, or send a nastygram
               | for a billable hour.
               | 
               | I ended up doing the latter, because I gotta work in this
               | town, but one consequence of fee recovery is that it's
               | much easier to get representation for a FOIA suit.
               | 
               | + https://github.com/jjarmoc/chicago-area-general-orders/
        
               | fsckboy wrote:
               | so the attorney gets half of what the attorney gets?
               | Zeno's Paradox.
        
             | avar wrote:
             | > so the real deterrent you have is how much of their time
             | > you can credibly threaten to eat up with legal actions.
             | 
             | Being threatened with billable hours? They must be
             | terrified.
        
         | monksy wrote:
         | What I want to know: How much Malort does the city expense a
         | year?
        
         | rnewme wrote:
         | The footer links to dead x account.
        
         | waitwhatwhoa wrote:
         | When can we submit witness slips? Is there a mailing list for
         | updates we can join? Good luck!
        
         | mcnichol wrote:
         | I don't want to take away any steam from your sails but giving
         | bad information in regards to case law shouldn't be taken
         | lightly. Your "expert witness" did you a disservice.
         | 
         | Schema is very much a critical field in terms of AuthZ
         | privileges. Just knowing the structure is not far off from
         | knowing the max entropy a password may hold. In regards to
         | InfoSec, table structure is the recon phase which limits effort
         | and minimizes time. Someone with that much time in security
         | knows DBs will be hacked, not if but when. Time is an
         | incredibly important tool which is why we have expirations on
         | so many authN and authZ windows of attack.
         | 
         | I'm glad that you are challenging them but I believe a credible
         | engineer would have made mincemeat of your expert and hurt the
         | rest of us who want to see you successful.
         | 
         | It's possible rewriting certain statutes can help us but there
         | is no company worth its salt that would share DB schema.
        
           | thayne wrote:
           | > Just knowing the structure is not far off from knowing the
           | max entropy a password may hold
           | 
           | Not if the password is hashed, as it should be. Unless the
           | schema somehow indicates that it uses a hash algorithm such
           | as bcrypt that has a maximum password length. And even then,
           | if they pre-hash the password, the password itself could have
           | more entropy than that. And if there is a maximum password
           | length, then you can probably figure that out via other
           | means, like it being listed in the requirements when you set
           | your password. It does tell you the size of the hash of the
           | password, but if the maximum entropy is sufficiently high, as
           | it should be, then it doesn't really matter; it would still
           | be impractical to brute force.
           | 
           | > there is no company worth its salt that would share DB
           | schema
           | 
           | So you are saying that every company with a self-hosted or
           | open source product that uses a database isn't worth their
           | salt? If your DB is running on a customer's infrastructure,
           | that customer will by necessity have access to the schema.
           | And likewise if the source code for a product is publicly
           | available it is trivial to determine the schema.
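
A rough illustration of the hashing point above, assuming PBKDF2-HMAC-SHA256 only because it ships with Python's standard library; it stands in for whatever algorithm a real system might use. The stored value has the same size no matter how long the password is, so the column width in a schema says nothing about password length:

```python
import hashlib
import os

def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 from the standard library; the iteration count
    # here is illustrative, not a recommendation.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

salt = os.urandom(16)
short_hash = hash_password("hunter2", salt)
long_hash = hash_password("correct horse battery staple " * 10, salt)
```

Both `short_hash` and `long_hash` are 32 bytes, the SHA-256 digest size, even though one password is 7 characters and the other is 290.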
        
       | probably_wrong wrote:
       | Random thought: someone should drive to Chicago, get a parking
       | ticket, and then make a FOIA request for all of their information
       | contained in that database.
       | 
       | It won't be the whole database schema, but it would be a start.
        
         | chaps wrote:
         | Short answer -- already been done.
         | 
         | This (spoiler) visualization's going into my eventual post
         | about the lawsuit: https://observablehq.com/d/026992341cc47ff0
        
       | lcnPylGDnU4H9OF wrote:
       | > where the only way to get at the underlying data is to FOIA a
       | database query
       | 
       | Was this ever attempted?
       | 
       |     SELECT * FROM `information_schema`.`tables`;
        
         | chaps wrote:
         | Yep, that was done in the FOIA request related to this lawsuit:
         | 
         |     select utc.column_name as colname, uo.object_name as tablename,
         |            utc.data_type as type
         |     from user_objects uo
         |     join user_tab_columns utc on uo.object_name = utc.table_name
         |     where uo.object_type = 'TABLE'
         | 
         | 
         | https://www.muckrock.com/foi/chicago-169/canvas-database-sch...
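
The query above targets Oracle's data dictionary views. As a rough analogue, and not the request that was actually filed, the sketch below does the same kind of listing against an in-memory SQLite database using `sqlite_master` and `PRAGMA table_info`. The `tickets` table is hypothetical; the real CANVAS schema is not known here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE tickets (plate TEXT, fine INTEGER)")
conn.execute("INSERT INTO tickets VALUES ('ABC123', 50)")

def list_columns(conn):
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    result = []
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        # The table name comes from sqlite_master, not from user input.
        pragma = f'PRAGMA table_info("{table}")'
        for _cid, col, col_type, *_rest in conn.execute(pragma):
            result.append((table, col, col_type))
    return result

schema = list_columns(conn)
```

The result contains table names, column names, and types, and none of the row data, which is the distinction the request relied on.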
        
           | lcnPylGDnU4H9OF wrote:
           | Yeah, the double standard here is obvious, then. Curious
           | indeed why they are so adamant to keep the schema/data
           | secret.
        
             | noboostforyou wrote:
             | I said in another comment but I suspect the column names
             | themselves are incriminating (basically saying this person
             | doesn't get a ticket because they are in a special club,
             | that's probably not technically legal)
        
               | hot_gril wrote:
               | is_cop bool not null default false
        
             | kelnos wrote:
             | Because they know that eventually the data contained in
             | that table is going to be used to support some sort of
             | lawsuit that their parking enforcement activity is biased,
             | and is targeting people of color.
             | 
             | It's already ridiculous that they spent several _years_
             | blocking this request while it went through court. If the
             | plaintiffs spoke to pretty much anyone involved in
             | maintaining the system, or with any of their internal
             | infosec people, they would know that there's no real
             | security risk to releasing this information.
             | 
             | They've already spent orders of magnitude more time and
             | money litigating the issue than it would take to just
             | release the information in the first place, so this is
             | clearly not a cost or resourcing issue.
             | 
             | They don't want to release it because they'd prefer it's
             | secret, because secrecy makes it harder for the public to
             | hold them accountable. That's all.
        
               | kasey_junk wrote:
               | There is an explanation for the fight that doesn't
               | involve something nefarious with CANVAS (though I think
               | CANVAS is dodgy from talking with Matt).
               | 
               | The precedent set here will let data journalists (like
               | Matt) set up effectively automated FOIA workflows on _any_
               | database they can get the name of for a FOIA request. So
               | even if _this_ db isn't dodgy it enables any of them that
               | are to be found quickly.
               | 
               | Or even less cynically, it's just going to cost a ton of
               | resources to respond to all those automated FOIA
               | requests.
        
             | qingcharles wrote:
             | Public bodies tend to just want to resist FOIAs for the
             | sake of resisting them. I've never really been able to
             | fully understand the motivations, even after a decade of
             | FOIA litigation.
        
               | dragonwriter wrote:
               | I think it is likely to be about budgets. That is, sure,
               | FOIA and similar state laws usually allow the agency to
               | collect something related to actual costs, but that's
               | mostly meaningless since even if it actually covers staff
               | time it doesn't retroactively give them staff to cover it
               | in the impacted areas, and often the FOIA volume doesn't
               | effectively feed back into legislative budget processes
               | for _future_ staffing either, while their litigation
               | needs are more likely to feed back into the legal
               | staffing levels, so approving FOIA requests drains
               | working resources in the area covering them in a way that
               | fighting them does not in the immediate term, while
               | fighting them also has the longer term benefit (from an
               | agency perspective) of discouraging future requests.
        
               | qingcharles wrote:
               | In my experience (and probably in Matt's) this has 100%
               | not been the issue. The people responsible for the FOIA
               | responses aren't in any way connected to budgeting or
               | resources. It is just a body-wide personality issue. Some
               | aspect of maliciousness mixed with laziness... or
               | something.
        
               | dragonwriter wrote:
               | > In my experience (and probably in Matt's) this has 100%
               | not been the issue. The people responsible for the FOIA
               | responses aren't in any way connected to budgeting or
               | resources.
               | 
               | In my experience working in government, including on
               | state-equivalent-of-FOIA requests, almost everyone
               | working on those kinds of requests is "involved in"
               | budgeting and resources, and more to the point anyone in
               | a position to sign off on a decision of whether something
               | should or should not be denied as exempt is a manager,
               | for whom (that is, _for any manager, down to the line
               | level, over any function in any government agency_, but
               | FOIA-type requests, especially if there are going to be
               | assertions of exemptions in total or in part, generally
               | involve coordination and signoffs between multiple
               | managers, e.g., from the most relevant line unit, the
               | public information unit, and legal) managing budgeted
               | resources and doing the work of justifying requests for
               | additional resources that is the root of the agency-
               | initiated budget change request process, and then
               | participating in drills and internal analyses and
               | responses as those proposals work through the budget
               | process is a _central part_ of their job.
        
       | Y_Y wrote:
       | Is it not absurd that the supreme and appeal courts disagreed on
       | a syntactical matter? Never mind that this isn't uncommon, or
       | that (IMHO) it would be ridiculous to interpret it as "any file
       | layouts at all, and other stuff too, but only bad other stuff".
       | It's crazy to me that we're happy for laws to sit on the books
       | being utterly ambiguous.
       | 
       | I know this suits the courts who benefit from the leeway, and
       | that (despite valiant efforts) we're not going to get "formal
       | formal" language into statutes. I know that the law is an ass. I
       | know that the laws are written by fallible and naive humans.
       | 
       | Even after all that, if the basic sentence structure of what's in
       | the law isn't clear _to the courts_, hasn't the whole system
       | fallen at the first hurdle?
        
         | tptacek wrote:
         | To me it feels like the kind of dispute that is exactly why we
         | have multiple levels of appeals court. The "file format" thing
         | is super dumb, and they got it wrong, but the "that if
         | disclosed" statutory interpretation is a thing that seems
         | important to get a final, consistent determination on.
        
           | Y_Y wrote:
           | Of course I can't disagree that it's good that it's now
           | settled. Still I can't help but imagine a world where the
           | meaning, at least in terms of which words apply to which
           | others (rather than qualifiers like "reasonable"), should be
           | settled before the law is debated, voted on, and passed.
           | 
           | Even (some) programmers have learnt the dangers of parsing at
           | run time (e.g. "eval is evil"). How can we decide it's the
           | law we want if we don't know what it means yet?
        
             | NoboruWataya wrote:
             | > How can we decide it's the law we want if we don't know
             | what it means yet?
             | 
             | FWIW, judicial interpretation of legislation is generally
             | seen as an exercise in figuring out what the legislature
             | meant. Courts start by looking at the "plain meaning" of
             | the words used, but where that doesn't yield an unambiguous
             | answer they will often look at the overall scheme or
             | purpose of the legislation to try and figure out which
             | interpretation is most consistent with that.
             | 
             | It's far from perfect of course, but it's not like
             | legislation just consists of a bunch of random symbols that
             | are later imbued with meaning by a court operating in a
             | vacuum. The meaning of most legislation is clear most of
             | the time. I'm sure the authors of the bill _thought_ it was
             | sufficiently clear, for any scenario they could contemplate
             | (or, at least, the ones they cared about). But it's hard
             | to see every potential corner case (and if every potential
             | corner case _did_ have to be identified and settled before
             | the bill could even be debated, it's likely Illinois
             | wouldn't have a FOIA today).
        
               | thaumasiotes wrote:
               | > Courts start by looking at the "plain meaning" of the
               | words used, but where that doesn't yield an unambiguous
               | answer they will often look at the overall scheme or
               | purpose of the legislation to try and figure out which
               | interpretation is most consistent with that.
               | 
               | There is also the concept of a "canon of construction",
               | which exists specifically to handle these kinds of
               | recurring grammatical issues. I'm surprised there isn't
               | one for dangling modifiers.
        
               | Paracompact wrote:
               | > It's far from perfect of course, but it's not like
               | legislation just consists of a bunch of random symbols
               | that are later imbued with meaning by a court operating
               | in a vacuum.
               | 
               | Isn't this exactly what happened? A court of computer
               | laypeople reached for _Merriam-Webster_ in order to
               | disambiguate a sample of programmer argot that was
               | written into law by another group of computer laypeople.
               | The legal profession isn't just dirty, it seems doomed
               | to defeat itself in even its most rigorous practice.
        
             | Xelynega wrote:
             | That's not the only alternative though. Why are experts not
             | involved in the interpretation, and why is it left up to how
             | two separate non-technical groups interpret it?
             | 
             | Other countries have legal specialists for different areas
             | and update their laws continuously based on expert opinion.
             | Common law gets expert testimony but relies on generalists
             | to make the final determination.
        
               | tptacek wrote:
               | Something something something Article III of the US
               | Constitution something.
        
           | olau wrote:
           | I find it slightly odd that you get hung up on the file
           | format thing. The law as you quoted it says "including but
           | not limited to" and the first example given is then
           | "software".
        
         | copypasterepeat wrote:
         | I am not a lawyer, but my understanding is that's just how the
         | justice system works. Reasonable people can disagree about what
         | exactly a complicated statement says, since language is full of
         | ambiguities. People have been discussing what the U.S.
         | Constitution says exactly from the day it was written and there
         | are still a lot of disagreements.
         | 
         | The standard response to this is that laws should be written in
         | ways that are non-ambiguous but that's easier said than done.
         | Not to mention that sometimes the lawmakers can't fully agree
         | themselves so they leave some statements intentionally
         | ambiguous so that they can be interpreted by the courts.
        
           | skissane wrote:
           | I've often thought we'd get more sensible results in court
           | cases on computer-related issues if we had specialised courts
           | where the judges were required to have a relevant degree
           | (computer science, software engineering, computer
           | engineering, information systems, etc). But I doubt it is
           | going to happen any time soon.
        
             | ptsneves wrote:
             | Civil code law uses that way of thinking, where there are
             | specialised courts for different areas: administrative,
             | civil, labor, family, commercial and so on. I actually am
             | not so sure it is great as these courts increase the depths
             | of the bureaucracy to the point of being self serving. They
             | also serve to segment expertise.
        
               | skissane wrote:
               | > Civil code law uses that way of thinking, where there
               | are specialised courts for different areas:
               | administrative, civil, labor, family, commercial and so
               | on.
               | 
               | This happens in common law countries too. For example,
               | the US has specialised courts (at the federal level) for
               | bankruptcy, federal government contract disputes (US
               | Court of Federal Claims), taxation (US Tax Court), among
               | others. It also has a nationwide appellate court (Federal
               | Circuit) with jurisdiction limited to certain topics
               | (patents, trademarks, federal government contracts, among
               | others), and another (DC Circuit) which despite being
               | technically geographic in practice also has topical
               | jurisdiction (many, but not all, lawsuits against federal
               | agencies). Many states have specialised courts for
               | various areas of law.
               | 
               | It is very common in common law countries to have
               | specialised courts/tribunals (or divisions thereof; there
               | isn't a big difference between a specialist court and a
               | specialist division of a generalist court) to deal with
               | certain types of cases, especially bankruptcy, family
               | law, probate, child welfare, juvenile crime, patents,
               | taxation, administrative law, military law, immigration,
               | small claims - the exact set varies, but specialised
               | courts/tribunals/divisions are very common.
               | 
               | But I've never heard of a specialised
               | court/tribunal/division for computer cases.
        
             | shagie wrote:
             | It happens from time to time.
             | https://www.theverge.com/2017/10/19/16503076/oracle-vs-
             | googl... ( https://news.ycombinator.com/item?id=15834800 42
             | comments)
             | 
             | > These days, he often looks for some kind of STEM
             | background for the IP desk. It's not necessary, but it
             | helps. Bill Toth, the IP clerk during Oracle v. Google,
             | didn't have a STEM background, but he told me that the
             | judge had specifically asked him to take a computer science
             | course in preparation for his clerkship. When I asked Alsup
             | about it, he laughed a little -- he had no recollection of
             | "making" Toth take any classes -- but he did acknowledge
             | that sometimes he gives clerks a heads up about what kind
             | of cases are coming their way, and what kind of classes
             | might be useful ahead of time.
             | 
             | Note that it's not necessarily the _judge_ that's
             | important as an individual knowing the material, but that
             | the clerks who work for the judge are.
        
           | kmoser wrote:
           | Nobody reasonably expects all laws to be written completely
           | unambiguously. But since laws (and indeed all manner of legal
           | documents) are filled with lists and modifiers, I don't think
           | it's unreasonable to require that they be written to a
           | certain standard which defines how these lists and modifiers
           | should be interpreted, similar to RFC 2119
           | https://microformats.org/wiki/rfc-2119.
        
           | Xelynega wrote:
           | Correction, that is how a common law legal system works.
           | 
           | Alternatives like codified law exist and are practiced, just
           | not in the US or Canada.
        
       | koolba wrote:
       | > [Public bodies] shall provide a sufficient description of the
       | structures of all databases under the control of the public body
       | to allow a requester to request the public body to perform
       | specific database queries.
       | 
       | I sure hope the impact of this is _not_ that government entities
       | switch to schemaless databases!
        
         | CharlesW wrote:
         | "Schemaless" is like "serverless" in that there's always a
         | schema, even if it's not enforced by the database and instead
         | applied dynamically by the application layer.
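         | 
         | A minimal sketch of that point, assuming some hypothetical JSON
         | documents in a "schemaless" store (the field names are invented
         | for illustration): the application-side schema can be recovered
         | just by reading the data.

```python
import json
from collections import defaultdict

# Hypothetical documents, as a "schemaless" store might hold them.
docs = [
    '{"plate": "ABC123", "fine": 50, "paid": false}',
    '{"plate": "XYZ789", "fine": 75.5, "officer": "1142"}',
]

def infer_schema(raw_docs):
    """Collect every field name and the Python types seen for it."""
    schema = defaultdict(set)
    for raw in raw_docs:
        for field, value in json.loads(raw).items():
            schema[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in schema.items()}

implicit_schema = infer_schema(docs)
for field, types in sorted(implicit_schema.items()):
    print(field, types)
```

         | Here `fine` shows up as both int and float, which is the kind of
         | drift an enforced schema would have rejected.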
        
       | SkidanovAlex wrote:
       | While I believe that the city should share the schema, and that
       | the city is effectively arguing for security through obscurity, I
       | disagree with the main premise of the article: that knowing the
       | SQL schema doesn't help the attacker.
       | 
       | If I understand the argument of the author here:
       | 
       | > Attackers like me use SQL injection attacks to recover SQL
       | schemas. The schema is the product of an attack, not one of its
       | predicates
       | 
       | The author appears to imply that once the vulnerability is found,
       | the schema can be recovered anyway. That is not always the case.
       | It is perfectly possible to find a SQL injection that allows
       | fetching some data from the table being queried, but not from any
       | other table, including `information_schema` or similar. If all
       | the signal you get from the vulnerability is "query failed" or
       | "query succeeded, here's the data", knowing the schema makes it
       | much easier to exploit.
       | 
       | > the problem is that every computer system connected to the
       | Internet is being attacked every minute of every day
       | 
       | If you specifically log failed DB queries, then you have already
       | patched all the possible injections that such 24/7 attacks would
       | find. The log would then not be deafening until someone stumbles
       | on the actual injection (one that, for example, only exists for
       | logged-in users, and thus is not found by bots), in which case you
       | have time to see it and patch it before the attacker finds a way
       | to actually use it.
       | 
       | Knowing the schema both expedites their ability to take advantage
       | of the vulnerability and increases their chances of probing the
       | injection without triggering a query failure to begin with.
        
         | tptacek wrote:
         | If you specifically log failed database queries, where
         | "failure" means "indicative of SQL injection", then nothing you
         | can do with the schema is going to reduce the signal in that
         | feed --- even a single SQL syntax error would be worth
         | following up on. No, I don't think your logic holds.
        
           | kmoser wrote:
           | I don't understand your logic. Knowledge of the schema can
           | give an attacker an edge because they now know the exact
           | column names to probe. Whether these probes get logged is
           | irrelevant; even if it makes the system more vulnerable for
           | an instant, it's still more vulnerable.
           | 
           | Even if logging failed queries is your metric, then knowledge
           | of column names would make it more likely for an attacker to
           | craft correct queries, which would not get logged, thus
           | making your logs less useful than if the attacker had to
           | guess at column names and, in so doing, incur failed queries.
        
             | tptacek wrote:
             | To probe for what? How does knowledge of a column name make
             | it easier for me to discern whether a SQL injection
             | vulnerability exists? I've spent a lot of time in my career
             | probing for SQL injection, and I can't remember an instance
             | where my stimulus/response setup involved the table names.
             | 
             | SQL injection is a property of _a SQL query_, not of the
             | schema itself. To have a meaningful chance of blind-one-
             | shotting a query, getting a TRUE/FALSE answer about
             | susceptibility without ever generating a SQL syntax error,
             | I would need to see the queries themselves.
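             | 
             | A minimal sketch of that distinction, assuming a
             | hypothetical SQLite `tickets` table (all names invented for
             | illustration): both functions run against the same schema,
             | and only the one that splices input into the SQL text is
             | injectable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, plate TEXT, paid INTEGER)")
conn.executemany(
    "INSERT INTO tickets (plate, paid) VALUES (?, ?)",
    [("ABC123", 0), ("XYZ789", 1)],
)

def lookup_vulnerable(plate):
    # User input is concatenated into the SQL text: injectable.
    sql = "SELECT id, plate FROM tickets WHERE plate = '" + plate + "'"
    return conn.execute(sql).fetchall()

def lookup_safe(plate):
    # User input is bound as a parameter: treated as data, never as SQL.
    sql = "SELECT id, plate FROM tickets WHERE plate = ?"
    return conn.execute(sql, (plate,)).fetchall()

payload = "nope' OR '1'='1"
vulnerable_rows = lookup_vulnerable(payload)  # WHERE clause becomes always true
safe_rows = lookup_safe(payload)              # no plate matches the literal string

print(len(vulnerable_rows), len(safe_rows))
```

             | Nothing about the table definition changes between the two
             | calls; the difference is entirely in how the query is built.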
        
               | default-kramer wrote:
               | > How does knowledge of a column name make it easier for
               | me to discern whether a SQL injection vulnerability
               | exists?
               | 
               | It doesn't. It just means that as soon as you find one,
               | you can immediately begin crafting valid queries instead
               | of randomly guessing table names and columns, therefore
               | not setting off the "DB query failed" alert.
               | 
               | EDIT: I guess this is the part I missed:
               | 
               | > To have a meaningful chance of blind-one-shotting a
               | query, getting a TRUE/FALSE answer about susceptibility
               | without ever generating a SQL syntax error, I would need
               | to see the queries themselves.
               | 
               | Really? I guess I have to take your word for it because
               | I've never attempted it, but I would have thought that in
               | some (horribly broken) systems `bobby tables' or 1=1 --`
               | would have a very reasonable chance of detecting SQL
               | injection without alerting anyone.
        
               | jstanley wrote:
               | You can craft valid queries that don't reference any
               | table or column name.
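               | 
               | For instance, a pair of probes like the ones below
               | references no table or column. This is a sketch against
               | a hypothetical vulnerable `search` function backed by
               | SQLite; the table `t` exists only so the demo has
               | something to query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.execute("INSERT INTO t VALUES ('alice')")

def search(term):
    # Hypothetical vulnerable endpoint: input is spliced into the SQL text.
    sql = "SELECT name FROM t WHERE name = '" + term + "'"
    return conn.execute(sql).fetchall()

# Both probes are syntactically valid SQL, so neither raises a DB error.
# Neither probe mentions a table or column name.
always_true = search("alice' AND '1'='1")
always_false = search("alice' AND '1'='2")

# A differing response between the two suggests the input reaches the parser.
looks_injectable = always_true != always_false
print(looks_injectable)
```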
        
               | default-kramer wrote:
               | Right, and that's what you use to find the vulnerability.
               | But imagine you've found the vulnerability and now you
               | want to use it to update all of your parking tickets as
               | paid. Without the schema, this is going to be quite
               | tricky and will generate a lot of failed SQL. With the
               | schema, you might be able to do it on your first try.
        
               | tptacek wrote:
               | Which is why in the ordinary course of a pentest you'd
               | _use the SQL injection vulnerability to recover the
               | information in the schema_.
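               | 
               | A sketch of what that recovery can look like, assuming
               | SQLite (whose catalog table is `sqlite_master`) and a
               | hypothetical vulnerable `lookup` function that echoes
               | rows back; other databases expose `information_schema`
               | or similar instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, plate TEXT, paid INTEGER)")
conn.execute("INSERT INTO tickets (plate, paid) VALUES ('ABC123', 0)")

def lookup(plate):
    # Hypothetical vulnerable endpoint that returns the rows it finds.
    sql = "SELECT id, plate FROM tickets WHERE plate = '" + plate + "'"
    return conn.execute(sql).fetchall()

# The UNION matches the two-column shape of the original SELECT and reads
# the catalog; the trailing "--" comments out the leftover closing quote.
payload = "x' UNION SELECT name, sql FROM sqlite_master WHERE type = 'table' --"
leaked = lookup(payload)

for table_name, create_statement in leaked:
    print(table_name, "=>", create_statement)
```

               | This only works when results are echoed back and the
               | application's database account can read the catalog.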
        
               | LegionMammal978 wrote:
               | Is there not any SQLi vulnerability in practice that
               | doesn't allow such an information recovery? That is, is
               | the schema-recovery step so foolproof that it can always
               | be performed on any target form? GP is suggesting that
               | this may be difficult, depending on the kind of signal
               | that gets returned from the form.
        
               | tptacek wrote:
               | In my entire experience as a software security
               | practitioner, which at the time of my testimony
               | encompassed some hundreds of assessments of SQL-backed
               | websites, the availability of a schema has never impacted
               | my ability to exploit a SQL injection. It's not my job as
               | an expert witness, nor Matt's job as a plaintiff, to
               | invent improbable scenarios where security could hinge on
               | schema availability. The court (all of them, in fact)
               | found that testimony dispositive, so I'm happy to leave
               | the issue there.
        
               | hot_gril wrote:
               | "Blind" SQLi is a thing, but even in the real-life
               | example I could find, it wasn't exactly blind. They could
               | still use the timing to get one bit of info at a time and
               | discern the email addresses.
               | https://www.invokesec.com/2025/01/13/a-real-world-
               | example-of...
               | 
               | It's hard to imagine a case where you can't even get info
               | based on timing. But it requires more effort and
               | knowledge to exploit this.
        
               | default-kramer wrote:
               | Maybe I'm ignorant, but if the account the app is using
               | doesn't have access to the information_schema how do you
               | do this?
        
               | kmoser wrote:
               | Not just that, but perhaps the app is smart enough to
               | lock you out the second it detects an attempt to gather
               | the schema, e.g. by logging and automatically responding
               | to a query that displays the schema. Then you have to
               | look for other ways in (another IP, etc.). But if you
               | know the schema in advance, you have a better chance of a
               | one-shot injection that accomplishes your malicious goal.
               | 
               | In other words, advance knowledge of the schema _may_
               | make it easier to act maliciously.
        
               | lcnPylGDnU4H9OF wrote:
               | I don't think that's a very common setup but perhaps I'm
               | just exposing my own ignorance. Just consider the
               | popularity of ORMs. They explicitly load the schema into
               | the application in many cases.
        
               | kmoser wrote:
               | Knowledge of the column names doesn't give you insight
               | into whether a vulnerability _exists_. It gives you
               | insight into what you can _do_ with a vulnerability,
               | should it exist. For example, if you want to set your
               | account balance to $1 million, you'd need to know the
               | column name in order to generate a valid query. Without
               | advance knowledge of the column name, your job becomes
               | harder.
        
               | hot_gril wrote:
               | SQL injection will give you the entire schema anyway. It
               | doesn't help if someone tells you the col names
               | beforehand. I'm more wondering about non-SQL-injection
               | vulns.
        
               | HDThoreaun wrote:
               | SQL injection isn't just an ssh tunnel to the database. If
               | the line you've injected isn't a select and the backend
               | never fetches it, how does the injection give you the
               | column names?
        
               | hot_gril wrote:
               | Oops you're right, it's possible that you have no way to
               | read things back.
        
               | wglb wrote:
               | I've seen this done by enumerating possible table names.
        
               | hot_gril wrote:
               | That's a typical way, but the errors might alert them,
               | and of course maybe the names aren't so easily guessed.
        
               | hot_gril wrote:
               | Wait, this is known as a blind SQLi, and it's not so
               | blind. You can still use timing to get the info you need
               | one bit at a time. This may be slow, but it's doable
               | without triggering any DB errors, so you have time.
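               | 
               | A sketch of the bit-at-a-time idea, with assumptions
               | stated: it uses SQLite, a hypothetical `user_exists`
               | endpoint, and a boolean side channel (row found or not)
               | in place of timing, since SQLite has no built-in sleep.
               | The extraction loop would be the same with a timing
               | oracle. It reads a column named `email`; the same loop
               | pointed at `sqlite_master` could recover names as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@b.io')")

def user_exists(user_id):
    # Hypothetical vulnerable endpoint: reveals only True or False.
    sql = "SELECT 1 FROM users WHERE id = " + user_id
    return conn.execute(sql).fetchone() is not None

def leak_char(pos):
    """Recover one ASCII character code, one bit per request."""
    code = 0
    for bit in range(7):
        cond = f"1 AND ((unicode(substr(email, {pos}, 1)) >> {bit}) & 1) = 1"
        if user_exists(cond):
            code |= 1 << bit
    return code

def leak_email():
    chars = []
    pos = 1
    while True:
        code = leak_char(pos)
        if code == 0:  # past the end: unicode('') is NULL, so no bit is set
            break
        chars.append(chr(code))
        pos += 1
    return "".join(chars)

leaked_email = leak_email()
print(leaked_email)
```

               | Every request is valid SQL, so a "failed query" alert
               | never fires.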
        
               | HDThoreaun wrote:
               | people come up with the darndest things.
        
           | lucb1e wrote:
           | > nothing you can do with the schema is going to reduce the
           | signal in that feed --- even a single SQL syntax error would
           | be worth following up on
           | 
           | Syntax errors coming from your web application mean there is
           | a page somewhere with a bugged feature, or perhaps the whole
           | page is broken. Of course that's worth following up on?
           | 
           | Edit: maybe I should add a concrete example. I semi-regularly
           | look at the apache error logs for some of my hobby projects
           | (mainly I check when I'm working on it anyway and notice
           | another preexisting bug). I've found broken pages based on
           | that and either fixed them or at least silenced the issue if
           | it was an outdated script or page anyway. Professionals might
           | handle this more professionally, or less because it's about
           | money and not just making good software, idk
        
             | ethbr1 wrote:
             | > _Syntax errors coming from your web application mean
             | there is a page somewhere with a bugged feature, or perhaps
             | the whole page is broken. Of course that's worth following
             | up on?_
             | 
             | This is a government system, with apps probably built by
             | lowest-bid contractors.
             | 
             | I imagine most of us would be horrified by the volume of
             | everyday failed queries from deployed apps.
        
               | lucb1e wrote:
               | Can be, but I'm not sure it's worth investigating whether
               | a particular deployment has such a specific monitoring
               | system before being able to do a FOIA. The schema is
               | marginally relevant for attacks at best (with heavy
               | emphasis on just how marginal it is) and that's no
               | barrier to releasing it
        
         | pockmarked19 wrote:
         | Reminds me that the recently discovered "leak emails using
         | YouTube" exploit kicked off from reading what is essentially, a
         | schema.
         | 
         | https://brutecat.com/articles/leaking-youtube-emails
        
           | robocat wrote:
           | > kicked off from reading what is essentially, a schema.
           | 
           | I wouldn't call json a schema.
           | 
           | In the HN discussion tptacek replied that "$10,000 feels
           | extraordinarily high for a server-side web bug":
           | https://news.ycombinator.com/item?id=43025038
           | 
           | His comment assumes monetisation means selling the bug
           | (tptacek deeply understands the market for bugs). However, I
           | would have thought monetisation could be by scanning as many
           | YouTube users as possible for their email addresses: and then
           | selling that limited database to a threat actor. You'd start
           | the scan with estimated high value anonymous users. Only
           | Google can guess how many emails would have been captured
           | before some telemetry kicked off a successful security audit.
           | The value of that list could possibly well exceed $10000.
           | Kinda depends on who is doxxed and who wants to pay for the
           | dox.
           | 
           | It's hard to know what the reputational cost to Google would
           | be for doxxing popular anonymous accounts. I'm guessing video
           | is not so often anonymous so influencers are generally not
           | unknown?
           | 
           | I'm guessing trying to blackmail Google wouldn't work (once
           | you show Google an account that is doxxed, they would look at
           | telemetry logs or perhaps increase telemetry). I wonder if
           | you could introduce enough noise and time delay to avoid
           | Google reverse-engineering the vulnerability? Or how long
           | before a security audit of code would find the vulnerability?
           | 
           | Certainly I can see some governments paying good money to dox
           | anonymous videos that those governments dislike. The Saudis
           | have money! You could likely get different government
           | security departments to bid against each other... Thousands
           | seems doable per dox? The value would likely decrease as you
           | dox more.
        
             | pockmarked19 wrote:
             | > I wouldn't call json a schema.
             | 
             | What you see there is a protobuf, serialized as JSON. If a
             | protobuf definition isn't a schema, I don't know what is.
        
               | robocat wrote:
               | Right, thank you for the correction
        
         | Volundr wrote:
         | I'm not an attacker, just a boring old software dev. If there's
         | an SQL Injection I'd say all bets are off re: schema.
         | 
         | That said I've definitely worked on applications where knowing
         | the schema could help you exfil data in the absence of a full
         | injection. The most obvious being a query that's constructed
         | based on url parameters, where the parameters aren't
         | whitelisted.
         | 
         | So I actually do agree that the schema could potentially be of
         | marginal benefit to the attacker.
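
The URL-parameter case described above can be sketched as follows. This is a minimal illustration, not anyone's production code: the `tickets` table, its columns, and the `list_tickets` helper are all hypothetical. Column names cannot be passed as bound parameters, so the sketch checks them against a fixed allowlist before they reach the SQL text.

```python
import sqlite3

# Hypothetical allowlist: the only columns a client may sort by.
SORTABLE_COLUMNS = {"issued_at", "amount"}
SORT_DIRECTIONS = {"asc", "desc"}


def list_tickets(conn, sort_column="issued_at", direction="desc"):
    """Return (id, amount) rows ordered by a client-chosen column."""
    # Identifiers cannot be bound as parameters, so they are checked
    # against a fixed set before being placed in the SQL text.
    if sort_column not in SORTABLE_COLUMNS:
        raise ValueError(f"cannot sort by {sort_column!r}")
    if direction not in SORT_DIRECTIONS:
        raise ValueError(f"bad sort direction {direction!r}")
    sql = f"SELECT id, amount FROM tickets ORDER BY {sort_column} {direction}"
    return conn.execute(sql).fetchall()


# Hypothetical demo data; officer_notes stands in for a column that
# exists in the schema but should never be reachable from a URL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, amount INTEGER, "
    "issued_at TEXT, officer_notes TEXT)"
)
conn.executemany(
    "INSERT INTO tickets (amount, issued_at, officer_notes) VALUES (?, ?, ?)",
    [(50, "2024-01-02", "a"), (75, "2024-01-01", "b")],
)
```

With the allowlist in place, knowing that `officer_notes` exists gives a caller nothing to sort or filter on; without it, the schema tells an attacker exactly which names to try.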
        
           | butlike wrote:
           | Wouldn't admitting this in court pin you with some sort of
           | negligence? (if you knew having a schema revealed would
           | compromise your app in some way).
        
             | default-kramer wrote:
             | "Defense in depth" is an easy argument to make. I sure hope
             | I don't have any SQL injection holes, but I can't prove it
             | with 100% certainty.
        
               | hot_gril wrote:
               | I can't imagine how the schema would reveal SQL injection
               | holes. Maybe other holes, though. Any poor choices for
               | PKs, dumb use of MD5 computed fields, insecure random,
               | misuse of NULL, weird uniqueness constraints (this also
               | ties back to NULLs), vulnerable extensions, wrong
               | timestamp type, too-small integer type, varchar limits,
               | predictable index speed...
               | 
               | Edit: More NULL, or maybe lack thereof cause they use the
               | string "NULL" instead?
               | https://news.ycombinator.com/item?id=20676904
        
               | londons_explore wrote:
               | The schema can provide an insight into what the
               | application developer was thinking when writing the code,
               | which in turn can direct an attacker towards tricky
               | corners where mistakes might have been made.
        
               | hot_gril wrote:
               | That's true.
        
               | default-kramer wrote:
               | > I can't imagine how the schema would reveal SQL
               | injection holes.
               | 
               | It wouldn't. I'm just assuming that the thrust of the
               | hypothetical negligence accusation was "The schema is
               | useless unless you have SQL injection holes. So give us
               | the schema or admit you are negligent!" But you're
               | correct that there are other justifications one could
               | make to keep the schema secret.
        
             | HDThoreaun wrote:
             | This is the city government here. The people arguing the
             | case didn't write the code and don't have time to look
             | through all their code, but one thing they do know is that
             | it was written by monkeys. They probably have some level of
             | reason to believe there are SQL injections available in the
             | code.
        
         | gerdesj wrote:
         | That's where the court's technical distinction between the
         | words: "could" and "would", is important. It appears they have
         | reduced the distinction to a risk assessment which is more
         | objective than opining wildly!
         | 
         | For example: I've just re-wired a three gang light switch. I
         | verified power on with my multimeter (test the meter), cut the
         | power and then retested all the circuits to make sure I had got
         | it right.
         | 
         | It turns out that switch three is on a separate ring main. Cool
         | I didn't get to test my body's ability to take a whopper of a
         | shock. In the UK it is common to have upstairs and downstairs
         | rings for light circuits. Our kitchen has quite a few lights in
         | it so it got a separate ring as well. Anyway there are quite a
         | lot of wires in there because all of them are two way switches.
         | Oh and I am allowed to work on them because of the switch
         | location - not kitchen and not bathroom, ie a low risk location
         | 
         | I noted down the connections, and took them all out. I put
         | Wagos over the flying ends to make them safe, turned the power
         | back on and got on with the job in hand.
         | 
         | I then cut the power (both circuits) checked again with my
         | Fluke. Oh bollocks ... enable power, test the Fluke and then
         | cut power again and recheck the circuits.
         | 
         | Now I re-terminated all the connections. There was plenty of
         | additional wire so I decided to cut and re-strip the
         | conductors, to make sure that I avoided potential failures due
         | to "work hardening" from the inevitable pushing and pulling and
         | "gentle" forcing into position. Once all the conductors were
         | screwed down I pulled on them fairly forcefully to make sure
         | they won't fall out.
         | 
         | I screwed down the switch face plate and restored power. It's a
         | brushed metal finish switch so I did test it was not live,
         | because I'm careful. I tested the functionality ie all three
         | switch circuits (three) from all the switches (six).
         | 
         | So, given that description is it possible that the connectors
         | might fall out in the future and short on say, the metal back
         | box. Of course it is possible. It could happen but would it
         | happen?
         | 
         | You could postulate all sorts of scenarios. Perhaps I may be
         | careful but I might be cack handed and forgetful and got
         | something wrong anyway and a wire might still drop out. Now we
         | are at the point of whataboutery! and that won't wash.
         | 
         | The would/could distinction is a powerful one and it is
         | analogous to how we do risk assessments.
         | 
         | I'm certainly not saying you are wrong in your assessment but I
         | think you are fiddling with details to conjure up a "could" and
         | not a "would". I agree that knowing the schema would assist a
         | hacking attempt but would it make a successful crack more
         | likely - no I don't think so. It is a classic case of obscurity
         | despite security but a rather more complicated one than putting
         | the ssh daemon on port 2222.
         | 
         | Cripes - I need to get out more!
        
         | wglb wrote:
         | > "query failed" or "query succeeded, here's the data"
         | 
         | Blind SQL injection is a type where no error is produced, but
         | some subtle signal can indicate success or failure. The most
         | interesting one that I know about is where the presence of a
         | successful injection was a normal looking response that was one
         | byte longer than an unsuccessful injection. This was used to
         | not only figure out the schema, but to fully exfiltrate the
         | entire database.
         | 
         | There is nothing in the log on the server that indicates an
         | error.
         | 
         | Most of the relatively introductory SQL injection exercises
         | that I taught proceed without any knowledge of the schema.
         | 
         | This is why SQL injection is so insidious.
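
A toy version of the blind technique described above, against an in-memory SQLite database. Everything here is an assumption made for illustration: the `users` table, the deliberately vulnerable `user_exists` function, and the secret value. The only signal the caller ever gets is a yes/no answer, and no schema knowledge is supplied beyond what the payload itself guesses.

```python
import sqlite3
import string

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'k3y')")


def user_exists(name):
    """Deliberately vulnerable: builds SQL by string concatenation.

    The caller only learns True/False, never an error or any row data.
    """
    sql = "SELECT 1 FROM users WHERE name = '" + name + "'"
    try:
        return conn.execute(sql).fetchone() is not None
    except sqlite3.Error:
        return False  # failures look identical to "no such user"


def recover_secret(max_len=16):
    """Recover alice's secret one character at a time from the yes/no signal."""
    alphabet = string.ascii_lowercase + string.digits
    found = ""
    for position in range(1, max_len + 1):
        for candidate in alphabet:
            # The trailing quote is supplied by user_exists itself.
            payload = (
                f"alice' AND substr(secret, {position}, 1) = '{candidate}"
            )
            if user_exists(payload):
                found += candidate
                break
        else:
            break  # no candidate matched: past the end of the secret
    return found
```

The one-byte-longer response mentioned above plays the same role as the boolean here: any observable difference between "condition true" and "condition false" is enough to read data out a bit at a time.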
        
           | berkes wrote:
           | Not just with SQLi, but I've managed to statistically prove
           | "information" with timing attacks.
           | 
           | Where if you join another table (by e.g. requesting extra
           | info in a graphql query) the response goes from ms to s or
           | even m. Indicating the size of the joined table.
           | 
           | Or where I could change a "?sort[updated_at]=desc" to a
           | "?sort[password_hash]" through trial-and-error and suddenly
           | see the response time drop from ms to seconds (in this case
           | finding columns that exist but aren't indexed).
           | 
           | Even if the response content is exactly the same, we know
           | things exist, are big, not indexed, or simply present, by
           | timing the attack.
           | 
           | A famous one is obviously the timing trick to find out that
           | an email is in the system because "user = user.find(email) &&
           | user.password_matches(password)" short-circuits if the email
           | does not exist but spends significant time on hashing the
           | password for matching it. A big lot of backends and apps make
           | this mistake.
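
One common way to close the login timing gap described above is to do the same hashing work whether or not the email exists. The sketch below assumes a hypothetical in-memory `USERS` store and uses PBKDF2 from the standard library purely for illustration; the iteration count is not a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password, salt):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


# Hypothetical in-memory user store: email -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"a@example.com": (_salt, hash_password("correct horse", _salt))}

# Dummy credentials hashed against when the email is unknown, so both
# paths do the same expensive work.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("not-a-real-password", _DUMMY_SALT)


def login(email, password):
    """Return True only for a known email with the matching password."""
    known = email in USERS
    salt, expected = USERS[email] if known else (_DUMMY_SALT, _DUMMY_HASH)
    candidate = hash_password(password, salt)
    # compare_digest avoids leaking how many leading bytes matched.
    matches = hmac.compare_digest(candidate, expected)
    return known and matches
```

The point of the design is that `hash_password` runs on every call, so response time no longer separates "no such email" from "wrong password". It does not address other channels, such as a sign-up form that reports an address as taken.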
        
         | florbnit wrote:
         | > that knowing SQL schema doesn't help the attacker.
         | 
         | Knowing the name of the service helps the attacker, knowing the
         | name of government officials working at city hall helps
         | attackers, knowing the legal description of what a parking
         | ticket is helps attackers. If you are sued and decide you want
         | to hack the government knowing the details of the suit against
         | you helps you in your attack.
         | 
         | The barrier is not "any helpful information must be censored"
         | the barrier is "don't disclose passwords or code that would
         | divulge backdoors" a schema cannot be that.
        
       | jaxgeller wrote:
       | I FOIA'ed >1M pages of docs for my project cleartap.com, a DB of
       | water quality of the USA.
       | 
       | Most states would charge a small amount to gather the documents.
       | 
       | Michigan wanted $50K for the FOIA request. I think because of
       | the Flint lead crisis. They wanted me to go away.
        
         | davethedevguy wrote:
         | I noticed that you do have data for Flint. Did you have to pay
         | it, or is there some appeals process if you're quoted an
         | unreasonable amount?
         | 
         | Great project by the way!
        
           | jaxgeller wrote:
           | Ended up finding the majority of Michigan through scraping.
           | 
           | For example, https://www.cityofflint.com/wp-
           | content/uploads/2023/06/Annua...
        
       | aqueueaqueue wrote:
       | Interesting takeaways from me:
       | 
       | All that pompous sounding legalese can still be ambiguous! I feel
       | less bad for not understanding contracts that have 100 word
       | compound sentences.
       | 
       | Legal people can't keep up with our tech jargon but they have
       | their own jargon including "predicate" lol. So same logical
       | thinking, different jargon framework.
       | 
       | Question: why do they want the schema not the data?
        
         | tptacek wrote:
         | Because once you have the schema you can issue FOIA requests
         | that include queries for them to run.
        
           | hot_gril wrote:
           | What if you guess common table names? Wonder if they send
           | back the error message.
        
           | aqueueaqueue wrote:
           | Oh wow! If that is necessary, that is so kafkaesque!
           | 
           | "I want your data"
           | 
           | "What data?"
           | 
           | "What do you have?"
           | 
           | "Ha ha. No. Tell me what you want"
           | 
           | "Your data that is the metadata of your data"
           | 
           | "Well actually..."
           | 
           | ...
        
             | tptacek wrote:
             | You can't ask public bodies to do research for you. That's
             | the public policy balance in our FOIA laws: you can get
             | _almost anything_ (and: talk to Matt, you really can get a
             | lot of stuff), but you have to be specific about what
             | you're asking for, and it has to be "at hand" for the staff
             | responding to the request.
        
               | hunter2_ wrote:
               | Clerks fielding FOIA requests have SQL consoles "at
               | hand"?
        
               | tptacek wrote:
               | They send emails to IT. The classic example of a thing
               | you can get through FOIA is large-scale dumps of emails
               | from Exchange Servers, which is also not something a
               | Clerk can do themselves, but which IT staff can
               | immediately retrieve.
               | 
               | Leave the "Clerk" bit of this out and just imagine you're
               | requesting straight from the IT department. What you can
               | do: get anything not otherwise exempt that they know how
               | to retrieve (it usually helps to provide example commands
               | in the requests). What you _cannot_ do: ask them to go
               | look around and _see_ what they have. That's research.
               | Research is your job, not theirs, under Illinois FOIA.
        
               | hunter2_ wrote:
               | If research is my job, and looking around is research,
               | then couldn't I look around and see what they have
               | instead of asking them to do so?
        
               | tptacek wrote:
               | Maybe? If you work there, I guess? Or if you're really
               | nice to them? But they're under no obligation to help
               | you. The tradeoff in Illinois (and most other good FOIA
               | law): you can get almost anything you want --- _way_ more
               | than most people think --- but you can't get public
               | staff to go do research work for you.
               | 
               | Again: this is why pulling schemas is so valuable.
        
           | aerzen wrote:
           | Could you ask them to run an introspection query? Something
           | like SELECT * FROM information_schema.tables?
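
For a sense of what such an introspection query returns, here is a small sketch using SQLite, which exposes its catalog through `sqlite_master` and `PRAGMA table_info` rather than `information_schema` (the latter is what PostgreSQL and MySQL provide). The two tables are hypothetical; the output is exactly the "tab names and header rows" that the article calls a schema.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]}."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        # The table name comes from the catalog itself, not from user input.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# Hypothetical tables standing in for a real application's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, plate TEXT, amount INTEGER)"
)
conn.execute("CREATE TABLE officers (badge TEXT, name TEXT)")
```

Calling `describe_schema(conn)` lists table and column names with their declared types and touches no row data, which is why a request phrased this way is cheap for IT staff to run.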
        
           | darkarmani wrote:
           | Is the schema considered private information or just
           | information not required to be released via FOIA? ie: Can't
           | some nice employee leak this information or is it legally
           | protected?
           | 
           | Once the information is released, can anyone make FOIA
           | requests using the schema?
        
             | tptacek wrote:
             | Under Illinois law there are just two kinds of data: normal
             | data and data exempt from FOIA. Up to and including the
             | appellate review of Matt's case, schemas were in the
             | formal, normal category. After the State Supreme Court
             | review, they are now _per se_ exempt from FOIA.
             | 
             | It's not legally protected. An employee could leak it. A
             | public body can voluntarily reveal documents that are
             | exempt from FOIA (absent some other Illinois law
             | prohibiting disclosure). A public body can disclose source
             | code, for instance, despite it being explicitly exempt in
             | the statute. "This data is exempt" is an affirmative
             | defense that the public body has to raise.
        
       | pudding12345 wrote:
       | Do stored procedures count as part of the schema? I've recently
       | found a SQL injection vulnerability in a client's SP that was
       | using concat (very badly)
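
SQLite has no stored procedures, so the sketch below shows the same concatenation pattern in application code instead; the `tickets` table and both helper functions are hypothetical. It contrasts building the statement by string concatenation with passing the value through a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (owner TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [("Bob O'Conner", 50), ("Alice", 75)],
)


def tickets_concat(owner):
    """Unsafe: the owner string becomes part of the SQL text."""
    sql = "SELECT owner, amount FROM tickets WHERE owner = '" + owner + "'"
    return conn.execute(sql).fetchall()


def tickets_bound(owner):
    """Safer: the owner string is passed as data through a placeholder."""
    return conn.execute(
        "SELECT owner, amount FROM tickets WHERE owner = ?", (owner,)
    ).fetchall()
```

With concatenation, an ordinary name containing an apostrophe breaks the statement, and a crafted value such as `x' OR '1'='1` changes what the query means. With the placeholder, both are treated as plain strings to compare against.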
        
       | EMIRELADERO wrote:
       | Am I the only one slightly perplexed/worried by the point-blank
       | source code exemption?
       | 
       | It's easy to imagine a scenario where the city decides to develop
       | a specific software in-house and hide the "biases" in the source
       | code, or any other thing one might not find desirable.
       | 
       | Hell, they don't even need to make everything from scratch! Could
       | just patch and use a permissively licensed 3rd-party component.
       | 
       | In my opinion, the proposed amendment does not go far enough.
        
         | dotdi wrote:
         | That's why it's important to push for "public money - open
         | source" initiatives like some countries in the EU are trying to
         | implement.
         | 
         | Off the top of my head, I think the last (now failed) German
         | coalition had this in their programme but didn't deliver. Maybe
         | the new government will.
        
         | manquer wrote:
         | It shouldn't be surprising?
         | 
         | It is the same problem people trying to open source closed
         | projects experience: there are all sorts of locked-in
         | proprietary code that the developer and the customer are only
         | licensed to use, not to share in source form.
         | 
         | Even projects that are staunchly open from day one, and built
         | without the direct commercial interests that government
         | contractors have, suffer from this. The Linux kernel's
         | challenges in supporting ZFS or binary blob drivers in
         | kernel/user space and so on are well known[1].
         | 
         | Paradoxically, on one hand information wants to be free, and
         | economics dictate that open source software will crowd out
         | closed competitors over time. On the other, it is expensive,
         | sometimes prohibitively so, to open source a project, and that
         | deters many managers and companies from open sourcing their
         | older tools even if they would like to: involving legal and
         | trying to find the rights holder for each component is enough
         | to put most managers off.
         | 
         | If a government put requirements in contracts that the vendor
         | should only use open source components in their entire
         | dependency tree, it could drive the costs very high because a
         | lot of those dependencies may not have equivalent open source
         | ones or those lack features of the closed ones so would need
         | budgets to flesh them out. No legislature will accept that kind
         | of additional expense in the short term, even though the public
         | will benefit in the long term.
         | 
         | ---
         | 
         | [1] Yes, the kernel problems are largely a function of the GPL;
         | more permissive licenses like Apache 2/MIT would not have them.
         | BSD variants, after all, had no challenges in supporting ZFS.
         | 
         | However a principled stance on public applications being open
         | source by government would be closer to GPL than MIT in terms
         | of licensing. Otherwise a vendor can just import the actual
         | important parts as binary blobs "vendored" code and have some
         | meaningless scaffolding in the open source component to comply.
        
           | Y_Y wrote:
           | Maybe FOIA should trump licensing in this case. Suppose I
           | write a manual on how to issue bad parking tickets and hide
           | them in a database, and then license that (in some
           | restrictive manner) to the state of Illinois. I think the
           | public's right to see that document is more important than my
           | right to prevent copying and dissemination.
        
             | manquer wrote:
             | That is true for all kinds of IP. Striking the balance
             | between the two is what IP laws do: give inventors some
             | protections to encourage innovation while keeping the
             | public benefit in mind.
             | 
             | Copyright is time limited: until 70 years after the author's
             | death for individuals, and 95 years for corporations.
             | 
             | While there are arguments to be made for shorter durations,
             | better preservation requirements, etc., the balancing of
             | public good against private value has been the basis of all
             | copyright laws since the Statute of Anne 1709.
             | 
             | In a court case you can get access to all types of
             | information as part of discovery. If you are harmed, or
             | believe you have been, there are other avenues available
             | to you. If you have standing to sue and the discovery
             | requests are made by a competent lawyer, you can get access
             | to anything from internal communications to trade secrets
             | to any other document supporting your claim. You and your
             | lawyer cannot use such information for economic benefit or
             | disclose it; it is still protected.
             | 
             | Given that you have legal options to get this data, there
             | is no public need, arising from real or potential harm,
             | that trumps private property rights and justifies blanket
             | access by default.
             | 
             | PS: note software is not just copyrighted, it is also
             | covered by patents (20 years) and trade secrets (no
             | expiry). Also, while the law provides protection, it does
             | not require disclosure on expiry.
        
               | Y_Y wrote:
               | If it were enough that government data were available via
               | discovery then we wouldn't need FOIA laws in the first
               | place.
               | 
               | Patents aren't relevant here since they are disclosed
               | upon granting and cover the design rather than the
               | implementation, for trade secrets the situation is more
               | complicated ( https://www.americanbar.org/groups/litigati
               | on/resources/news... ).
        
         | contravariant wrote:
         | In _theory_ the decision to put those biases in the code should
         | be public information. You can ask for the criteria the
         | software was made to, just not the software itself.
         | 
         | Though rulings like this might have a chilling effect.
        
           | qingcharles wrote:
           | Only if they are written down. For instance, DOGE makes sure
           | everything is done by voice so there is nothing to catch them
           | out on in future. I've found that once you start hitting a
           | public body with FOIAs regularly they learn to stop putting
           | incriminating things down in writing.
        
       | lucb1e wrote:
       | I got to about 1/3rd of the way before I noticed my eyes were
       | kinda struggling to read the article. Toggling different CSS
       | rules, it's the #333 gray color. Turning that off is instantly
       | better. The custom font is much thinner than the default, but
       | that by itself doesn't seem to be the issue if the color is
       | (closer to) black. (There is also a font-weight rule, but
       | toggling it makes no visual difference in Firefox. Maybe the text
       | is intended to look different?)
       | 
       | Since there is no contact method on the website, figured I'd
       | mention it in a comment; hope this helps
        
       | lubujackson wrote:
       | Juxtapose this legal process with DOGE hoovering (in more ways
       | than one) data willy-nilly from everywhere. The dissonance
       | between THIS uninteresting DB schema being so rigorously
       | protected while massive amounts of sensitive data is completely
       | misappropriated is painful.
        
       | alexashka wrote:
       | Wowzers, that was _a lot_ of words to express something that's
       | very simple.
       | 
       | A database schema is just an empty form. By looking at an empty
       | form, you know what fields _have_ to be filled in, what type of
       | information they'll contain, etc.
       | 
       | _Of course_ people making data requests need to know what forms
       | are being used to collect and store information.
       | 
       | As for security - not letting people do anything because 'it
       | might be dangerous' is bonkers. The way to secure databases has
       | been known for decades. Let's start living in the 21st century :)
        
         | tptacek wrote:
         | The whole back half of the post is about why the analysis is
         | not as simple as you suppose it is. We had no trouble
         | establishing at Chancery Court that schemas don't endanger
         | security. That's not why the case failed at the Illinois
         | Supreme Court. The IL Supremes did not decide spontaneously
         | that schemas actually are dangerous.
        
       | abfan1127 wrote:
       | Am I the only one disappointed there's no mention of little Bobby
       | Tables?
        
       | ajkjk wrote:
       | This was fine, legally, but I'd be pretty irritated if someone I
       | knew wasted everyone's time on this. The schema clearly _is_
       | (marginally) useful for hacking, but who cares; it clearly is a
       | file layout also, but who cares; those matter legally but not
       | morally. Morally, this is just dumb: it's not something they
       | really needed, and they're just irritating people and wasting
       | resources for the fun of it. Shameful.
        
         | jbritton wrote:
         | I think a file layout describes the exact arrangement of bytes
         | in a file. A schema is higher level. It describes what is
         | stored, not how it is stored. A database could be one file, or
         | a file per table, or a file per column. Data could be stored
         | across multiple drives.
        
         | tptacek wrote:
         | No. I'm involved in local government, and on the citizens
         | commission where we keep track of how our municipality
         | (adjacent to Chicago) stores and manages information. I'm
         | acutely familiar with how people are spending their time in
         | these organizations, and what is and isn't a big lift for them.
         | 
         | Increasingly, year over year, more and more information that
         | would previously have been stored in filing cabinets or shared
         | drives is moving into turnkey applications that municipalities
         | buy and enroll all their data in. Those applications are
         | opaque. But almost all of them are front-ends to SQL databases.
         | 
         | Being able to recover schemas from publicly operated databases
         | is vital to keeping public records and data public, rather than
         | de-facto hidden from inquiry.
         | 
         | Matt's suit was anything but a waste of people's time.
         | Hopefully, it'll result in a change to our state law.
        
         | zonkerdonker wrote:
         | See here: https://news.ycombinator.com/item?id=43176625
         | 
         | FOIA requester responded in comments saying they received a tip
         | indicating illegal practices, and noted in his article that he
         | had previously uncovered evidence of over-policing in black
         | neighborhoods.
        
         | hot_gril wrote:
         | Just because the article gets into fine details doesn't mean
         | it's silly. They're working with what they have.
         | 
         | But after reading more, I agree. The point of FOIA in the first
         | place was "access by all persons to public records promotes the
         | transparency and accountability of public bodies at all levels
         | of government." Not "pushing FOIA statutes to their limits,
         | sniffing out buried data and bulk-extracting it with clever
         | requests."
         | 
         | If he's just asking for his own parking ticket records, ok.
         | This isn't in the spirit of that. Separately, I agree that the
         | SQL schema is software, a type of file layout, marginal
         | attacker benefit, and other things in that exemption, and I'd
         | say that again as an expert witness.
        
       | Terr_ wrote:
       | > Each spreadsheet has a header row, labeling the columns, like
       | "price" and "quantity" and "name". A database schema is simply
       | the names of all the tabs, and each of those header rows.
       | 
       | This is also how I explain it to my relatives, I'm kind of
       | surprised this analogy (one so direct that it's almost literal)
       | didn't fly with the judges.
       | 
       | If database column names cannot be revealed, then shouldn't that
       | mean the state is also able to redact the headers of all their
       | spreadsheets?
        
         | kmoser wrote:
         | Knowing a spreadsheet header doesn't help an attacker gain
         | access to that spreadsheet in any way. Knowing SQL column names
         | may give an attacker an advantage in accessing a database.
        
           | Terr_ wrote:
           | Compare: "Knowing the writing style of current employees may
           | give an attacker an advantage while phishing, therefore, we
           | cannot turn over any memos or emails whatsoever."
           | 
           | Ditto for the org-chart.
        
           | flutas wrote:
           | Per the post, this also wouldn't fly.
           | 
           | > Believe it or not, there's case law on "would" versus
           | "could" with respect to safety. "Could" means you could
           | imagine something happening. But the legal standard for
           | "would" is "clear evidence of harm leaving no reasonable
           | doubt to the judge". The statute set the bar for me very low
           | and I managed to clear it.
        
             | Terr_ wrote:
             | Reminds me of Shall versus May in RFCs. (Though those are,
             | of course, statements of obligation rather than natural
             | consequence.)
        
         | butlike wrote:
         | It's a reverse vlookup
        
       | lq9AJ8yrfs wrote:
       | In the new language proposed in SB0226 (as linked, didn't search
       | for authoritative sources, can't tell how durable that link will
       | be for posterity, arrgh archiving the web is hard etc), doesn't
       | that language leave open a hole for excessive complexity to be a
       | reservoir for FOIA resistance?
       | 
       | Feels like there is an important theme here that SB0226 is
       | dancing around --could government be legible in addition to being
       | "plain-text" transparent?
       | 
       | "plain-text description" of "each field of each database of the
       | public body" and "specific database queries" may not do what you
       | mean.
       | 
       | Not sure how to fix it though.
       | 
       | I could see gratuitous ORMs and database-of-databases patterns
       | winning tax dollars with taunt-them-with-the-schema listed as a
       | feature.
        
       | indymike wrote:
       | There is no freedom of information if the public is not allowed to
       | know what data the government has.
        
       | b8 wrote:
       | Got to see this happen day by day on the Midwest Venture Partners
       | Slack. There was another lawsuit Chapman and Tom did for laser
       | based speed detection in Chicago.
        
       | djeastm wrote:
       | I suppose I need to change all my column names to random
       | 16-character strings so I don't leave my database insecure!
        
       | kingforaday wrote:
       | Given the Illinois Supremes' decision, seems like an opportune
       | time to say "Everything is a file".
       | 
       | 1. https://en.m.wikipedia.org/wiki/Everything_is_a_file
        
       | dylan604 wrote:
       | "Retrieve the data of every parking ticket issued to 'Bob O' and
       | also all the rest of the information in the database including
       | everyone's passwords."
       | 
       | This is the example of SQL Injection written in plain English,
       | yet "everyone's" is problematic here in that it's an orphaned
       | single quote. If "Bob O'Conner" is bad, so is "everyone's"
        
       | irrational wrote:
       | > I'll conclude this long piece by saying (1) obviously the bill
       | should pass, and (2) it should be called "The Chapman Act".
       | 
       | (3) I imagine Chicago greatly regrets towing Matt Chapman "over a
       | facially bogus ticket".
        
       | DangitBobby wrote:
       | When a law is ambiguous by wording, why do they never ask the
       | people who drafted the law what was intended?
        
         | tptacek wrote:
         | The current sitting ILGA is not the ILGA that passed the
         | statute.
        
           | DangitBobby wrote:
           | They are probably still alive, shouldn't be that hard to
           | find. They have no problem giving subpoenas to other
           | witnesses or soliciting expert testimony.
        
         | jaza wrote:
         | That would be against the separation of powers doctrine
         | inherent in all Western democracies. The job of the legislature
         | is to write the law. The job of the judiciary is to interpret
         | the law.
         | 
         | Besides, when the law is ambiguous, it's very often because the
         | legislature themselves weren't sure what they intended, and/or
         | because the legislature had deeply divided views and arrived at
         | ambiguous wording as a compromise, and/or because the
         | legislature used their "somebody else's problem" prerogative
         | i.e. they said "let's leave that for the courts to decide".
         | Ambiguously worded laws aren't a bug, they're a feature!
        
           | DangitBobby wrote:
           | I don't see how it could break separation of powers,
           | especially if a legislator could provide minutes and/or a
           | paper trail of discussions and revisions pointing the intent
           | in a certain direction. You know, like evidence. The
           | legislature surely has intent while writing the law,
           | otherwise what would be the point in trying to interpret it,
           | and the whole thing being litigated is the authors' intent. I
           | don't think the separation of powers doctrine presupposes
           | that the legislature has no idea what their goals are while
           | writing laws, that would be quite an insane assumption to
           | bake into our system, and broken by design. And in this case,
           | I very much doubt it was left intentionally ambiguous, since
           | FOIA was clearly intended to help people get information from
           | obstinate government agencies. What would even be the point
           | in writing the law if obstinate government agencies are
           | supposed to be able to weasel around the ambiguity behind a
           | comma? Regardless, if we are able to ask the people who spent
           | time drafting it, we could ask. There might even be a paper
           | trail!
        
             | karaterobot wrote:
             | First, even if we kicked all questions back to the authors,
             | there would need to be a process for interpreting
             | legislation when the original authors are not available--
             | perhaps they died 200 years ago, for example. So, a
             | judicial group is necessary for interpreting the law in any
             | case.
             | 
             | Second, interpreting the law is a full-time job, and our
             | legislators are already not doing the jobs they have. If we
             | asked them to not do a second job on top of the one they're
             | already in dereliction of, nothing would get done. Well,
             | twice as much nothing would get done, if that makes sense.
             | So, you need someone with the full-time job of interpreting
             | the law, and it might as well be that judicial branch.
             | 
             | Third, legislation is often a compromise. If you go back
             | and try to interpret what it was that people said during
             | the debate which produced the language at issue, you'd have
             | to decide whose voices carried the most weight, which seems
             | very, very tricky. It's why legislation is written down in
             | the first place: a single source of truth. Even if
             | ambiguous, it's still less ambiguous than a transcript of
             | extemporaneous debate.
        
       | inetknght wrote:
       | > _You also generally can't FOIA the source code of programs
       | they run._
       | 
       | Alas, that part should be illegal under FOIA.
       | 
       | Source code should be _open source_ and _verifiable_. Being
       | exempt from FOIA circumvents public confidence in the
       | government's use of software.
       | 
       | I'd be curious to learn if/where courts have decided such things
       | already.
        
         | jaza wrote:
         | I assume that - even though there's a strong public interest
         | argument for it - government orgs are prone to blanket banning
         | the release of source code, for the same primary reason that
         | businesses are prone to doing so. That is, too high a chance of
         | sensitive data (passwords, tokens, IP addresses, etc) being
         | hard-coded in all-too-often non-12-factor-aspiring code; and
         | too much security / liability headache if said sensitive data
         | gets out.
         | 
         | There's probably also some actual business logic that
         | government orgs want to and are legally permitted to keep
         | secret. In the OP's case of a parking ticket database, maybe
         | there's software talking to that database, whose source code
         | includes the logic of picking when / where parking inspectors
         | should conduct a "random" blitz of issuing fines.
        
           | inetknght wrote:
           | > _maybe there's software talking to that database, whose
           | source code includes the logic of picking when / where
           | parking inspectors should conduct a "random" blitz of issuing
           | fines._
           | 
           | Oh yes, and that "random" blitz of issuing fines _definitely_
           | doesn't have any racist part to its algorithm. Just trust
           | the government on that one. The government and the "business"
           | what wrote the code in the first place. Yup, makes sense.
        
       | gervwyk wrote:
       | Should have used mongodb in the first place.
        
         | qbxk wrote:
         | lol'd so hard at this
        
       | gunian wrote:
       | sql injection court seems more fun than slave court where they
       | tell you spending anything above 5 is a crime lmaooooo
        
       | neilv wrote:
       | > _[...] where the only way to get at the underlying data is to
       | FOIA a database query._
       | 
       | Can you request the desired information using natural language,
       | based on your guesses of what information they store?
        
         | tptacek wrote:
         | Probably not, because then you'd be asking them to go do
         | research. You FOIA for specific documents and records.
        
           | neilv wrote:
           | So you can ask for the document that is the inspection report
           | from Mel's Diner on date 11/11/2024?
           | 
           | Can you ask for the database record from dispatching that
           | inspection visit to Mel's Diner on 11/11/2024, even if you
           | don't know the exact database column names and relations?
           | 
           | If you can ask for that one dispatch database record, without
           | knowing the schema, can you ask for the database records for
           | _all_ inspection visits to all locations in Smallville in
           | 2024? (Or does the complexity of that database query
           | constitute "research"?)
        
             | tptacek wrote:
             | You can ask for "the inspection report from Mel's Diner on
             | date 11/11/2024". You can also ask for "every inspection
             | report ever done on Mel's Diner".
             | 
             | You can _suggest_ that they retrieve the inspection report
             | from their database. This can be useful if staff wouldn't
             | know where to find the document you're looking for. The
             | FOIA clerk will hand the request off to IT, and if it's
             | sensible, they'll probably try it.
             | 
             | You probably (these are all humans, so no definites) can't
             | literally use public body staff as a proxy to a database
             | shell; for instance, they're not going to let you do a lot
             | of interactive stuff. You're either going to produce the
             | data that you're looking for --- which you'll need to
             | describe in prose --- or get nothing.
        
               | neilv wrote:
               | Thanks. I might be misinterpreting, but maybe the
               | standard for what's "research" involves _not_ presuming
               | that a given request can be satisfied by a database
               | query?
               | 
               | So if you ask for documents in a way that sounds more
               | _complex_ than "every inspection report ever done on
               | Mel's Diner", which is something they might be able to
               | satisfy reasonably using a paper filing system or by
               | eyeballing rows on a screen, then the request could be
               | denied as "research"?
               | 
               | So is what the petitioner in this case was looking for
               | a database schema that would let them say,
               | "respectfully, I think it's not research; I think it can
               | be satisfied with the following exact SQL query"?
        
               | tptacek wrote:
               | Exactly.
               | 
               | The public body could then say "no, we're not going to
               | run that query or dig into that database". But then you
               | can take them to court, and you'll probably win if the
               | query is reasonable and simple enough.
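               | 
               | As a rough illustration of why the schema matters: once
               | table and column names are known, a request like "all
               | inspection visits in Smallville in 2024" reduces to one
               | short query. A sketch in Python with SQLite; the
               | `inspections` table, its columns, and the sample rows
               | are all invented for illustration:

```python
import sqlite3

# Hypothetical schema and rows; a real public body's database will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inspections (
    id INTEGER PRIMARY KEY,
    location TEXT,
    city TEXT,
    visited_on TEXT  -- ISO 8601 date, e.g. 2024-11-11
);
INSERT INTO inspections (location, city, visited_on) VALUES
    ('Mel''s Diner', 'Smallville', '2024-11-11'),
    ('Mel''s Diner', 'Smallville', '2023-05-02'),
    ('Other Cafe', 'Metropolis', '2024-03-09');
""")

# The kind of exact query a requester could propose once the column
# names are known: every inspection visit in one city during 2024.
query = """
SELECT location, visited_on
FROM inspections
WHERE city = ?
  AND visited_on BETWEEN '2024-01-01' AND '2024-12-31'
ORDER BY visited_on
"""
visits = conn.execute(query, ("Smallville",)).fetchall()
print(visits)
```

               | Without the column names, the requester can only
               | describe this in prose and hope it isn't read as
               | "research".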
        
       | rafram wrote:
       | How were you able to stand as an expert witness when you have a
       | personal relationship with the plaintiff? I don't know the
       | specifics of the law in Illinois, but my understanding is that
       | that would generally be a disqualifying conflict of interest.
        
         | hondo77 wrote:
         | I have this cousin, Vinny, who's a lawyer, and he was able to
         | use his girlfriend as an expert witness. Both sides agreed she
         | really knows her stuff because that's what really matters.
        
       | Jean-Papoulos wrote:
       | I understand freedom of information, but what exactly does the
       | public gain by Matt getting the database schema ?
       | 
       | If the answer is "the ability to request data from a specific
       | table/column", I would say that this should be possible to do by
       | asking for the relevant data directly (instead of asking for "the
       | timestamps of each ticket" ask for the "time-related data of each
       | ticket" for example) ?
       | 
       | And yes, having your db schema out in the wild can be a vector of
       | attack, if only because it allows targeting the sql injections
       | (the blog author himself argues this in court).
       | 
       | The court was right to reject this. Maybe the exact word of the
       | law doesn't ask for it, but the spirit certainly does.
        
         | tptacek wrote:
         | The blog author argued no such thing, because that is not true.
        
         | gizmo wrote:
         | Municipalities obstinately refuse reasonable requests because
         | they resent that the Freedom of Information Act allows regular
         | civilians to get all up in their business. The excuses they
         | make for noncompliance (it's burdensome! it violates privacy!
         | sql injection!) are not serious. They don't want to comply
         | because they don't like accountability. That's it.
        
       | makach wrote:
       | Does disclosure of a database schema really jeopardize the
       | security of the system? _Yes_
       | 
       | How plausible or likely does that jeopardy need to be? _Very_
       | 
       | Does a database schema constitute "source code"? _Yes_
       | 
       | Is a SQL schema a "file format"? _No & yes. In that order._
       | 
       | And, finally, does the "would jeopardize" language apply to
       | everything in the exemption, or just to the nearest noun "any
       | other information"? _Yes_
        
       | scotty79 wrote:
       | > Does the "would jeopardize" language in the statute apply to
       | everything in the exemption, or just to the nearest noun "any
       | other information"?
       | 
       | I think law and lawmaking would be vastly improved if only
       | lawyers learned the miracle of parentheses.
        
         | Ylpertnodi wrote:
         | Comma's can be expensive, too.
        
       | ngriffiths wrote:
       | > Congratulations! You now understand databases.
       | 
       | Data engineering: doing a lot of fancy work to make a very simple
       | product
        
       | rubymancer wrote:
       | It's Matt Chapman! https://mchap.io/
       | 
       | I helped him process and visualize the original batch of parking
       | ticket data waaaay back in 2016.
       | 
       | I can't believe he's still on this in 2025. We need more junkyard
       | dogs like him fighting for what's right.
        
       | thayne wrote:
       | I'm confused why file layout is included in the list of
       | exceptions in the first place. If an adversary knowing your _file
       | format_ is a security problem, then you are doing something very
       | wrong!
       | 
       | And the ruling that the condition only applies to "other
       | information" (which to me seems like a very strange reading, and
       | probably not the intent of the law), regardless of whether a SQL
       | schema is considered a "file layout", creates a massive loophole,
       | where the government can just use some obtuse custom file layout
       | to avoid FOIA requests.
        
       | gavin_gee wrote:
       | https://x.com/JackRhysider/status/1885732851779285184
        
       ___________________________________________________________________
       (page generated 2025-02-26 23:01 UTC)