[HN Gopher] IETF: The HTTP Query Method ( Draft)
___________________________________________________________________
IETF: The HTTP Query Method ( Draft)
Author : ksec
Score : 139 points
Date : 2022-01-04 14:05 UTC (8 hours ago)
(HTM) web link (www.ietf.org)
(TXT) w3m dump (www.ietf.org)
| intrasight wrote:
| I for one welcome the addition. And am sort of surprised that it
| wasn't made a verb earlier.
| JanLikar wrote:
| If we need GET requests with bodies, why not simply add a body to
| GET?
| mrcarruthers wrote:
| I'm fairly certain that (technically) nothing's stopping you
| from doing so. However, there are so many
| libraries/clients/etc... that do not allow it that it would be
| almost impossible to patch them all. Adding a new method and
| having libraries add it and support it properly would be
| better.
| throw_m239339 wrote:
| cause HTTP agents are allowed to ignore the body of a get
| request, per spec.
| dragonwriter wrote:
| Because then you don't know if something that supports GET
| supports the new broader definition or the old definition,
| whereas whether it does or does not support QUERY is more clear.
|
| Also because a different method means OPTIONS tells you
| information about what is supported, while overloading GET
| would not.
|
| And because "same guarantees" doesn't mean "means the same
| thing"; PUT and DELETE have the same guarantees (idempotent but
| not safe), but we don't use PUT with no body for DELETE.
| hodgesrm wrote:
| It seems as if the idea behind this proposal is to help out
| database folks. If so, that is misguided. POST is a better
| implementation than QUERY (or GET) at least for SQL databases.
| Here's why.
|
| In SQL this is a query:
|
|     SELECT a, b, c FROM foo LIMIT 1
|
| But this is also a "query" in many if not most connectivity APIs:
|
|     INSERT INTO foo VALUES (1, 2, 3)
|
| Most client libraries _don't know_ and _don't care_ about the
| content of the query. It's the database's job to parse it and
| do the right thing. The difference between the above queries is
| that the first one returns a result set and the second returns an
| update count. Here's a simple example using Python and the
| clickhouse-driver library.
|
|     # An UPDATE to the database
|     client.execute('INSERT INTO iris SELECT * FROM another_iris_table')
|     # A harmless "query"
|     result = client.execute('SELECT COUNT(*) FROM iris')
|     print(result)
|
| For this to work you need to use something underneath that is
| generic and works regardless of output. POST does this already.
| The clickhouse-driver does not use the HTTP protocol, though other
| ClickHouse drivers do. I'm just using it as an example of why you
| need a protocol that can handle any type of SQL "query" the same
| way on the wire. Otherwise the client will have to have a SQL
| parser to figure out which one to use. (Some clients actually do
| that but they are a very small minority.)
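To make the point above concrete: a client that wanted to send read-only SQL with QUERY and everything else with POST would need some classification step before choosing the method. The following is only a sketch under stated assumptions. `choose_method` and `READ_ONLY_KEYWORDS` are hypothetical names, and the first-keyword check is deliberately naive: it is not a SQL parser and would misclassify, for example, a SELECT that calls a side-effecting function.

```python
# Hypothetical, deliberately naive classifier: NOT a real SQL parser.
READ_ONLY_KEYWORDS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


def choose_method(sql: str) -> str:
    """Return 'QUERY' for statements that look read-only, else 'POST'."""
    stripped = sql.lstrip()
    # Take the first whitespace-delimited word, if there is one.
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    return "QUERY" if first_word in READ_ONLY_KEYWORDS else "POST"


print(choose_method("SELECT COUNT(*) FROM iris"))
print(choose_method("INSERT INTO iris SELECT * FROM another_iris_table"))
```

A generic driver that skips this step has to pick the method that is valid for every statement, which is the argument being made for POST here.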
| wojcikstefan wrote:
| IMO this helps a lot more people than just database folks. Any
| web application which implements a fairly granular
| search/filtering mechanism for its resources may run into the
| URL character limit with GETs. QUERY sounds much more
| appropriate here than what most applications do today (i.e.
| abuse POSTs).
| hodgesrm wrote:
| In that case it's not helpful to tie it to SQL. As my
| examples demonstrated, it's pretty useless for SQL database
| connectivity. If you are looking for a general query mechanism it
| would make more sense to have something that looks like GET
| with a body.
|
| ClickHouse also supports GET as a verb. In addition to URL
| length issues the query needs to be URL-encoded which makes
| it difficult to read and debug.
|
| p.s., It's interesting to see my post downvoted. It's more
| productive to show why it's wrong. I've worked on DBMS
| connectivity for over 30 years.
| [deleted]
| brightstep wrote:
| The examples in section 4 are just that, examples. They are not
| intended to be the only format a query may take. The issue with
| POST that QUERY solves is lack of an idempotency constraint.
| axiosgunnar wrote:
| Why not just allow bodies (that can be arbitrarily large) with
| GET requests?
| politelemon wrote:
| > Implementations are free to use any format they wish on both
| the request and response.
|
| The samples should include some non-SQL, completely made up ones,
| as I think a lot of people are going to fixate on the SQL-like
| syntax and its associated problems.
| dschep wrote:
| it does https://www.ietf.org/archive/id/draft-ietf-httpbis-
| safe-meth...
| danidiaz wrote:
| > QUERY requests are both safe and idempotent with regards to the
| resource identified by the request URI. That is, QUERY requests
| do not alter the state of the targeted resource. However, while
| processing a QUERY request, a server can be expected to allocate
| computing and memory resources _or even create additional HTTP
| resources through which the response can be retrieved_.
|
| The possible creation of extra HTTP resources (response
| resources?) seems to me contrary to idempotency. That seems more
| like the territory of POST.
|
| If two identical QUERY requests might produce different response
| resources, how to square that with the fact that QUERY will
| be cacheable?
| twic wrote:
| > The possible creation of extra HTTP resources (response
| resources?) seems to me contrary to idempotency.
|
| If two repetitions of a QUERY request create the _same_ extra
| HTTP resource(s), then it can be idempotent.
|
| Idempotent means you can't tell the difference between 1 or N
| requests, not that you can't tell the difference between 0 and
| 1. Think about PUT, which is also idempotent.
| jasonhansel wrote:
| Still, it is odd that a GET-like method is allowed to have
| side effects.
| jpitz wrote:
| GET is allowed to have side effects, just not beyond the
| first invocation of a given request.
| twic wrote:
| I know what you mean. It feels like we're missing an idea
| of scope for resources. If there was some kind of
| transaction scope or session scope or something, then a
| QUERY could create resources within that scope, so we could
| know that in the long run, it has no side effects. But that
| would be antithetical to the idea of statelessness perhaps.
|
| Or maybe we just need distributed garbage collection for
| URLs.
| wizzwizz4 wrote:
| > _Or maybe we just need distributed garbage collection
| for URLs._
|
| Printing a URL on paper leaks it. Writing a URL down, or
| memorising it, _also_ leaks it - but the computer has no
| way of knowing this has happened.
|
| I don't think this is feasible.
| dragonwriter wrote:
| GET is (and all methods, including all safe and idempotent
| methods, are) allowed to have side effects, per the spec.
| Safe and idempotent are _not_ mathematical constructs as
| defined in HTTP, they are more "business" constructs.
| scheme271 wrote:
| But pretty much every http request has side effects if you
| consider logging.
| nybble41 wrote:
| The way I look at it is that the system must continue to
| meet its requirements (whatever they might be) whether it
| gets one GET request or many in response to a single
| action within the user agent (clicking a link, submitting
| a form, script making a request, etc.). In general,
| logging two requests instead of one does not violate any
| requirements and in fact logging _every_ request, even
| duplicates, is the expected behavior. Adding the same
| item to a list twice in response to a single UI
| interaction, on the other hand, would not give the
| desired effect.
| denton-scratch wrote:
| > idempotent with regards to the resource identified by the
| request URI
|
| That means that a QUERY request _can_ change the state of the
| server, for example by creating new resources; there's exactly
| one resource it's not allowed to change.
|
| If I've read it right.
| unilynx wrote:
| That has always been the case ... requests get logged, and if
| the server exposes its access logs over HTTP, that's one
| thing for which a request won't be idempotent
|
| Idempotent etc in the HTTP specs has always been more or less
| an attempt at a promise to the client "you should be able to
| repeat this request if you're not sure about success/failure
| without anyone claiming to implement HTTP being able to throw
| the book at you".
|
| Just like GETs shouldn't have side effects. But in practice
| of course, things like
| https://thedailywtf.com/articles/the_spider_of_doom happen
| rhdunn wrote:
| A resource is defined by a path, so if you have a `QUERY
| /documents` or `QUERY /albums` endpoint, the resource is all
| documents or albums that you are searching across, so it
| cannot add one of those items (like `POST /album`). It is
| possible that this could affect some other resource (e.g. an
| audit trail), which would mean that a `QUERY /logs/audit`
| endpoint must not add an audit log entry per the idempotent
| requirement.
| dfox wrote:
| As I read it I think that the idea there is to allow usage of a
| pattern where the resulting resource refers to other resources
| that somehow encode the contents of the QUERY request body in
| their URL (or even results in a redirect to such a resource). For
| example, the result of QUERY is a page with an HTML table of the
| data which also includes a server-side rendered chart of the same
| data as an external image.
|
| [Edit: the usage of returning a redirect to a URL that somehow
| encodes the query is even given as an example in section 4.2]
| anticristi wrote:
| I was wondering the same, but this seems to clarify that:
|
| > The cache key for a query (see Section 2 of [HTTP-CACHING])
| MUST incorporate the request content.
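The quoted requirement can be illustrated with a small sketch of a cache key that covers the request body as well as the URI. Everything here is an assumption made for illustration: the function name `query_cache_key`, the choice of SHA-256, and the decision to include the content type are not taken from the draft.

```python
import hashlib


def query_cache_key(method: str, uri: str, content_type: str, body: bytes) -> str:
    """Derive a cache key from the method, URI, content type and body."""
    digest = hashlib.sha256()
    parts = (
        method.upper().encode("utf-8"),
        uri.encode("utf-8"),
        content_type.lower().encode("utf-8"),
        body,
    )
    for part in parts:
        # Length-prefix each part so that ("ab", "c") and ("a", "bc")
        # cannot collide.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


key_a = query_cache_key("QUERY", "/contacts", "text/plain", b"select surname")
key_b = query_cache_key("QUERY", "/contacts", "text/plain", b"select givenname")
print(key_a != key_b)  # prints: True
```

Two QUERY requests to the same URI with different bodies therefore land in different cache entries, which is what keeps a cached response from being served for the wrong query.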
| dragonwriter wrote:
| > The possible creation of extra HTTP resources (response
| resources?) seems to me contrary to idempotency.
|
| A GET request might create additional (or modify existing)
| resources, say if the API exposed its own log via HTTP.
|
| Both safe and idempotent are less expansive than one might
| naively think in the HTTP spec (which is good, because the
| naive understanding, while aesthetically seductive, isn't very
| practical at all.) Some quotes from the relevant bits of RFC
| 7231:
|
| "This definition of safe methods does not prevent an
| implementation from including behavior that is potentially
| harmful, that is not entirely read-only, or that causes side
| effects while invoking a safe method. What is important,
| however, is that the client did not request that additional
| behavior and cannot be held accountable for it."
|
| "The purpose of distinguishing between safe and unsafe methods
| is to allow automated retrieval processes (spiders) and cache
| performance optimization (pre-fetching) to work without fear of
| causing harm. In addition, it allows a user agent to apply
| appropriate constraints on the automated use of unsafe methods
| when processing potentially untrusted content."
|
| "Like the definition of safe, the idempotent property only
| applies to what has been requested by the user; a server is
| free to log each request separately, retain a revision control
| history, or implement other non-idempotent side effects for
| each idempotent request."
|
| "Idempotent methods are distinguished because the request can
| be repeated automatically if a communication failure occurs
| before the client is able to read the server's response."
| brainwipe wrote:
| In HTTP idempotent is where the state of the server remains
| unchanged. A resource in HTTP is what you get back from a URL,
| so https://example.com/people/show/10 (where 10 is the page
| number) is a different resource to
| https://example.com/people/show/100.
|
| If I interpret it correctly - adding more resources isn't
| changing the server state but adding more ways of getting to
| the state.
| danidiaz wrote:
| So that means that, without a cache, repeating a QUERY might
| create two response resources but, with a cache, only one
| will be created. I find that odd. My understanding of HTTP
| idempotency is that it's more of a "whole-server" concept
| (excepting perhaps things like creation of log entries and
| metrics). Always creating a new resource for each request
| seems contrary to that.
|
| A way to square creation of response resources with
| idempotency could be: the second identical QUERY that arrives
| should _always_ reuse the result resource created by the
| first QUERY.
| CrazyStat wrote:
| What if the underlying data has changed?
|
| If I QUERY the current price of a stock, and then someone
| else sends an identical QUERY ten seconds later, they might
| get a different result. This is not because QUERY isn't
| idempotent.
| danidiaz wrote:
| I think that, when talking about idempotency, there's the
| implicit assumption that the "rest of the world" stays
| the same while the sequence of operations is performed.
|
| rfc2616 says:
|
| > Methods can also have the property of "idempotence" in
| that (aside from error or expiration issues) the side-
| effects of N > 0 identical requests is the same as for a
| single request.
|
| https://datatracker.ietf.org/doc/html/rfc2616#page-51
|
| Perhaps changes in the underlying data could be
| considered "expiration issues". Otherwise not even GET
| could be considered idempotent in many cases.
| dragonwriter wrote:
| > rfc2616 says:
|
| "Obsoleted by: 7230, 7231, 7232, 7233, 7234, 7235"
| samhw wrote:
| I think this is pointing to the problem with your
| definition of 'idempotent'. Idempotency simply means that
| any number of additional identical requests will have
| _the same effect_ on the state of the resource, not that
| they will have _no effect_. (And by 'have the same
| effect', we mean 'produce the same state', not 'alter
| state in the same way' - effects are algebraic
| projections.)
|
| That's why it's called idempotent - 'doing the same' -
| rather than impotent.
| CrazyStat wrote:
| Idempotency is not about "you get the same result", it's
| about the effects of your http request _on the server_.
| Notice that the definition you quoted is in terms of
| side-effects, not results.
|
| If a request changes the state of the server and another
| identical request changes the state of the server in a
| different way, it's not idempotent.
|
| If a request doesn't change the state of the server at
| all it is idempotent, even if subsequent requests might
| get different responses (e.g. the stock quote example in
| my previous post).
|
| If a request changes the state of the server but repeated
| identical requests don't have any different effect it is
| also idempotent. For example, DELETE is idempotent
| because DELETE-ing something N times is the same as
| deleting it one time.
| CrazyStat wrote:
| > In HTTP idempotent is where the state of the server remains
| unchanged.
|
| This is incorrect. DELETE is idempotent but changes the state
| of the server.
| dragonwriter wrote:
| Also PUT.
| dragonwriter wrote:
| > In HTTP idempotent is where the state of the server remains
| unchanged
|
| No, it's not. That's closer to "safe" than "idempotent" (safe
| also implies idempotent, but not the other way around), but
| even then it is not quite right, because even safe methods
| are allowed to have side effects, but there is guidance about
| the _kind_ and _impact_ of side effects that they shouldn't
| have.
| marcosdumay wrote:
| Hum... You are complaining about a request having the side
| effect that a server may fork another process to answer it?
| There's not really much anybody can do about this.
| davidhariri wrote:
| I'm in favour of a QUERY verb. Always felt wrong to use GET or
| POST for it.
| jasonhansel wrote:
| > caches SHOULD first normalize request content to remove
| semantically insignificant differences, thereby improving cache
| efficiency
|
| This feels like a bad idea, since (a) different caches will
| support different content types and normalize them in different
| ways, leading to unexpected changes in behavior and (b) some
| servers may behave differently depending on something that a
| cache considers to be a "semantically insignificant" distinction.
| I'm not sure, in other words, if I trust caches to get this
| right.
|
| It seems like it might be better to require clients to submit
| requests in a "pre-canonicalized" form, or to have caches allow
| this behavior but disable it by default.
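Both the quoted recommendation and the objection to it can be seen in a small sketch of what "normalizing" a JSON body might look like. This is an assumption about one possible normalization (`normalize_json_body` is a hypothetical helper), not what the draft or any particular cache does.

```python
import json


def normalize_json_body(body: bytes) -> bytes:
    """Re-serialize a JSON body with sorted keys and no extra whitespace.

    Bodies that do not parse as JSON are returned unchanged.
    """
    try:
        parsed = json.loads(body)
    except ValueError:  # covers JSONDecodeError and invalid UTF-8
        return body
    return json.dumps(parsed, sort_keys=True, separators=(",", ":")).encode("utf-8")


first = b'{"limit": 100, "a": "x"}'
second = b'{"a":"x",\n "limit":100}'
print(normalize_json_body(first) == normalize_json_body(second))  # prints: True
```

The two bodies now share one cache entry. The concern raised above is visible too: key order, whitespace, and duplicate keys are erased, so a server that treats any of those as significant would see its responses conflated by a cache that does this.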
| mjevans wrote:
| This is what I would like to see as a logical conclusion that
| is future proof:
|
| The query and response MUST be transmitted and stored 'as is'
| (a sequence of octets).
|
| The query and response SHOULD be encoded in either UTF-8 or
| WTF-8 https://simonsapin.github.io/wtf-8/
|
| Future standards or non-standard systems MAY use different
| encodings; conformant implementations MUST NOT alter the
| sequence of bytes. They MAY perform a validation check and add
| additional headers.
| progval wrote:
| From your link:
|
| > WTF-8 _must not_ be used to represent text in a file format
| or for transmission over the Internet.
| amdelamar wrote:
| A year ago I had a coworker build a CRUD app with an API that
| required GET with a body. Something simple like:
|
|     GET /api/query
|     Host: example.org
|     Content-Type: application/json
|
|     {
|       "a": "valueWith$pecialChars",
|       "b": "valueWith$pecialChars",
|       "limit": 100
|     }
|
| It worked fine for him because he used curl, which allows GET
| with a body. But I was using Paw (similar to Postman) which
| refused to send it. I mentioned the issue to him to which the
| reply was along the lines of "it's a non-issue, just use curl". I
| kid you not, 1 week after this coworker left for another job I
| fixed the service to accept POST requests.
|
| If QUERY was around I'm sure I could've made a stronger case to
| fix it sooner.
| spyspy wrote:
| I've never accepted GET with a body but I've been burned
| several times using curl to test APIs. curl and browsers are
| very different beasts.
| torgard wrote:
| > required GET with a body
|
| Elasticsearch also encourages GET with body. But a request
| payload is undefined, according to the RFC:
|
|     A payload within a GET request message has no defined semantics;
|     sending a payload body on a GET request might cause some
|     existing implementations to reject the request.
| tester34 wrote:
| What is the difference between this and allowing HTTP GET with
| Body?
| soheilpro wrote:
| From RFC 7231:
|
|     A payload within a GET request message has no defined semantics;
|     sending a payload body on a GET request might cause some
|     existing implementations to reject the request.
| tester34 wrote:
| Yea, that's what I meant
|
| What if we allowed HTTP GET Body?
| NovemberWhiskey wrote:
| Right. The HTTP RFCs have been backing off gently from the
| initial position that implementations should not send bodies
| with GETs and that the semantics of the GET request were
| defined purely in the request URI.
|
| But presumably no-one is brave/foolhardy enough actually to
| redefine GET as having a semantic body because a bazillion
| different implementations (clients, servers and middle boxes)
| probably become non-compliant.
| tester34 wrote:
| >redefine GET as having a semantic body because a bazillion
| different implementations (clients, servers and middle
| boxes) probably become non-compliant.
|
| So what actually?
|
| apps that didn't use GET Body, will not care anyway
|
| apps that will use HTTP GET Body will be checked anyway
|
| So, unless somebody downgrades HTTP Server then what could
| be the problem?
| dragonwriter wrote:
| Aside from how much easier it is to identify whether a
| component supports QUERY than which forms of GET it
| supports, GET and QUERY (like PUT and DELETE) have
| similar guarantees but different meanings and are
| sometimes (but not always) useful against the same
| resource for different purposes. OPTIONS lets you tell
| the availability of each if they are different methods,
| but not if one is GET w/o body and the other is GET
| w/body.
| dagss wrote:
| Many, many services (most of the internet?) have the
| backend sitting behind a proxy that would throw away the
| payload before it gets to the "apps".
|
| Granted they need to get support for QUERY too but at
| least it is more explicit then.
|
| An official readonly flag to POST would have been more
| backwards compatible...
| saurik wrote:
| Are they more or less likely to be compliant, though, than
| with a new verb?
| detaro wrote:
| Handling a method or not is a much more obvious thing to
| discover/observe.
| dagss wrote:
| With an official readonly header on POST instead, all middlewares
| and proxies would have automatic support and this could be adopted
| in months instead of years or decades...
| tsimionescu wrote:
| Why would that be easier to implement than a new method? It's
| literally just a string. Many non-standard request methods are
| already supported.
| glenjamin wrote:
| Because unknown headers are already passed through safely in
| existing implementations, whereas unknown methods are handled
| in a variety of different ways
| zestyping wrote:
| There's no way to do client-side caching with this, which seems
| like a fatal omission -- in any given situation where you would
| consider using QUERY, it'll almost always be more efficient to
| put the query in the parameters of a GET request.
| remram wrote:
| Why do you think you can't do client-side caching?
| ferdowsi wrote:
| > The QUERY method provides a solution that spans the gap between
| the use of GET and POST. As with POST, the input to the query
| operation is passed along within the payload of the request
| rather than as part of the request URI. Unlike POST, however, the
| method is explicitly safe and idempotent, allowing functions like
| caching and automatic retries to operate.
|
| Is this really worth a change to every HTTP client library out
| there to support this? The limited applications that really need
| this can easily use POST and document their own semantics around
| this.
|
| If anything the trend with GraphQL is to ignore HTTP verbs
| outright because they are limited and inexpressive beyond simple
| CRUD tasks.
| jonwinstanley wrote:
| Agreed. Seems like a slight improvement over something that was
| decided and settled many years ago.
| cryptonym wrote:
| POST request caching is always tricky and often not allowed.
| It can also improve reliability: you can safely retry such
| request.
| jayd16 wrote:
| It also seems trivial to fall back to POST for backwards
| compatibility, no? I'm not sure it needs every lib to be
| updated before devs can gain value from this.
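The fallback suggested above might be sketched as follows. The transport is injected as a plain callable so the example stays self-contained; `query_with_fallback`, `send`, and the choice of 405 and 501 as "method not supported" signals are assumptions, and real intermediaries may respond differently (for instance with 400, or by dropping the connection).

```python
from typing import Callable, Tuple

# Assumed signals that the server or an intermediary does not know the method.
UNSUPPORTED_STATUSES = {405, 501}


def query_with_fallback(
    send: Callable[[str, str, bytes], int], url: str, body: bytes
) -> Tuple[str, int]:
    """Try QUERY first; retry once with POST if the method is rejected.

    `send(method, url, body)` is a caller-supplied transport returning
    the HTTP status code. Returns the method that was finally used and
    its status.
    """
    status = send("QUERY", url, body)
    if status in UNSUPPORTED_STATUSES:
        return "POST", send("POST", url, body)
    return "QUERY", status


def legacy_server(method: str, url: str, body: bytes) -> int:
    """Stand-in for a server that predates QUERY."""
    return 200 if method == "POST" else 405


print(query_with_fallback(legacy_server, "/search", b"q=cats"))  # prints: ('POST', 200)
```

The retried POST no longer carries the safe and idempotent guarantees, so caches and automatic retries along the path treat it like any other POST.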
| brightstep wrote:
| Definitely. The HTTP spec has a gap that's being filled with a
| hack, albeit a widely accepted and implemented one. QUERY
| removes ambiguity, aids self-documentation of APIs, and
| improves caching.
| [deleted]
| badrabbit wrote:
| This is cool and all but why not just expand the scope of GET
| requests in newer HTTP standards? Maybe have a X-GET-QUERY header
| to indicate the type of GET request? The problem I see with a new
| method is that it isn't just webservers that need to support it,
| it is also webapps. Ideally this would be transparent to the
| webapp (which would just see really big arrays of GET params).
| The user-agents (browsers) would ideally support this
| transparently whereas with a new method the JS/HTML would need
| to explicitly support it.
| brightstep wrote:
| One of the issues that QUERY solves is that POST is overloaded
| and is being used for purposes beyond its intended
| responsibility. Shifting that overloading to GET feels to me
| like just another hacky approach. I prefer the well-defined,
| single responsibility that QUERY brings and restores to POST.
| throwoutway wrote:
| Once browsers offer support, the web app + server would need to
| support it. Both of those are in control of the devs. The real
| problem lies in getting infrastructure teams to update the
| expensive F5 load balancers, and PA/CP/TP firewalls to process
| the requests. Those aren't in control of the devs (unless
| they're operating together well as a team)
| wojcikstefan wrote:
| > Maybe have a X-GET-QUERY header to indicate the type of GET
| request?
|
| Note that using the "X-" prefix has been deprecated since 2012:
| https://datatracker.ietf.org/doc/html/rfc6648
| badrabbit wrote:
| Wow, news to me. Thanks
| wizzwizz4 wrote:
| Here's the relevant section of the guidelines:
| https://datatracker.ietf.org/doc/html/rfc6648#section-3
|
|     Creators of new parameters to be used in the context of
|     application protocols:
|
|     1. SHOULD assume that all parameters they create might become
|        standardized, public, commonly deployed, or usable across
|        multiple implementations.
|
|     2. SHOULD employ meaningful parameter names that they have
|        reason to believe are currently unused.
|
|     3. SHOULD NOT prefix their parameter names with "X-" or similar
|        constructs.
|
|     Note: If the relevant parameter name space has conventions about
|     associating parameter names with those who create them, a
|     parameter name could incorporate the organization's name or
|     primary domain name (see Appendix B for examples).
| indymike wrote:
| The HTTP Query method is problematic: every request to a web
| server is by definition a query, so it is at a minimum poorly
| named. Second, most queries are not idempotent and the return
| value can and will change. In other words, YAGNI.
| layer8 wrote:
| I don't agree that "every request to a web server is by
| definition a query", similar to how not every SQL statement is
| a query. In terms of command-query separation, commands may
| return information about the execution of the command; the fact
| that they return something doesn't make them a query. For
| example, an SQL UPDATE statement may return how many rows were
| updated, or that some error occurred; that doesn't make it a
| query.
| indymike wrote:
| > I don't agree that "every request to a web server is by
| definition a query",
|
| Query is a synonym for ask. Request is a synonym for ask.
|
| We are now making request requests?
|
| This is redundant and is not needed.
| layer8 wrote:
| "Request" and "query" are not synonymous. A query is a
| request for information. A request can also be a request
| for action.
| mjb wrote:
| > QUERY requests are both safe and idempotent with regards to the
| resource identified by the request URI.
|
| Is that really what you want from a query operation? I read
| 'idempotent' as implying that result sets don't change over time,
| which would be surprising behavior for queries for most database-
| like things.
|
| It's probably also worth mentioning that SQL's SELECT isn't
| idempotent in the way HTTP means it, because of the existence of
| session state, pessimistic locking, and the requirements of
| higher isolation levels. It would be useful for an RFC to define
| 'idempotent' in a way that clearly addressed these issues (and,
| for that matter, the larger topic of sessions/transactions).
|
| > When doing so, caches SHOULD first normalize request content to
| remove semantically insignificant differences, thereby improving
| cache efficiency
|
| Unfortunately, again when you look at SQL by comparison, queries
| are not purely expressions of what to return. Practically, they
| also encode how to compute the query (either explicitly through
| hints, or implicitly through things like join order). These
| behaviors are weird, tricky, and change version-to-version.
|
| > The QUERY method is subject to the same general security
| considerations as all HTTP methods as described in
|
| As another commenter said, this is quite incomplete. Query
| parameter injection, DoS by locking, DoS by exploiting work the
| database needs to do to ensure isolation, DoS by extremely
| expensive query, etc.
|
| > 4.2. Simple QUERY with indirect response (303 See Other)
|
| At least the examples here are naive - most applications don't
| want query result sets to be easily accessible to others. The
| semantics of authn and authz need to be really crisp here to make
| sure that attackers can't access the location of other queries'
| result sets purely by guessing.
|
| At least a "SHOULD use auth" or "SHOULD have large, unguessable,
| names" would be valuable here.
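The "large, unguessable names" suggestion could look something like the sketch below. The `/query-results/` prefix and the helper name `new_result_location` are hypothetical, and an unguessable URL complements authentication and authorization checks on the result resource without replacing them.

```python
import secrets


def new_result_location(prefix: str = "/query-results/") -> str:
    """Return a result URL path ending in 32 random bytes, URL-safe encoded."""
    return prefix + secrets.token_urlsafe(32)


location = new_result_location()
print(location.startswith("/query-results/"))  # prints: True
```

A server following the 303 pattern from section 4.2 could put a value like this in its `Location` header instead of a sequential or query-derived identifier.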
| detaro wrote:
| > _I read 'idempotent' as implying that result sets don't
| change over time, which would be surprising behavior for
| queries for most database-like things._
|
| That's not what idempotent means in HTTP.
|
| > _As another commenter said, this is quite incomplete. Query
| parameter injection, DoS by locking, DoS by exploiting work the
| database needs to do to ensure isolation, DoS by extremely
| expensive query, etc._
|
| Is application-dependent and applies to all other HTTP methods
| too.
| mjb wrote:
| RFC2616 says:
|
| > A sequence is idempotent if a single execution of the
| entire sequence always yields a result that is not changed by
| reexecution of all, or part, of that sequence.
|
| Which isn't, because of isolation, true in general of
| database queries. Obviously this is in context of RFC2616
| saying that sequences of idempotent HTTP operations may not
| be idempotent in themselves, but that definition seems very
| incomplete in the context of database queries.
|
| > Is application-dependent and applies to all other HTTP
| methods too.
|
| Sure. But I don't think that's a good argument in the modern
| world. Over the last 22 years, we've learned a lot about the
| security concerns of running secure systems, and it seems
| reasonable to include those concerns in a section labelled
| "security considerations". SQL injection is a classic
| security bug, and should be a key concern of any reasonable
| new standard for sending queries between systems.
|
| A full security section should probably also mention cache
| timing side-channels, locking-related covert channels, and
| other similar concerns that come up when you increase the
| semantic power of HTTP. It's not that POST doesn't have these
| concerns, it's that we've learned in the last two decades
| that they are real problems for many kinds of real systems.
| detaro wrote:
| HTTP idempotence is only concerned with _effects_ of the
| request, not the result returned.
|
| RFC7231:
|
| > _A request method is considered "idempotent" if the
| intended effect on the server of multiple identical
| requests with that method is the same as the effect for a
| single such request._
|
| note the _on the server_.
|
| (or old specs, 2616: _Methods can also have the property of
| "idempotence" in that (aside from error or expiration
| issues) the side-effects of N > 0 identical requests is the
| same as for a single request._ - again, side-effects, not
| responses)
|
| How does a _read-only_ database query being repeated cause
| a change in the database?
| timwis wrote:
| Idempotency in the HTTP context is about the ability to make
| the same request multiple times without side effects.
|
| https://developer.mozilla.org/en-US/docs/Glossary/Idempotent
| zinekeller wrote:
| a) GET is also idempotent despite the fact that a re-request
| may retrieve an updated document.
|
| b) Some developers are ignoring HTTP idempotency. Not IETF's
| fault for them abusing GET for deletion.
| marcosdumay wrote:
| Well, GET is not idempotent if the document has been
| updated.
|
| The specs do not talk much about the document changing
| behind your back, only about the changes you cause by
| yourself.
| timwis wrote:
| But if you GET /posts/123 and then do it again, and in
| between, the author updated the post, you'd expect to get
| the latest version of the post, no? That doesn't make it
| non-idempotent, because your GET requests did not change
| the state at all.
| atuladhar wrote:
| I'm pretty sure timwis and zinekeller in this thread (and
| detaro in the sibling thread) are all saying the same
| thing. Idempotency implies the request in question does
| not change the state, not that the state would not have
| changed because of other operations in the meantime. GET
| and QUERY are meant to be idempotent but whether they
| really are in practice depends on how they've been
| implemented.
| user3939382 wrote:
| Maybe the distinction is that the request (your request)
| is itself not responsible for the change.
| brainwipe wrote:
| This is correct. Idempotent GET should not itself change
| the state of the server.
| tsimionescu wrote:
| > But if you GET /posts/123 and then do it again, and in
| between, the author updated the post, you'd expect to get
| the latest version of the post, no?
|
| Not necessarily - plenty of systems offer no such
| guarantees, and the Web is by design eventually
| consistent [0]. This is what content expiration and
| various other cache control mechanisms are for - it's not
| always so important to get the latest version of a
| document. For example, the HN logo or index.html can
| probably be safely cached for days, since they are very
| unlikely to change, and even if they do, it's unlikely to
| have a major problem if someone only sees the new version
| after a few days.
|
| [0] Note that, at the extreme, due to special relativity,
| there is no absolute notion of "latest version" on the
| scale of geographically distributed computers: it's
| physically impossible to say if a request made in China
| to a server in the USA happened before or after a change
| on the server, if they happened close enough together -
| order of tens of milliseconds, an eternity in compute
| time.
| squeaky-clean wrote:
| Idempotency only refers to state changes from subsequent
| invocations of the call.
|
| A command to add a user to the set of users that have
| upvoted a post would be idempotent. Because you can run
| it 20x and only the first call affects anything. A
| command to increase the upvote count for a comment by +1
| would not be idempotent.
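|
| A minimal sketch of that distinction, assuming a hypothetical
| in-memory Post class (the class and method names are made up for
| illustration and are not taken from any real system):

```python
class Post:
    """Hypothetical in-memory post, used only for illustration."""

    def __init__(self):
        self.upvoters = set()
        self.upvote_count = 0

    def add_upvoter(self, user):
        # Idempotent: repeating the call leaves the set unchanged.
        self.upvoters.add(user)

    def increment_upvotes(self):
        # Not idempotent: every call changes the state again.
        self.upvote_count += 1


post = Post()
for _ in range(20):
    post.add_upvoter("alice")
    post.increment_upvotes()

print(len(post.upvoters))  # 1
print(post.upvote_count)   # 20
```

| Running both commands 20 times leaves one upvoter in the set but
| a count of 20, which is why only the first command is safe to
| retry blindly.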
| timwis wrote:
| Good point!
| tsimionescu wrote:
| But idempotency is relevant because of two things:
|
| 1) is it safe to automatically retry the request? - this
| meshes well with what you're saying
|
| 2) is it safe to return a cached version of the response,
| instead of sending the request again? Idempotence in your
| sense is necessary but not sufficient for this case - hence
| the various content expiration and If-Modified-Since etc. headers.
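|
| A rough sketch of the two separate questions, assuming a
| deliberately simplified model: the idempotent method list follows
| RFC 7231, while the cacheable set and the single max-age check
| are simplifying assumptions (real caches consider far more than
| this). Both function names are hypothetical.

```python
# Methods RFC 7231 defines as idempotent (the safe methods plus PUT and DELETE).
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"}

# Simplifying assumption: only treat GET and HEAD responses as reusable.
CACHEABLE_METHODS = {"GET", "HEAD"}


def can_auto_retry(method):
    """Question 1: is it safe to resend the request automatically?"""
    return method.upper() in IDEMPOTENT_METHODS


def can_reuse_cached_response(method, age_seconds, max_age_seconds):
    """Question 2: may a stored response be returned instead of resending?

    Idempotence alone is not enough; the stored response must also
    still be fresh according to its expiration information.
    """
    return (
        method.upper() in CACHEABLE_METHODS
        and 0 <= age_seconds < max_age_seconds
    )


print(can_auto_retry("PUT"))                      # True
print(can_reuse_cached_response("PUT", 10, 60))   # False
print(can_reuse_cached_response("GET", 10, 60))   # True
print(can_reuse_cached_response("GET", 120, 60))  # False
```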
| himinlomax wrote:
| > I read 'idempotent' as implying that result sets don't change
| over time
|
| Idempotent means the request itself doesn't change stuff, it
| doesn't mean something else won't.
| mjb wrote:
| You're right, and I was fuzzy about what I meant. I didn't
| mean (although wasn't clear) that QUERY would have to return
| the whole result set over time - clearly that's beyond the
| scope of HTTP's definition of idempotency.
|
| However, because of the existence of isolation and locking
| concerns in databases, even fairly simple queries are not
| idempotent. RFC2616 goes to some effort to (fuzzily,
| unfortunately) talk about sequences of operations, which
| would be useful here.
| anticristi wrote:
| My OCD is really happy that the symmetry is restored. I always
| felt that GET with the optional-but-actually-forbidden request
| body stands out. Once we transition from GET to QUERY, all HTTP
| transactions will be header+body in one direction, followed by
| header+body in the other.
| dragonwriter wrote:
| > Once we transition from GET to QUERY, all HTTP transactions
| will be header+body in one direction, followed by header+body
| in the other.
|
| Why would you put a body in a HEAD, OPTIONS, TRACE, or DELETE
| request (or GET, which will continue in use alongside QUERY)?
|
| And, no, a lot of responses also don't have bodies, and that
| will continue.
| unilynx wrote:
| There will be no transition from GET to QUERY, they will exist
| next to each other.
|
| Odd proxy bugs/client quirks with zero/non-zero Content-Length
| headers in GET requests will remain with us forever
| efitz wrote:
| Remind me again why we are using a document transfer protocol for
| API use cases and trying to bolt on functionality like
| Frankenstein's monster?
| mpolichette wrote:
| I'll entertain this...
|
| Because, like it or not, you're often still sending/receiving
| documents? :)
|
| Also the semantics of the actions map really well to the use
| cases still.
|
| You're, of course, more than welcome to avoid the whole thing
| and use WebSockets or WebRTC data channels with a custom
| protocol.
| the_arun wrote:
| I am unable to understand - what is the difference between GET
| and QUERY? Just that in QUERY you can send parameters in the
| request body? Do we need a new method for that?
| tsimionescu wrote:
| Yes, because there are various assumptions about GET that won't
| fit if GET can suddenly contain a request body. For example,
| existing caching servers may continue to cache content based
| only on the URL and headers, ignoring the request body
| entirely, producing bad results. Additionally, there may be
| some more subtle problems related to the Content-Length header,
| which is supposed to NEVER be sent for GET, but would be
| required for QUERY (since all requests that can contain a body
| MUST have a Content-Length header, depending on encoding; while
| requests that can't contain a body MUST NOT have a Content-
| Length header).
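|
| To make the caching concern concrete, here is a small sketch
| assuming two hypothetical cache-key functions: one that models an
| existing cache keyed on method and URL only, and one that also
| hashes the request body. Neither is taken from a real proxy, and
| the URL is a placeholder.

```python
import hashlib


def url_only_cache_key(method, url, body=b""):
    # Models a cache that ignores the request body entirely.
    return f"{method} {url}"


def body_aware_cache_key(method, url, body=b""):
    # Includes a digest of the body, so different queries get different keys.
    digest = hashlib.sha256(body).hexdigest()
    return f"{method} {url} {digest}"


url = "https://example.org/feed"
query_a = b"SELECT a FROM foo LIMIT 1"
query_b = b"SELECT b FROM foo LIMIT 1"

# A body-blind cache would hand query_b the response stored for query_a.
print(url_only_cache_key("GET", url, query_a) == url_only_cache_key("GET", url, query_b))
# A body-aware key keeps the two requests apart.
print(body_aware_cache_key("QUERY", url, query_a) == body_aware_cache_key("QUERY", url, query_b))
```

| The first comparison is True (a collision) and the second is
| False, which is the "bad results" case described above.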
| est wrote:
| HTTP middle boxes are going to love this.
| rvr_ wrote:
| This must be some form of late April 1st joke. The samples are
| even SQL-injection!
|
| We need to stop modifying basic protocols like HTTP. We should've
| stopped at 1.1.
| nostoc wrote:
| Injection means you can modify an existing query. The examples
| are not SQL injections, they are full queries.
| otabdeveloper4 wrote:
| HTTP is not just for the web. In fact, the vast majority of
| HTTP traffic doesn't involve the browser at all.
|
| The examples are realistic and useful. E.g., Clickhouse uses
| POST methods for queries, and a ridiculous `&readonly=2`
| parameter to differentiate modifying queries from readonly
| SELECT queries.
| Angharad wrote:
| This is only a proposal for a new HTTP verb. How you use it
| (and how you interpret it) is completely up to you.
| intrasight wrote:
| I don't see any sql injection in samples
| the_arun wrote:
| Would this help GraphQL use HTTP in a proper way, e.g. by using
| HTTP QUERY for non-mutational requests?
| dschep wrote:
| Doubtful given that GraphQL doesn't prescribe a transport
| layer. [0]
|
| [0] https://graphql.org/faq/#does-graphql-use-http
| wojcikstefan wrote:
| I'm very happy about this proposal. The only sad thing is that it
| has come so late, after so many tools and protocols (e.g.
| GraphQL) already abuse POSTs for this use case.
| hermitdev wrote:
| I agree. It takes me back to arguments I had with my PM when I
| worked for a small SaaS close to 10 years ago. I _had_ to use
| POST for a query API because of the limitations around GET &
| URL encoding of the parameters for the exact reasons outlined
| in TFA. She insisted it be a GET until I showed real, existing
| client queries that couldn't be handled. Only then, did she
| relent. Same PM also insisted I send results of queries as a
| list of objects in JSON, instead of a more compact tabular
| format, because tables aren't REST-y. I lost that battle, and
| the serialized results of queries were an order of magnitude
| larger than they needed to be...
| wizzwizz4 wrote:
| > _lost that battle, and the serialized results of queries
| were an order of magnitude larger than they needed to be..._
|
| Before or after Content-Encoding: gzip?
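|
| One way to check this kind of thing is to measure both shapes
| with and without gzip. The sketch below uses synthetic data
| (made-up column names and 1000 generated rows), so its numbers
| say nothing about the actual payloads discussed above.

```python
import gzip
import json

# Synthetic result set; the columns and values are invented for illustration.
columns = ["id", "name", "score"]
rows = [[i, f"user{i}", i % 100] for i in range(1000)]

# Shape 1: a list of objects, repeating every key in every row.
as_objects = json.dumps([dict(zip(columns, row)) for row in rows]).encode()

# Shape 2: a tabular layout, naming the columns once.
as_table = json.dumps({"columns": columns, "rows": rows}).encode()

for label, payload in (("objects", as_objects), ("table", as_table)):
    print(label, len(payload), len(gzip.compress(payload)))
```

| The tabular form is smaller uncompressed because the keys are
| not repeated; how much of that gap survives compression depends
| on the data, which is the point of the question.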
| Spivak wrote:
| > She insisted it be a GET until I showed real, existing
| client queries that couldn't be handled... Same PM also
| insisted I send results of queries as a list of objects in
| JSON, instead of a more compact tabular format, because
| tables aren't REST-y.
|
| I think I'm on the side of the PM with this one on both
| counts. You sound like someone who really cares about
| efficiency, performance, and edge cases -- a proper engineer.
| But PMs are supposed to bring us down to earth and say that
| simplicity and maintainability are more important than saving
| bytes and to not waste time fixing things that aren't broken.
|
| In a past life I spent so much effort optimizing our stack to
| lower our AWS bill until a PM sat me down with the company's
| finances and showed me the teeeeny little bar that was our
| cloud expenses and the legitimately 20x taller bar that was
| salaries and basically said that spending money to buy back
| my or my team's time was more important.
___________________________________________________________________
(page generated 2022-01-04 23:01 UTC)