[HN Gopher] Smithy: A language for defining services and SDKs
       ___________________________________________________________________
        
       Smithy: A language for defining services and SDKs
        
       Author : politician
       Score  : 236 points
       Date   : 2021-05-07 21:31 UTC (1 days ago)
        
 (HTM) web link (awslabs.github.io)
 (TXT) w3m dump (awslabs.github.io)
        
       | peterthehacker wrote:
       | How does this compare to Google's protocol buffers? [0] It looks
       | like smithy has a broader set of applications.
       | 
       | [0] https://developers.google.com/protocol-buffers
        
         | aszen wrote:
         | To me it seems like protocol buffers are a serialisation format
         | while Smithy is just an idl used to describe services.
         | 
         | The services are free to serialize the data in any way be it
         | json, XML or even protocol buffers.
         | 
         | So Smithy is more comparable to OpenAPI than Protocol Buffers.
        
           | monstrado wrote:
           | I feel like OP might be thinking of gRPC, which seems to
           | closely resemble the use case Smithy is tackling.
        
             | aszen wrote:
             | gRPC to my knowledge uses protocol buffers to define
             | services. But it's more of a framework for doing RPC.
             | 
             | Smithy is a much more abstract notation for defining
             | services independent of any implementation details, and it
             | is resource based rather than message based.
        
           | inshadows wrote:
           | > while Smithy is just an idl used to describe services
           | 
           | What's the use of such descriptions aside from diagrams? I
           | realize there's e.g. Terraform, which uses a similar kind of
           | language to describe what to create/destroy.
        
             | aszen wrote:
             | The main use case is to auto generate clients for services
             | and even create stubs for service implementations.
             | 
             | AWS sdks need to be implemented in dozens of languages so
             | something like Smithy helps in code gen for multiple
             | languages, avoiding the massively manual task of creating
             | client apis.
        
           | gravypod wrote:
           | Protobuf is commonly used to refer to both the serialization
           | format and the IDL. You can use protos to specify JSON/REST
           | endpoints. In past companies I've also used it to specify DB
           | schemas.
        
         | mtdowling wrote:
         | Protobuf and gRPC are great, and AWS will continue to make sure
         | developers can be successful using them with AWS. I'll try to
         | explain how we ended up at Smithy instead of using other
         | existing tools.
         | 
         | We started working on Smithy around 2018 because we wanted to
         | improve the scale of our API program and the AWS SDK team to
         | deal with the growing number of services (over 250 now!) and
         | languages we want to support in official AWS SDKs (like the
         | newly released Rust SDK). We had a ton of existing services
         | that we needed to be compatible with, but we also wanted to add
         | new features to improve new services going forward too.
         | 
         | We needed a very flexible meta-model that allows us to continue
         | to evolve the model to account for things like integrating with
         | other systems and to model service-specific customizations that
         | each AWS SDK team can implement independently. Smithy's meta-
         | model is based on traits, a self-describing way to add more
         | information to models. Lots of validation can be built in to
         | custom traits, which helps to ensure that service teams are
         | using traits properly and adhere to their specifications.
         | Smithy's resource modeling helps us here too because it allows
         | AWS service teams, as they adopt Smithy, to essentially
         | automatically support CloudFormation resource schemas.
         | Resources also help us to point service teams in the right
         | direction to make their services work well over HTTP (which
         | methods to use, URIs, safety, idempotency, etc).
         | 
         | We needed an integrated model validation, linting, and diff
         | tool to keep services consistent and detect breaking changes,
         | and it needed to support company-wide standards as well as
         | service-specific standards. We use Smithy's validation system
         | to automatically enforce API standards, and service teams often
         | create their own service-specific rules to keep their own
         | internal consistency.
         | 
         | We needed built-in input validation constraints so that they're
         | standard across services and clients (e.g., length, range,
         | pattern, etc). We didn't want to rely on third-party extensions
         | to provide this feature since validating inputs is important.
         | AWS uses internal service frameworks that enforce these
         | constraints and are compatible with Smithy models. We're
         | working to create open source service frameworks for Smithy as
         | well.
         | 
         | We also wanted to support various serialization formats so that
         | clients work with all of our existing services spread across
         | JSON, XML, query strings, RPC, and HTTP APIs, but we also
         | wanted to be able to evolve our serialization formats in the
         | future as new technology comes along. That's why Smithy is
         | protocol agnostic (like gRPC actually). The serialization
         | format is an implementation detail. Smithy has some support for
         | MQTT as well.
         | 
         | And finally, we need our code generators to be really flexible
         | to support service customizations. There's quite a few
         | customizations across AWS services, and we needed a way to
         | inject custom code generation logic in various parts of our
         | generators.
         | 
         | Smithy is still in heavy development, and we're working on
         | building out more of the tooling so it can be used easily
         | outside of AWS SDKs too, including client and server code
         | generation.
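
The built-in input constraints mentioned above (length, range, pattern) are expressed in Smithy as traits applied to shapes. A minimal sketch in Smithy 1.0 IDL syntax follows; the namespace and the shape names (CityName, PageSize, SearchCitiesInput) are hypothetical and not taken from any AWS model.

```smithy
namespace example.weather

// A city name: 1 to 64 characters, limited to letters, digits, and spaces.
@length(min: 1, max: 64)
@pattern("^[A-Za-z0-9 ]+$")
string CityName

// A page size restricted to the range 1..100.
@range(min: 1, max: 100)
integer PageSize

structure SearchCitiesInput {
    // The required trait marks a member that must be present.
    @required
    name: CityName,

    // Optional member; the range constraint comes from the PageSize shape.
    pageSize: PageSize
}
```

Because the constraints live on the shapes rather than in framework-specific extensions, any tool that reads the model sees the same limits.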
        
           | lenkite wrote:
           | Does Smithy support describing OAuth?
        
           | peterthehacker wrote:
           | Thanks for the thorough response! This helps me understand
           | the motivations behind Smithy. I'm going to dig into the
           | project and keep an eye on it as the tooling develops.
        
           | rancar2 wrote:
           | The history and origins help frame the Smithy use case for
           | others and where it may not apply. Thanks for sharing!
        
           | oblio wrote:
           | > Smithy's resource modeling helps us here too because it
           | allows AWS service teams, as they adopt Smithy, to
           | essentially automatically support CloudFormation resource
           | schemas.
           | 
           | Hallelujah!
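
For readers unfamiliar with the resource modeling quoted above, here is a minimal sketch in Smithy 1.0 IDL syntax of a service with one resource and a read operation bound to HTTP. All names (example.weather, Weather, City, GetCity, NoSuchCity) and the version string are hypothetical; this only illustrates the shape of such a model, not how any AWS service is defined.

```smithy
namespace example.weather

service Weather {
    version: "2021-05-07",
    resources: [City]
}

// A resource declares its identifiers and the lifecycle operations bound to it.
resource City {
    identifiers: { cityId: CityId },
    read: GetCity
}

string CityId

// readonly marks the operation as having no side effects, which matches HTTP GET.
@readonly
@http(method: "GET", uri: "/cities/{cityId}")
operation GetCity {
    input: GetCityInput,
    output: GetCityOutput,
    errors: [NoSuchCity]
}

structure GetCityInput {
    // The resource identifier is bound to the {cityId} label in the URI.
    @required
    @httpLabel
    cityId: CityId
}

structure GetCityOutput {
    @required
    name: String
}

@error("client")
@httpError(404)
structure NoSuchCity {
    @required
    message: String
}
```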
        
       | whalesalad wrote:
       | I gotta say this is really neat. Definitely going to take it for
       | a spin. I've always felt like a lot of these tools have too much
       | ceremony - this feels light and simple.
        
         | ellisv wrote:
         | It's elegant. It doesn't overreach.
        
       | TranquilMarmot wrote:
       | This looks pretty sweet; I've spent a lot of time the past few
       | years writing a LOT of OpenAPI specifications and the thing that
       | gets tiring is all of the boilerplate with each bit of the
       | definition (paths -> get -> responses -> 200 -> application/json
       | -> schema). It gets so exhausting and turns into nested YAML soup
       | pretty quickly. Having better support for multiple files will
       | also be nice; _technically_ you can split an OpenAPI
       | specification into multiple files but the tooling never quite
       | deals with it properly.
       | 
       | I'll definitely be keeping an eye on this; looking forward to
       | TypeScript code generation! One of our main use cases for OpenAPI
       | is to document our models and then generate TypeScript types from
       | them (that we then use in Node and React apps)
        
         | conradludgate wrote:
         | We use multiple files at work, but all of our make rules first
         | bundle the spec into a single file. We also save that bundle in
         | our VCS. CI will fail if the bundle hasn't been recreated since
         | updating the split files
        
         | RideAndWave wrote:
         | We were bothered by the same thing :) Many plugins for OpenAPI
         | had no codegens which would produce deterministic behavior for
         | compatibility across various languages. Also we wanted to have
         | something that could express complex business domains in
         | definitions.
         | 
         | Ended up building protoforce.io for the very purpose. It does
         | support typescript + nodejs, which we used for the website
         | itself as well.
        
       | mingodad wrote:
       | Out of curiosity, and to test what I learned from CocoR
       | https://ssw.jku.at/Research/Projects/Coco/ , I created a parser
       | for Smithy that anyone can see/use here
       | https://github.com/awslabs/smithy/issues/793 . It also includes
       | the ABNF IDL grammar transformed into an EBNF accepted by
       | https://www.bottlecaps.de/rr/ui
        
       | aphexairlines wrote:
       | Smithy structure definitions don't have field indexes like in
       | protobuf or thrift. It also has required fields, where proto3
       | intentionally did away with them.
       | 
       | How do these differences impact backwards-compatibility-safety of
       | Smithy schema changes?
        
         | mtdowling wrote:
         | Smithy today doesn't support any serialization formats that
         | require fixed ordering. We do recommend that any additional
         | members added to structures are added at the end to help with
         | C++ and Rust codegen though.
         | 
         | That said, traits can be used in Smithy to enforce constraints
         | on structures, so if you ever needed explicit indexing like
         | that, it could be done via traits and protocols (Smithy's
         | nomenclature for describing how clients and servers
         | communicate). In fact, protocols are defined by traits, and
         | traits can enforce requirements on the rest of the model using
         | a DSL called selectors... Probably way too much info other than
         | -- it's possible and easy to support this in Smithy if it's
         | ever needed.
         | 
         | As for required vs optional -- today it's treated as server-
         | side validation only and not used in client codegen. This
         | allows service teams to remove the required trait from members
         | if something ends up needing to be optional in future without
         | breaking clients. We're working on some ideas too to see if we
         | can generate even better code for SDKs in languages like Rust
         | where optionality is very explicit, but without sacrificing the
         | ability of being able to remove the required trait.
         | 
         | And, in general, backward compatibility issues are caught with
         | Smithy diff, which also supports custom rules:
         | https://github.com/awslabs/smithy/tree/main/smithy-diff
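
To illustrate the point above about explicit indexing being possible via traits, here is a minimal sketch in Smithy 1.0 IDL syntax. The `fieldIndex` trait is hypothetical and is not part of Smithy; the namespace and the User structure are likewise made up. A protocol would still have to be defined to give the index any meaning on the wire.

```smithy
namespace example.records

// Hypothetical custom trait. The selector restricts where it may be applied:
// only to members of structures.
@trait(selector: "structure > member")
integer fieldIndex

structure User {
    // required is a built-in trait; per the comment above it is enforced
    // server-side and can later be removed without breaking clients.
    @required
    @fieldIndex(1)
    id: String,

    @fieldIndex(2)
    displayName: String
}
```

Applying `@fieldIndex` to anything other than a structure member would fail model validation, since the selector on the trait definition excludes it.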
        
       | PostThisTooFast wrote:
       | "The primary difference between Smithy and OpenAPI is that Smithy
       | is protocol-agnostic, allowing Smithy to describe a broader range
       | of services, metadata, and capabilities. Smithy can be used
       | alongside OpenAPI by converting Smithy models to OpenAPI."
       | 
       | How is OpenAPI not "protocol-agnostic" then? And if you can
       | convert to OpenAPI, then OpenAPI must be capable of the same
       | representations.
       | 
       | This demands explanation.
        
       | troelsSteegin wrote:
       | Smithy, Protoforce, Taxi - is the assumption synchronous
       | request/response? OpenAPI has callbacks
       | (https://swagger.io/docs/specification/callbacks/), I am not
       | seeing that in these others.
       | 
       | The other thing I am not seeing is a service registry, eg
       | something like UDDI. How do microservices know about each other?
       | Is that a build over common IDL and a coordinated deploy?
        
         | mtdowling wrote:
         | Smithy has something called event streams that send async
         | datagrams:
         | https://awslabs.github.io/smithy/1.0/spec/core/stream-
         | traits....
         | 
         | This is currently used in Amazon S3, Kinesis, Transcribe, and
         | other services.
         | 
         | Smithy doesn't have a service registry today. However, models
         | can be vended and shared via Maven. Client codegen was designed
         | explicitly to not require coordinated releases of clients and
         | servers (that's impossible for AWS SDKs).
        
           | troelsSteegin wrote:
           | Thank you. Some new vocabulary with Smithy: Prelude, Shapes,
           | and Traits
           | (https://awslabs.github.io/smithy/1.0/spec/core/model.html ).
           | With all the evolution, the lineage story would find an
           | audience, I bet.
        
         | RideAndWave wrote:
         | For protoforce, we provide completely asynchronous server and
         | client SDKs for scala and nodejs; we also support websockets as
         | a transport.
         | 
         | The Java SDK is built on Futures, so it's async as well, in a way.
         | 
         | Also we support server-to-client calls, which, effectively, are
         | a better alternative to callbacks.
        
         | martypitt wrote:
         | Taxi+Vyne gives you a complete service registry, so both users
         | and systems can discover services and data.
         | 
         | Because Taxi lets you describe how data from services relate,
         | Vyne can work out how to connect services together
         | automatically, and handle the integration for you. This is a
         | realisation of the UDDI concept, where systems can autonomously
         | work out how to operate with each other.
        
           | troelsSteegin wrote:
           | Thank you. Your recent blog post
           | (https://blog.vyne.co/rethinking-api-consumer-patterns/ ) was
           | a positively interesting read. I wonder if there's a query
           | analyzer a ways down the road ...
        
       | haolez wrote:
       | Starting a project like this from scratch must have been a bold
       | move for the team involved. I imagine that, in a pitch, the
       | outcome of this tool seemed too good to be true and they actually
       | delivered it. Well done!
        
       | martypitt wrote:
       | This is great. I'm happy to see more evolution in the service
       | definition space, especially in using custom DSLs. The low
       | signal-to-noise ratio and boilerplate in OpenAPI and RAML are a
       | killer IMO.
       | 
       | We're building something similar with Taxi
       | (https://docs.taxilang.org), which provides a rich way to
       | describe services and data.
       | 
       | Similar to Smithy's CityId example, we provide the ability to
       | semantically describe attributes. Our product - Vyne
       | (https://vyne.co) can then use these IDs to automatically chain
       | and orchestrate any services together, without having to write
       | integration code.
        
       | VWWHFSfQ wrote:
       | Is this how the Python `aws` CLI and boto is implemented?
        
         | mtdowling wrote:
         | Kind of. The AWS CLI uses an Amazon internal modeling format
         | used to define services that's based on another Amazon internal
         | modeling format that has been in use for about 15 years (and
         | it's based on another internal model etc..). Smithy is
         | basically the open source v2 of both, but with a public spec
         | and tooling. Eventually all the AWS SDKs and the AWS CLI will
         | adopt Smithy. (I work on the AWS SDKs and created Smithy)
        
           | BrandonSmith wrote:
           | Are there examples of utilizing Smithy with WebSockets as the
           | transport? I found the documentation of MQTT bindings and the
           | general information about event streams. However, I'm
           | struggling to map it all together. I imagine WebSocket
           | bindings will need to exist?
        
             | mtdowling wrote:
             | We haven't built a WebSockets based protocol yet. And yeah,
             | without actual server-side support, it is a little meta
             | right now. We're working on it and hope to roll out a few
             | languages this year.
        
         | Raesan wrote:
         | Pretty much all APIs at Amazon, internal and external, are
         | either defined using this or its precursor and then per-
         | language clients and server stub implementations are
         | autogenerated based on the model. That's true of boto3. Not
         | sure how much of the CLI is autogenerated, but the CLI uses
         | boto(core) under the hood so it's involved one way or the
         | other.
        
       | stevenhuang wrote:
       | Is this supposed to generate the language bindings too from the
       | Smithy schema? How is that done? Can't seem to find info on that
       | in the Guides or examples.
        
       | sunilkumar992 wrote:
       | It is the good thing
        
       | scrubs wrote:
       | Pardon my ignorance: once one gets to:
       | 
       | https://awslabs.github.io/smithy/quickstart.html#next-steps
       | 
       | well, what is it? What was autogenerated? I know Java from
       | previous jobs, but not Gradle, and it's unclear to me what the
       | Smithy-generated artifacts are meant to be composed with.
        
         | mtdowling wrote:
         | It's not really a fully finished project yet, so not much. We
         | shipped the AWS SDK for JS v3 with Smithy, the AWS SDK for Go
         | v2 with Smithy, and just launched an alpha of the AWS SDK for
         | Rust using Smithy. More are in the works. We're currently
         | iterating on their code generators to make them easier to use
         | outside the AWS SDKs. AWS SDKs are being built in a layered
         | approach where there's a generic code generator that's really
         | extensible, and then the AWS SDKs extend it to add AWS-specific
         | stuff like regions and credential handling.
         | 
         | We're working to get projects like these to GA:
         | https://github.com/awslabs/smithy-typescript,
         | https://github.com/aws/smithy-go, and
         | https://github.com/awslabs/smithy-rs. And we're also working on
         | service code generation.
        
           | scrubs wrote:
           | I gather the impact of Smithy is to generate something like
           | an old style CORBA client/server stubs and vocabulary types
           | for a given serialization, and communication schema (HTTP2,
           | gRPC) in a target language? Is Smithy specific to
           | interoperating with AWS services or can it be used in pretty
           | much any distributed system?
        
             | mtdowling wrote:
             | Pretty much, but Smithy can be used for anything and isn't
             | specific to AWS. The AWS modeling support in Smithy is all
             | through extensions that aren't part of the core.
             | 
             | It's also protocol agnostic and can be used in a lot of
             | applications (HTTP, MQTT, and we are even experimenting
             | using Smithy to generate C ABI bindings for non client
             | server stuff).
        
           | oblio wrote:
           | Python?
        
             | stevesimmons wrote:
             | Python with interoperability with FastAPI and Pydantic
             | models would be fantastic
        
       | a-dub wrote:
       | what ever did happen to wsdl?
       | 
       |  _ducks_
        
         | qbasic_forever wrote:
         | IIRC SOAP helped give the browser world XmlHttpRequest which
         | eventually helped kickstart the whole AJAX web 2.0 / modern
         | webapp world we love today. So in some ways it was an important
         | stepping stone to get there, just a dead end from the general
         | demise of XML tooling.
        
         | speed_spread wrote:
         | Why duck? WSDL still beats the crap out of most newer would-be
         | contract definition standards. OpenAPI is ok, but the tooling
         | all comes from the same place.
        
           | a-dub wrote:
           | soap was a really good interoperability protocol for adding
           | kilobytes of overhead to simple text based rest rpc where
           | interoperability was defined as being able to interoperate
           | with clients and servers from the exact same software
           | environment.
           | 
           | trying to get perl to talk to windows or windows to talk to
           | java or java to talk to perl never really worked at all.
           | 
           | but wsdl was like this tempting thing. self describing
           | services! (that only worked when you had a full definition of
           | the service on the client side anyhow)
           | 
           | i'd take a wild bet that this stuff at amazon grew out of
           | frustration with soap a long time ago...
           | 
           | edit: read new replies to thread. this is new as of 2018.
           | soap was circa 2002. ah, well.
        
             | oblio wrote:
             | Apparently this is V2 or even v3 of something internal. So
             | probably V1 did what you were saying :-)
        
               | mtdowling wrote:
               | Yup. Smithy is from a lineage of internal tools that have
               | been in use at Amazon since the early 2000s. From my dive
               | into software archaeology at Amazon (I work there), there
               | was a bit of SOAP in use in the 2000s, and some other
               | internal model formats that are now obsolete claimed to
               | be SOAP like. As the years went on, other internal
               | formats came out to replace other formats, until the
               | internal model format became something distantly inspired
               | by SOAP, but very practical and tuned for cross language
               | code generation so it could power AWS (that's my take at
               | least). That was in use for well over a decade before we
               | built Smithy to improve on and open source the internal
               | format.
        
         | johns wrote:
         | JSON
        
           | a-dub wrote:
           | ...and this thing looks like it scrapes out all the
           | complexity and leaves the good bits that wsdl was supposed to
           | be...
        
       | coward76 wrote:
       | "The primary difference between Smithy and OpenAPI is that Smithy
       | is protocol-agnostic, allowing Smithy to describe a broader range
       | of services, metadata, and capabilities. Smithy can be used
       | alongside OpenAPI by converting Smithy models to OpenAPI."
        
       | bpicolo wrote:
       | Reminds me a lot of this talk:
       | https://www.youtube.com/watch?v=j6ow-UemzBc
       | 
       | I think it's a really powerful paradigm. I think an org adopting
       | such patterns widespread leads to some really rad capabilities.
       | GDPR compliance, for example, is really tough in SOA
       | architectures in companies without a ton of engineering capacity,
       | but this sort of data/api introspection could do a lot to change
       | that.
       | 
       | Any plans to open source projection for languages other than
       | Java?
       | 
       | edit: I see that typescript and rust are available on github as
       | well
        
         | mtdowling wrote:
         | We expect to have Smithy code generation for every language we
         | support as official AWS SDKs.
        
       | malkia wrote:
       | I wonder if Google is going to extend this
       | https://fuchsia.dev/fuchsia-src/development/languages/fidl for
       | other consoles...
        
       | zmmmmm wrote:
       | Couldn't find a way to generate models for Python - not sure if
       | that is because it doesn't exist or I just can't find it. The code
       | gen part seems a bit under-documented.
        
         | mtdowling wrote:
         | No Python codegen yet. Our plan is basically that as we migrate
         | AWS SDKs to Smithy, we'll also offer generic client generators
         | too. The Python migration hasn't started yet.
        
       | RideAndWave wrote:
       | We've built a similar thing at https://www.protoforce.io, which
       | auto-generates client and server side. It actually transpiles,
       | parsing the model definitions and emitting actual code with a bit
       | of shared runtime.
       | 
       | Good that Amazon opened up their stuff; there should be more
       | competition on this front.
        
         | osdev wrote:
         | This is some impressive stuff! Congrats on a sizable
         | achievement. I just went through your website, have a few
         | questions...
         | 
         | Questions:
         | 
         | 1. SCALA: is the generated code ONLY scala? are other languages
         | supported?
         | 
         | 2. CODE-GEN: is it designed only for code generation for a
         | target http framework or does it actually provide an "API
         | Server" itself ( e.g. like graphQL )
         | 
         | 3. COMPARISON: the list of frustrations with other solutions,
         | listed in your intro/technology goals are somewhat high-level.
         | what are the "leaky abstractions" or "non-deterministic
         | behaviour"?
         | 
         | 4. SETUP: does someone need to know scala to use this ? (
         | depends on #1 above )
         | 
         | 5. DSL : is the protoforce code implemented as a DSL in Scala
         | or is it a "language" in itself?
         | 
         | 6. IDE : how do you check/compile the code? does it integrate
         | with an IDE?
         | 
         | As a Scala Engineer myself ( though I mostly work in Kotlin now
         | for Android/Server ), this looks great, but most Scala
         | engineers i've met are focused on Spark, and using Play
         | Framework/Http4S, etc. How big is the actual market for Scala
         | API tools?
        
           | RideAndWave wrote:
           | Thank you.
           | 
           | 1. Scala, Typescript/Javascript, and Java at the moment.
           | 
           | 2. It does provide the runtime, which lets you bootstrap a
           | server easily. (You can check out this post which has
           | modeling + scala setup example at the bottom
           | https://www.protoforce.io/ProtoForce/post/extensive-guide-
           | to...)
           | 
           | 3. Please take a look at the documentation, it has a good
           | outline of the features supported. There are many features,
           | most are well documented there.
           | 
           | 4. No, not really. You can do with other languages, it
           | provides both client & server sides, so no other language is
           | needed. Again, you can still generate client side stuff for
           | other languages and use them to connect to your server.
           | 
           | 5. protoforce website was implemented using the protoforce
           | DSL itself. The parser and transpilers are written in scala.
           | The portal is written in typescript + react.
           | 
           | 6. There is currently a sandbox at the website which you can
           | experiment in. There is currently no integration with other
           | IDEs, but a language server can be added a bit later for VSCode
           | for instance.
           | 
           | Hope this answers a bit :)
        
         | smt88 wrote:
         | FYI - your website is an unreadable catastrophe on Firefox for
         | Android.
        
           | RideAndWave wrote:
           | We focused on the desktop because it is difficult to use from
           | mobile due to it being an online IDE. I'll take a look, thank
           | you for reporting.
        
             | smt88 wrote:
             | I get your thought process, but two things:
             | 
             | 1. If you use flexbox from the beginning, you can easily
             | have basic readability on mobile.
             | 
             | 2. A huge % of users will discover things on mobile, even
             | if those things are desktop apps.
             | 
             | I wasn't even able to tell what your product is at a basic
             | level.
        
               | RideAndWave wrote:
               | Noted, thank you.
        
       | jorgebucaran wrote:
       | What's the difference between the bracketed syntax, e.g. [City]
       | and `list`?
        
         | kbenson wrote:
         | I think, from reading around the examples/spec a bit,
         | collections of items use brackets to bound the collection. The
         | examples are collections of a single item though, so it's
         | slightly confusing at first glance. One hint at that is the
         | keys defining those collections seem to be pluralized in all
         | the cases I've seen so far.
        
         | mtdowling wrote:
         | Smithy is very focused on codegen, so the model is highly
         | normalized. So for example, defining a list of something needs
         | to be done using a `list` shape. This kind of list is something
         | you'd see directly serialized and sent over the wire. For
         | example:
         | 
         | list Messages { member: Message }
         | 
         | Then you can reference Messages from other places in the model,
         | like from a structure:
         | 
         | structure Something { messages: Messages }
         | 
         | In contrast, the `[City]` syntax is used in other places in the
         | IDL to define a relationship to a shape. This isn't something
         | that gets sent over the wire, it's just used to form
         | essentially a relationship in the service graph from a service
         | to resources, a resource to operations, an operation to errors,
         | etc. For example:
         | 
         | service Weather { resources: [City, Sensors] }
        
       | 1f60c wrote:
       | AWS' new Rust SDK[0] is generated using Smithy models.
       | 
       | [0]: https://news.ycombinator.com/item?id=27080859
        
       ___________________________________________________________________
       (page generated 2021-05-08 23:03 UTC)