[HN Gopher] A Review of the Semantic Web Field
___________________________________________________________________
A Review of the Semantic Web Field
Author : hypomnemata
Score : 116 points
Date : 2021-01-25 19:11 UTC (1 day ago)
(HTM) web link (cacm.acm.org)
(TXT) w3m dump (cacm.acm.org)
| LukeEF wrote:
| We built a new semantic database, first in university and then
| as commercial open source (TerminusDB). We use the Web Ontology
| Language (OWL) as a schema language, but made two important -
| practical - modifications: 1) we dispense with the open-world
| interpretation; and 2) we insist on the unique name assumption.
| This gives us a rich modelling language that delivers
| constraints on the shapes in the graph. Additionally, we don't
| use SPARQL, which we didn't find practical (composability is
| important to us), and use a Datalog in its place (like Datomic
| and others).
|
| Our feeling on interacting with the semantic web community is
| that innovation - especially when it conflicts with core ideology
| - is not welcome. We understand that 'open world' is crucial to
| the idea of a complete 'semantic web', but it is insanely
| impractical for data practitioners (we want to know what is in
| our DB!). Semantic web folk can treat alternative approaches as
| heresy and that is not a good basis for growth.
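|
| (To make the distinction concrete, a toy Python sketch with
| made-up facts - nothing to do with TerminusDB internals:)
|
| ```
| facts = {("alice", "knows", "bob")}
|
| def holds(fact, closed_world=True):
|     # CWA: anything not asserted is false (we know what's in our DB).
|     # OWA: anything not asserted is merely unknown.
|     if fact in facts:
|         return True
|     return False if closed_world else None
|
| holds(("alice", "knows", "carol"))                      # False under CWA
| holds(("alice", "knows", "carol"), closed_world=False)  # None: unknown
| ```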
|
| As we came from university, I agree with comments that the field
| is too academic and bends to the strange incentives of paper
| publishing. Lots of big ideas and everything else is mere
| 'implementation detail' - when, in truth, the innovation is in
| the implementation details.
|
| There are great ideas in the semantic web, and they should be
| more widespread. Data engineers, data scientists, and everybody
| else can benefit, but we must extract the good and remove
| ideological barriers to participation.
| jerf wrote:
| "Our feeling on interacting with the semantic web community is
| that innovation - especially when it conflicts with core
| ideology - is not welcome."
|
| I wasn't a big fan of the "semantic web" community when it
| first came out, and the years have only deepened my disrespect,
| if not outright contempt. The entire argument was "Semantic web
| will do this and that and the other thing!"
|
| "OK, how exactly will it accomplish this?"
|
| "It would be really cool if it did! Think about what it would
| enable!"
|
| "OK, fine, but _how_ will this actually work! "
|
| "Graph structures! RDF!"
|
| "Yes, that's a data format. What about the algorithms? How are
| you going to solve the core problem, which is that nobody can
| agree on what ontology to apply to data at global scale, and
| there isn't even a hint of how to solve this problem?"
|
| "So many questions. You must be a bad developer! It would be
| _so cool_ if this worked, so it 'll work!"
|
| There has _always_ been this vacuousness in the claims, where
| they've got a somewhat clear idea of where they want to go,
| but if you ever try to poke down even one layer deeper into
| _how_ it's going to be solved, you get either A: insulted; B1:
| claims that it's already solved, just go use this solution
| (even though it is clearly not already solved, since the
| semantic web promises are still promises and not manifested
| reality); B2: claims that it's already solved and the semantic
| web is already _huge_ (even though the only examples anyone
| citing this can offer are trivial compared to the grand
| promises, and the "semantic web" components borderline
| irrelevant - most frequently "those google boxes that pop up
| for sites in search results", just like this article does,
| despite the fact they're wafer-thin compared to the Semantic
| Web promises and barely use any "Semantic Web" tech at all);
| or C: a simple reiteration of the top-level promises, almost
| as if the person making this response simply doesn't grasp
| that the ideals need to manifest in real code and real data
| to work.
|
| This article does nothing to dispel my beliefs about it. The
| second sentence says it all. For the rest, while zooming in on
| the reality may be momentarily impressive, compared to the
| promises made it is nothing.
|
| The whole thing was structured backwards anyhow. I'd analogize
| the "semantic web" effort to creating a programming language
| syntax definition, but failing to create the compiler, the
| runtime, the standard library, or the community. Sure, it's
| non-trivial forward progress, but it wasn't _really_ the hard
| part. The real problem for semantic web and their community is
| the shared ontology; solve that and the rest would mostly fall
| into place. The problem is... that's an unsolvable problem.
| Unsurprisingly, a community and tech all centered around an
| unsolvable problem haven't been that productive.
|
| A fun exercise (which I seriously recommend if you think this
| is solvable, let alone easy) is to just consider how to label a
| work with its author. Or its primary author and secondary
| authors... or the author, and the subsequent author of the
| second edition... or, what exactly is an _authored_ work
| anyhow? And how exactly do we identify an author... consider
| two people with identical names/titles, for instance. If we
| have a "primary author" field, do we _always_ have to declare
| a primary author? If it's optional, how often can you expect a
| non-expert bulk-adding author information to get it right?
| (How would such a person necessarily even _know_ how to pick
| the "primary author" out of four alphabetically-ordered
| citations on a paper?)
|
| (I am aware that there are various official solutions to
| these problems in various domains... the fact that there are
| _various_ solutions is exactly my point. Even this simple
| issue is not agreed upon and is context-dependent, it's
| AI-complete to translate between the various schemas, and if
| you speak to an expert using any of them you could get an
| earful about their deficiencies.)
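|
| (To make the translation problem concrete, here's a sketch
| using three real vocabularies that can all mean "author",
| each structured differently; the identifiers are made up:)
|
| ```
| from rdflib import Graph, Literal, Namespace, URIRef
|
| DC = Namespace("http://purl.org/dc/elements/1.1/")
| FOAF = Namespace("http://xmlns.com/foaf/0.1/")
| SCHEMA = Namespace("https://schema.org/")
|
| work = URIRef("https://example.org/paper/1")
| person = URIRef("https://example.org/person/1")
|
| g = Graph()
| # Dublin Core: the author is often just a string
| g.add((work, DC.creator, Literal("A. Author")))
| # schema.org: the author is a resource
| g.add((work, SCHEMA.author, person))
| # FOAF: the relation points the other way entirely
| g.add((person, FOAF.made, work))
| ```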
| namedgraph wrote:
| It's not what you _have_ to do, or how; it's that for the
| first time we have a common model for data interchange (RDF)
| with which you _can_ model concepts and things in your
| domain, or more importantly across domains, and simply merge
| the datasets. Try that with the relational model or JSON.
| Integration is the main value proposition of RDF today;
| nobody sane is trying to build a single global ontology of
| the world.
|
| You can despise the fringe academic research, but how do you
| explain Knowledge Graph use by FAANG (including powering
| Alexa and Siri) as well as a number of Fortune 500 companies?
| Here are the companies looking for SPARQL (RDF query
| language) developers: http://sparql.club
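|
| For instance (a minimal rdflib sketch; the file names are
| made up), merging is just parsing into the same graph:
|
| ```
| from rdflib import Graph
|
| g = Graph()
| g.parse("dataset_a.ttl", format="turtle")
| # IRIs are global identifiers, so parsing a second dataset
| # into the same graph already is a correct RDF merge
| g.parse("dataset_b.ttl", format="turtle")
| print(len(g))  # distinct triples after the merge
| ```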
| olliemath wrote:
| Yes. I had pretty much this conversation a while back with
| some non-technically minded people who had been convinced
| that, by creating an ontology and a set of "semantic business
| rules", a lot of the writing of actual code could be
| automated away, leaving the business team to just create
| rules in a language almost like English and have the machine
| execute those English-like rules.
|
| I had to explain that they were basically on track to
| re-implementing COBOL.
| usrusr wrote:
| > "dispense with the open world interpretation"
|
| That can mean anything from "we have some conventional (e.g.
| plain old RDBMS) CWA systems but describe their schemas in an
| OWA DL to ease integration across independent systems" (in
| particular this means no CWA implications outside those built
| into the conventional systems with or without a semweb layer on
| top) to "we do a big bucket of RDF and run it all through a set
| of rules formulated in OWL syntax but applied in an entirely
| different way" (CWA everywhere). The former would be semweb as
| intended, or at least a subset thereof, but the latter could
| easily end up somewhere between simple brand abuse and almost
| comical cargo culting.
|
| Well, at least that's how I feel as someone who never had to
| face the realities of the vast unmapped territories between
| plain old database applications and fascinating yet entirely
| impractical academic mind games of DL (old school symbolic AI
| ivory tower that suddenly happened to find itself in the center
| of the hottest w3c spec right before w3c specs kind of stopped
| being a thing, with WHATWG usurping HTML and Crockford almost
| accidentally killing XML).
|
| (also, when has "assumption" turned into "interpretation"?
| Guess I missed a lot)
| wuschel wrote:
| > [...] but we must extract the good and remove ideological
| barriers to participation.
|
| Could you point to some resources that explain, for an
| outsider, the tradeoff between the practical solutions and
| concepts and the ideological cruft?
| lou1306 wrote:
| Not the commenter, but I hope to add something to the
| discussion. Generally, expanding on the current state of the
| art is paramount in academia. In this case, I guess that
| defaulting to closed-world and unique names is frowned upon
| because academic people "know" that SemWeb concepts would be
| "easy" to implement under such conditions (for some
| interpretation of "know" and "easy"). A university lab would
| be reluctant to invest in such a project, because it would
| likely result in fewer publications than, say, a
| bleeding-edge POC.
|
| Of course, practical solutions based on well-understood
| assumptions are exactly what a commercial operation needs, so
| it's no wonder that TerminusDB chose that path. They might
| not publish a ton of papers, but they have something that
| works and could be used in production.
| Communitivity wrote:
| Almost every pragmatic implementation of semantic reasoning
| I've done involved both of the same modifications (closed world
| and unique names). A couple efforts used SPARQLX, something I
| created that was a binary form of
| SPARQL+SPARQLUpdate+StoredProcedures+Macros encoded using
| Variable Message Format. This was about 18 years ago, before
| SPARQL and SPARQL Update merged, and before FLWOR. One of
| these days I'll recreate it. The original work is not
| available, and I was not allowed to publish.
|
| Oh, and I forgot a few things: SPARQLX had triggers, was
| customized for OWL DLP, and had commands for custom import
| and export using N3 (I was a big fan of the cwm software).
| tannhaeuser wrote:
| You're right to emancipate yourselves from the grip that
| SemWeb has had on the field for so long and turn to
| Prolog/Datalog and practical approaches IMO. Open-world
| semantics and sophisticated theories may have been a vision
| for the semantic web of heterogeneous data, but in reality
| RDF and co are only used in certain closed-world niches IME.
|
| Pascal Hitzler is one of the more prolific authors (especially
| with the EU-funded identification of description logic
| fragments of OWL2 which are some of the better results in the
| field IMO), but beginning this whole discussion with W3C's
| RDF is wrong IMO, when description logics - more or less
| variable-free fragments of first-order logic with desirable
| complexities - were already a thing in 1991 or earlier.
|
| Nit: careful with Datomic. Its query language is clearly not
| Datalog but an ad-hoc syntax, whereas Datalog is a proper
| syntactic subset of Prolog. And while I don't like SPARQL,
| it still gives quite good compat for querying large graph
| databases.
| j-pb wrote:
| NitNit: I think "Datalog" the Prolog subset has pretty much
| been replaced by "Datalog" the conjunctive query fragment
| with recursion (and sometimes stratified negation).
|
| Most papers and textbooks I read these days use it as a
| complexity class for queries and not as a concrete syntax.
| LukeEF wrote:
| This is the sense in which I was using Datalog - and how
| others like Datomic, Grakn and Crux use it (there is a
| growing movement of databases with a 'Datalog' query
| language) - although in our case it can also be read in the
| former sense, as TerminusDB is implemented in Prolog.
| nut-hatch wrote:
| I completed my PhD in the scope of Semantic Web technologies,
| and I can share the same experience: the semantic web
| community is extremely closed (coming across as feeling
| "elite"). Having no supervisor from the field myself, it was
| still possible to publish my ideas (ISWC, WWW etc), but it
| was impossible to connect to the people and be taken
| seriously.
|
| I have moved on from that field now, and I don't expect to
| come in touch with any Semantic Web stuff in an open-world
| context any time soon.
|
| I couldn't agree more with you that the strong ideology that
| drives this community is one of the main reasons that these
| technologies are not widely adopted. This, and the failure to
| convince people outside academia that solving the problems it
| tries to solve is necessary in the first place.
|
| Good luck with TerminusDB, I think I listened to you at KGC.
| mark_l_watson wrote:
| Even though I have been working off and on with SW and linked
| data tech for twenty years, I share some of the skeptical
| sentiments in the comments here.
|
| I am keenly interested in fusion of knowledge representation with
| SW tech and deep learning. I wrote a short and effective NLP
| interface to DBPedia two weekends ago that you can experiment
| with on Google Colab
| https://colab.research.google.com/drive/1FX-0eizj2vayXsqfSB2...
| that leverages Hugging Face's transformer model for question
| answering. You can quickly see example use in my blog
| https://markwatson.com/blog/2021-01-18-dbpedia-qa-transforme...
| hyperion2010 wrote:
| The reason tools like Protege have not been sufficiently
| developed is infighting in the academic ontology community,
| in addition to the reasons listed by the author. It has set
| the whole community back at least 5 years.
| j-pb wrote:
| I think that's a symptom, not the cause.
|
| The complexity of web standards in general smothers them
| under their own weight. The common web has enough raw
| financial and personnel backing to grind through that. The
| semantic web does not.
|
| CURIEs and the standards they depend on alone run well over
| 100 pages. The language-tags spec alone has 90.
|
| RDF has around 100, SPARQL a combined total of more than
| 300, and OWL more than 500 - even though it assumes that the
| reader is generally familiar with description logics, so
| it's probably a couple thousand if you take the required
| academic literature into account.
|
| Nobody is going to read all of that, let alone build that.
|
| Especially not a bunch of academics who don't care about the
| implementation as long as it's good enough to get the next
| paper out the door.
|
| So everybody piles onto these few projects, because they're
| the only things that kinda work. OWLAPI, Protege... uh,
| that's it.
|
| Because everything else is broken and unfinished.
|
| Here's a thought experiment: name one production-ready RDF
| library for every major programming language (C, Java,
| Python, JS) that doesn't have major, stale, unresolved
| issues in its issue tracker. It's all broken, and there is
| simply too much work required to fix things.
|
| It's only natural that people start to infight when there
| are only a few hospitable oases.
|
| What we need is a simpler ecosystem, where people can stake
| their claim on their niche, where they have the ability and
| power to experiment and explore.
| namedgraph wrote:
| OWLAPI, Protege - that's it? RDF libraries broken? Dude, what
| rock are you living under? What about Jena, RDF4J, rdflib,
| redland, dotNetRDF etc.? Most of these libraries have been
| developed and tested for 20+ years and are active. See for
| yourself: https://github.com/semantalytics/awesome-semantic-
| web#progra...
|
| Why are you spreading FUD?
| syats wrote:
| I agree with this. It is common to hear "partial SPARQL 1.1
| support"... or "partial OWL compatibility" or "a variant of
| SKOS is supported". While it is true that full
| ECMA6/HTTP2/IPv6/SQL is also rarely provided by
| implementations, this doesn't hinder their use in production
| environments. I think it is rare to reach the parts of
| ECMAScript that aren't implemented, or the corners of SQL
| that Postgres/MariaDB don't support. In many parts of the
| "Semantic Web Stack", however, one quickly reaches a "not
| implemented" portion of the 500-page OWL standard.
| cheph wrote:
| > CURIEs and the depending standards alone are well over 100
| pages.
|
| The CURIE standard is 10 pages long, and those "dependent
| standards" include things like RFC 3986 (Uniform Resource
| Identifiers (URI): Generic Syntax) and RFC 3987
| (Internationalized Resource Identifiers (IRIs)) - well
| established technologies that most people should be familiar
| with. And you really don't need to read all of the
| referenced standards to be able to understand and use CURIEs
| quite proficiently.
|
| > RDF has like 100
|
| The normative specification of RDF is contained in two
| documents:
|
| - RDF 1.1 Concepts and Abstract Syntax (
| https://www.w3.org/TR/rdf11-concepts/ ) = 20 pages
|
| - RDF 1.1 Semantics ( https://www.w3.org/TR/rdf11-mt/ ) = 29
| pages
|
| These page counts include the TOC, reference sections,
| appendices and large swathes of non-normative content too.
|
| And really the RDF 1.1 primer
| (https://www.w3.org/TR/rdf11-primer/) should be quite
| sufficient for most people who want to use it, and that is
| only 14 pages.
|
| RDF and CURIE are simple as dirt, really - maybe too simple -
| but I think I can explain them quite well to someone with a
| basic background in IT in about 30 minutes.
|
| And while the other aspects (e.g. SPARQL, OWL) are not that
| simple, there is inherent complexity they are trying to
| address that you cannot just ignore. Not everybody needs to
| know OWL, and SPARQL is really not that complicated either;
| again, most people can become quite proficient with it
| rather quickly if they understand the basics.
|
| > What we need is a simpler ecosystem, where people can stake
| their claim on their niche, where they have the ability and
| power to experiment and explore.
|
| What are the alternatives? A proliferation of JSON Schemas,
| which is yet to be ratified as a standard and does not
| address most of the same problems as Semantic Web
| technology? I think there is some validity to your concerns,
| but semantic web technologies are being used widely in
| production - maybe not all of them, but to suggest they are
| not usable is not true.
|
| I have used RDF in Java (rdf4j and jena), Python (rdflib) and
| JS (rdflib.js) without serious problems.
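|
| To illustrate how little there is to it, a minimal rdflib
| sketch (identifiers made up); the prefix binding is all a
| CURIE is:
|
| ```
| from rdflib import Graph, Namespace, URIRef
|
| SCHEMA = Namespace("https://schema.org/")
| g = Graph()
| g.bind("schema", SCHEMA)  # "schema:author" now abbreviates the IRI
| g.add((URIRef("https://example.org/book/1"),
|        SCHEMA.author,
|        URIRef("https://example.org/person/42")))
| print(g.serialize(format="turtle"))  # prints the CURIE form
| ```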
| j-pb wrote:
| Familiarity isn't nearly enough if you want to implement
| something.
|
| Talking about RDF is absolutely meaningless without talking
| about serialisation (and that includes... URGH... XML
| serialisation), XML Schema datatypes, localisation,
| skolemisation, and the ongoing blank-node war.
|
| The semantic web ecosystem is the prime example of "the
| devil's in the details". Of course you can explain the
| general idea of RDF to somebody who knows what a graph is:
| "It's like a graph, but the edges are also reified as
| nodes." But that omits basically everything.
|
| It doesn't matter whether SPARQL is learnable; it matters
| whether it's implementable, let alone in a performant way.
| And that's really, really questionable.
|
| Jena is okay-ish, but it's neither pleasant to use nor
| bug-free, although Java has the best RDF libs generally (I
| think that's got something to do with academic selection
| bias). RDF4J has 300 open issues, but they also contain a
| lot of refactoring noise, which isn't a bad thing.
|
| C'mon, rdflib is a joke. It has a ridiculous 200 issues / 1
| commit a month ratio, it's buggy as hell, and it is for all
| intents and purposes abandonware.
|
| rdflib.js is in-memory only, so not something you could use
| in production for anything beyond simple stuff. Also,
| there's essentially ZERO documentation.
|
| And none of those except Jena even steps into the realm of
| OWL.
|
| > What are the alternatives?
|
| Good question.
|
| SIMPLICITY!
|
| We have an RDF replacement running in production that's
| twice as fast, and 100 times simpler. Our implementation
| clocks in at 2.5kloc, and that includes everything from
| storage to queries, with zero dependencies.
|
| By having something that's so simple to implement, it's
| super easy to port it to various programming languages,
| experiment with implementations, and exterminate bugs.
|
| We don't have triples, we have tribles (binary triples, get
| it, nudge nudge, wink wink): 64 bytes in total, which fits
| into exactly one cache line on the majority of
| architectures.
|
| 16-byte subject/entity | 16-byte predicate/attribute |
| 32-byte object/value
|
| These tribles are stored in knowledge bases with grow-set
| semantics, so you can only ever append (at a meta level,
| knowledge bases do support non-monotonic set operations),
| which is the only way you can get consistency with
| open-world semantics - something the OWL people apparently
| forgot to tell pretty much everybody who wrote RDF stores,
| as they all have some form of non-monotonic delete
| operation. Even SPARQL is non-monotonic with its OPTIONAL
| operator...
|
| Having a fixed size binary representation makes this
| compatible with most existing databases, and almost trivial
| to implement covering indices and multiway joins for.
|
| By choosing UUIDs (or ULIDs, or Timeflakes, or whatever -
| the 16 bytes don't care) for subject and predicate, we
| completely sidestep the issues of naming and schema
| evolution. I've seen so many hours wasted by ontologists
| arguing about what something should be called. In our case
| it doesn't matter: both consumers of the schema can choose
| their own name in their code. And if you want to upgrade
| your schema, simply create a new attribute id and change
| the name in your code to point to it instead.
|
| If a value is larger than 32 bytes, we store a 256-bit hash
| in the trible and store the data itself in a separate blob
| store (in our production case S3, but for tests it's the
| file system; we're eyeing an IPFS adapter, but that's only
| useful if we open-source it). This means it also works
| nicely with binary data, which RDF never managed to do
| well. (We use it to mix machine learning models with
| symbolic knowledge.)
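|
| Packing is roughly this simple (an illustrative Python
| sketch, not our production code; the padding rule is made
| up):
|
| ```
| import hashlib
|
| def pack_trible(entity: bytes, attribute: bytes, value: bytes) -> bytes:
|     # entity and attribute are 16-byte random ids
|     assert len(entity) == 16 and len(attribute) == 16
|     if len(value) > 32:
|         # inline a 256-bit hash; the real bytes go to the
|         # blob store, keyed by that hash
|         value = hashlib.sha256(value).digest()
|     return entity + attribute + value.ljust(32, b"\x00")  # 64 bytes
| ```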
|
| We stole the context approach from JSON-LD, so you can
| define your own serialisers and deserialisers depending on
| the context they are used in. So you might have a
| "legacyTimestamp" attribute which returns a util.datetime,
| and a "timestamp" which returns a JodaTime object. However,
| unlike JSON-LD these are not static transformations on the
| graph, but done just in time through the interface that
| exposes the graph.
|
| We have two interfaces. One based on conjunctive queries
| which looks like this (JS as an example):
|
| ```
| // define a schema
| const knightsCtx = ctx({
|   ns: {
|     [id]: { ...types.uuid },
|     name: { id: nameId, ...types.shortstring },
|     loves: { id: lovesId },
|     lovedBy: { id: lovesId, isInverse: true },
|     titles: { id: titlesId, ...types.shortstring },
|   },
|   ids: {
|     [nameId]: { isUnique: true },
|     [lovesId]: { isLink: true, isUnique: true },
|     [titlesId]: {},
|   },
| });
|
| // add some data
| const knightskb = memkb.with(knightsCtx, ([romeo, juliet]) => [
|   {
|     [id]: romeo,
|     name: "Romeo",
|     titles: ["fool", "prince"],
|     loves: juliet,
|   },
|   {
|     [id]: juliet,
|     name: "Juliet",
|     titles: ["the lady", "princess"],
|     loves: romeo,
|   },
| ]);
|
| // Query some data.
| const results = [
|   ...knightskb.find(knightsCtx, ({ name, title }) => [
|     { name: name.at(0).ascend().walk(), titles: [title] },
|   ]),
| ];
| ```
|
| and the other based on tree walking, where you get a proxy
| object that you can treat like any other object graph in
| your programming language: you just navigate it by
| traversing its properties, lazily creating a tree
| unfolding.
|
| Our schema description is also heavily simplified. We only
| have property restrictions and no classes. For classes
| there's ALWAYS a counterexample of something that
| intuitively is in the class but is excluded by the class
| definition. At the same time, classes are the source of
| pretty much all computational complexity. (Can't count if
| you don't have fingers.)
|
| We do have cardinality restrictions, but we restrict the
| range of each attribute to a single type. That way you can
| statically type-check queries and walks in statically
| typed languages. And remember, attributes are UUIDs and
| thus essentially free: simply create one attribute per
| type.
|
| In the above example you'll notice that queries are tree
| queries with variables. They're what's most common, and
| also what's compatible with the data structures and tools
| available in most programming languages (except maybe
| Prolog). However, we do support full conjunctive queries
| over triples, and that is what these queries get compiled
| to. We just don't want to step into the same impedance
| mismatch trap that Datalog steps into.
|
| Our query "engine" (much simpler, no optimiser for
| example), performs a lazy depth first walk over the
| variables and performs a multiway set intersection for
| each, which generalises the join of conjunctive queries, to
| arbitrary constraints (like, I want only attributes that
| also occur in this list). Because it's lazy you get limit
| queries for free. And because no intermediary query results
| are materialised, you can implement aggregates with a
| simple reduction of the result sequence.
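|
| The core loop is small enough to sketch (simplified
| Python; the `allowed` method on constraints is a made-up
| interface):
|
| ```
| from itertools import islice
|
| def walk(variables, constraints, binding=None):
|     # Depth-first: bind one variable at a time to the
|     # intersection of what every constraint allows; results
|     # are yielded lazily, so islice() is a free LIMIT.
|     binding = {} if binding is None else binding
|     if not variables:
|         yield dict(binding)
|         return
|     var, rest = variables[0], variables[1:]
|     candidates = set.intersection(
|         *(c.allowed(var, binding) for c in constraints))
|     for value in candidates:
|         binding[var] = value
|         yield from walk(rest, constraints, binding)
|         del binding[var]
|
| # first_ten = list(islice(walk(["name", "title"], my_constraints), 10))
| ```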
|
| The "generic constraint resolution" approach to joins also
| gives us queries that can span multiple knowledge bases
| (without federation, but we're working on something like
| that based on differential dataflow).
|
| Multi-kb queries are especially useful since our default
| in-memory knowledge base is actually an immutable
| persistent data structure, so it's trivial and cheap to
| work with many different variants at the same time. These
| efficiently support all set operations, so you can do
| functional logic programming a la "Out of the Tar Pit" in
| pretty much any programming language.
|
| Another cool thing is that our on-disk storage format is
| really resilient through its simplicity. Because the
| semantics are append-only, we can store everything in a
| log file. Each transaction is prefixed with a hash of the
| transaction and followed by the tribles of the
| transaction, and because of their constant size, framing
| is trivial.
|
| We can lose arbitrary chunks of our database and still
| retain the data that was unaffected. Try that with your
| RDBMS - you will lose everything. It also makes merging
| multiple databases super easy (remember: UUIDs prevent
| naming collisions, monotonic open-world semantics keep
| consistency, fixed-size tribles make framing trivial); you
| simply `cat db1 db2 > outdb` them.
|
| Again, all of this in 2.5kloc with zero dependencies (we do
| have one on S3 in the S3 blob store adapter).
|
| Is this the way to go? I don't know, it serves us well. But
| the great thing about it is that there could be dozens of
| equally simple systems and standards, and we could actually
| see which approaches are best, from usage. The semantic web
| community is currently sitting on a pile of ivory,
| contemplating how best to steer the Titanics that are
| Protege and OWLAPI through the waters of computational
| complexity, without anybody ever stopping to ask if that's
| REALLY been the big problem all along.
|
| "I'd really love to use OWL and RDF, if only the algorithms
| were in a different complexity class!"
| cheph wrote:
| > Talking about RDF is absolutely meaningless without
| talking about Serialisation (and that includes ...URGH..
| XML serialisation), XML Schema data-types, localisations,
| skolemisation, and the ongoing blank-node war.
|
| Don't implement XML serialization. The simplest and most
| widely supported serialization is N-Quads
| (https://www.w3.org/TR/n-quads/): 10 pages, again with
| examples, a TOC, and lots of non-normative content.
|
| You don't need to handle every datatype - you can't even
| if you wanted to, because datatypes are not a fixed set.
| And whatever you need to know about skolemisation,
| localization, and blank nodes is in the standards AFAIK.
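|
| It really is a line-per-statement format; a minimal rdflib
| sketch (data made up):
|
| ```
| from rdflib import ConjunctiveGraph
|
| nq = ('<http://example.org/s> <http://example.org/p> "o" '
|       '<http://example.org/g> .\n')
| g = ConjunctiveGraph()
| g.parse(data=nq, format="nquads")  # one quad per line, full IRIs
| print(len(g))  # 1
| ```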
|
| > C'mon, rdflib is a joke. It has a ridiculous 200 issues
| / 1 commit a month ratio, buggy as hell, and is for all
| intents and purposes abandonware.
|
| It works; not all functionality is perfect, but like I
| said, I have used it and it worked just fine.
|
| > rdflib.js is in memory only, so nothing you could use
| in production for anything beyond simple stuff. Also
| there's essentially ZERO documentation.
|
| For processing RDF in the browser it works pretty well;
| not sure what you expect, but to me RDF support does not
| imply it should be a fully fledged triple-store with disk
| backing. Also, it's not really zero documentation:
| https://github.com/linkeddata/rdflib.js/#documentation
|
| > > What are the alternatives?
|
| > SIMPLICITY!
|
| > But the great thing about it is that there could be
| dozens of equally simple systems and standards, and we
| could actually see which approaches are best, from usage.
|
| Okay, so you roll your own that fits your use case. Not
| much use to me, and it is not a standard. Let's talk again
| when you standardize it. Otherwise, do you mind giving an
| alternative that I can actually take off the shelf, to at
| least the extent that I can with RDF?
|
| I am not going to roll my own standard, and if all the RDF
| data sets used their own standards instead of RDF, it
| wouldn't really improve anything.
|
| EDIT: If you compare support for RDF to JSON schema,
| things are really not that bad.
| j-pb wrote:
| > Don't implement XML serialization. The simplest and most
| widely supported serialization is N-Quads
| (https://www.w3.org/TR/n-quads/): 10 pages, again with
| examples, a TOC, and lots of non-normative content.
|
| You omit the transitive hull that the N-Quads standard
| drags along, as if implementing a deserializer somehow
| only involved a parser for the top-level EBNF.
|
| Also, you're still tip-toeing around the wider ecosystem
| of OWL, SHACL, SPIN, SAIL and friends. The fact that RDF
| alone even allows for that much discussion is indicative
| of its complexity. It's like a discussion about SVG and
| HTML that never goes beyond SGML.
|
| And you can't have your cake and eat it too. You either
| HAVE to implement the XML syntax or you won't be able to
| load half of the world's datasets, nor will you even be
| able to start working with OWL, because they do
| EVERYTHING with XML.
|
| You're still coming at this from a user perspective. RDF
| will go nowhere unless it finds a balance between
| usability and implementability. Currently, I'd argue, it
| focuses on neither.
|
| JS is a bigger ecosystem than just the browser; if you
| want to import any real-world dataset (or persistence)
| you need disk backing. So anything that just goes _poof_
| on a power failure doesn't cut it.
|
| Sorry but "works pretty well", and 6 examples combined
| with an unannotated automatically extracted API, does not
| reach my bar for "production quality".
|
| It's that "works pretty well" state of the entire RDF
| ecosystem that I bemoan. It's enough to write a paper
| about it, it's not enough to trust the future of your
| company on. Or you know. Your life. Because the ONLY real
| world example of an OWL ontology ACTUALLY doing anything
| is ALWAYS Snowmed. Snowmed. Snowmed. Snowmed.
|
| [A joke we always told about theoreticians finding a new
| lower bound and inference engines winning competitions:
| "Can snowmed be used to diagnose a patient?" "Well it
| depends. It might not be able to tell you what you have,
| but it can tell you that your 'toe bone is connected to
| the foot bone' 5 million times a second!"]
|
| Imagine making the same argument for SQL, it'd be trivial
| to just point to a different library/db.
|
| And so far we've only talked about complexity inherent in
| the technology, and not about the complex and hostile
| tooling (a.k.a. Protege) or even the absolutely
| unmaintainable rat's nests that big ontologies devolve
| into.
|
| Having a couple different competing standards would
| actually improve things quite a bit, because it would
| force them to remain simple enough that they can still
| somehow interoperate.
|
| It's a bit like YAGNI. If you have two simple standards,
| it's trivial to make them compatible by writing a tool
| that translates one to the other, or even speaks both. If
| you have one humongous one, it's nigh impossible to have
| two compatible implementations, because they will diverge
| in some minute thing. See Rich Hickey's talk "Simplicity
| Matters" for an in-depth explanation of the difference
| between simple (few parts with potentially high overall
| complexity through intertwinement and parts taking
| multiple roles) and decomplected (consisting of
| independent parts with low overall system complexity).
|
| And regarding JSON Schema: I never advocated for JSON
| Schema, and the fact that you have to compare RDF's
| maturity to something that hasn't been released yet...
|
| You would expect a standard that work began on 25 YEARS
| ago to be a bit more mature in its implementations. If it
| hasn't reached that after all this time, we have to ask
| why. And my guess is that implementors see the standards
| _and_ their transitive hull and go TL;DR, and even if
| they try, they get overwhelmed by the sheer amount of
| stuff.
| namedgraph wrote:
| RDF is absolutely about triples and the corresponding
| graph form! The serialization formats are immaterial,
| orthogonal. The fact that you think it's about any
| particular format makes it obvious that you don't get it
| at all.
| orzig wrote:
| I don't even work with semantic technologies, but I just
| love the structure and completeness of arguments in the
| space. I suppose I should not make enemies by being
| specific, but compare this comment to the average (or
| even 90th percentile) argument on almost any other topic.
|
| Although it looks like HN now needs to implement a
| "download to Kindle" feature :-)
| j-pb wrote:
| I'm flattered <3
| wuschel wrote:
| Hi,
|
| thank you for the really cool post! I am trying to
| understand some key concepts here, so please forgive the
| simple questions:
|
| > 16byte subject/entity | 16 byte predicate/attribute |
| 32 byte object/value
|
| What is the difference between subject and entity? Do
| you include a timestamp of your entries next to each
| trible?
|
| > Having a fixed size binary representation makes this
| compatible with most existing databases (...)
|
| Are you using an external lookup table to identify the
| human-language definition of an entry, and keeping the
| 2^128 possible entries for internal use?
|
| > (...) if you want to upgrade your schema (...)
|
| > We stole the context from jsonLD (...)
|
| What were the reasons you did not use JSON-LD as a base
| for your software?
|
| Could you perhaps point me to a case study of your
| system, or, if this is not possible, a similar case
| published in the literature/www etc? I would love to
| learn more about what you are doing (my contact is in my
| profile).
|
| Wish I could upvote you a couple of times. Thank you.
| j-pb wrote:
| Glad that you like it :D This actually pushes me a bit
| more into the direction of open-sourcing the whole thing,
| we kinda have it planned, but it's not a priority at the
| moment, because we use it ourselves quite happily :D.
|
| Subject and entity are the same thing, just different
| names for it. People with a graph DB background will more
| commonly use [entity attribute value] for triples, while
| people from the Semantic Web community commonly use
| [subject predicate object].
|
| We don't use timestamps, but we just implemented something
| we call UFO-IDs (Unique, Forgettable, Ordered), where we
| store a 1-second-resolution timer in the first 16 bits,
| which improves data locality and allows us to forget
| irrelevant tribles within an 18h window (which is pretty
| nice if you do e.g. robotics or virtual personal
| assistants), while at the same time still exercising the
| overflow case regularly (in comparison to UUIDv1, ULID,
| or Timeflakes), and not losing too many bits of entropy
| (especially in cases where the system runs longer than
| 18h).
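|
| Generation is roughly this (an illustrative sketch, not
| our production code):
|
| ```
| import os, time
|
| def ufo_id() -> bytes:
|     # 16-bit seconds counter wraps every 2**16 s (~18.2 h),
|     # giving locality and a bounded forget-window, while
|     # the wrap case gets exercised daily; the remaining
|     # 112 bits are random.
|     seconds = int(time.time()) & 0xFFFF
|     return seconds.to_bytes(2, "big") + os.urandom(14)
| ```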
|
| The 128 bits are actually big enough that you can just
| choose any random value and be pretty darn certain that
| it's unique (UUIDv4 works that way). 64 bytes / 512 bits
| not only fits into cache lines, it's also the smallest
| size that is statistically "good enough": 128-bit random
| IDs (entity and attribute) are unlikely enough to
| collide, and 256-bit hashes (the value) are likewise good
| enough for the foreseeable future to avoid content
| collisions.
|
| And yeah, well, the human-language name, as well as all
| the documentation about the attribute, is actually stored
| as tribles alongside the data. We use them for code
| generation for statically typed programming languages
| (which allows us to hook into the language's type
| checker), to create documentation on the fly, and to
| power a small ontology editing environment (take that,
| Protege ;) ).
|
| We kinda use it as middleware, similar to ROS, so it has
| to fit into the same soft-realtime, static-typing,
| compile-everything niche, while at the same time allowing
| for explorative programming in dynamic languages like
| JavaScript. We use ObservableHQ notebooks to do all kinds
| of data analysis, so naturally we want a nice workflow
| there.
|
| JSON-LD is heavily hooked into the RDF ecosystem. We
| actually started in the RDF space, but it quickly became
| apparent that the overall complexity was a show stopper.
|
| Originally this was planned to bring the sub-projects of
| a large EU research project closer together and encourage
| collaboration. We found that every sub-project wanted to
| be the hub that connected all the other ones.
|
| With a 2.5kloc implementation, we figured, everybody
| could "own" the codebase and associate with it - make it
| so stupid and obvious that everybody feels like they came
| up with the idea themselves. The good old inverse Conway
| manoeuvre.
|
| JSON-LD is also very static in the way it allows you to
| reinterpret data (RDF in =churn=> JSON out), and we
| wanted to be able to do so dynamically, so that when you
| refactor code to use new attribute variants (e.g. with
| different deserialisations) you can do so gradually.
| Also, dynamic is a lot faster.
|
| The tribles-instead-of-triples idea came when we noticed
| that basically every triple store implementation does a
| preprocessing step where CURIEs are converted to u64 /
| 8-byte integers to be stored in the indices.
|
| We just went: "Well, we could either put 24 bytes in the
| index and still have to do 3 additional lookups. Or we
| could put 64 bytes (2.5x) in there and get range queries,
| sorting, and no additional lookups, with essentially the
| same write and read characteristics [because our adaptive
| radix tree index compresses all the random bits]." 64-bit
| words are already pretty darn big...
|
| Currently there is nothing published (except for it being
| vaguely mentioned in some linguistics papers), and no
| studies done. They are planned though, but as this isn't
| our source of income it's lowish priority (much to my
| dismay :D).
|
| Keep an eye on tribles.space tho ;)
|
| Edit: Ah well, why wait, might as well start building a
| community :D
|
| https://discord.gg/KP5HBYfqUf
| stareatgoats wrote:
| This seems to me to be an insightful and comprehensive overview
| of the Semantic Web, both current status and how we got here.
| People like me, who have long wanted to better understand the
| (obviously sprawling) concepts involved, will be able to use
| the article as a good entry point.
|
| That said, the expressed hope of consolidation in the field is
| likely still some way off. AI has taken over a lot of the promise
| that the Semantic Web originally held. But AFAICS there are two
| drivers (also mentioned in the article) that potentially could
| provide the required impetus for a reignited interest in the
| Semantic Web:
|
| Firstly, the need for explainable AI, and secondly, the
| probable(?) coming breakthrough in natural language processing
| and automatic extraction of knowledge graphs or ontologies
| from text.
|
| All in all, it seems way too early to write off the Semantic Web
| field at this point.
| tragomaskhalos wrote:
| My 10,000 ft layperson's view, to which I invite corrections, is
| broadly:
|
| - The semantic web set off with extraordinarily ambitious goals,
| which were largely impractical
|
| - The entire field was trumped by Deep Learning, which takes as
| its premise that you can _infer_ relationships from the exabytes
| of human rambling on the internet, rather than having to
| laboriously encode them explicitly
|
| - Deep Learning is not after all a panacea, but more like a very
| clever parlour trick; put otherwise, intelligence is more than
| linear algebra, and "real" intelligences aren't completely fooled
| by one pixel changing colour in an image, etc.
|
| - Hence, we have come back round to point 1 again?
| cheph wrote:
| Deep Learning does not even operate in the same space where
| most of the Semantic Web is being used today; some examples:
|
| - https://schema.org/
|
| - https://www.wikidata.org/
|
| - https://lod-cloud.net/
|
| - http://www.ontobee.org/
|
| -
| https://catalog.data.gov/dataset?res_format=RDF&_res_format_...
|
| - https://ukparliament.github.io/ontologies/
|
| -
| https://ckan.publishing.service.gov.uk/dataset?res_format=SP...
|
| -
| https://ckan.publishing.service.gov.uk/dataset?res_format=RD...
|
| - https://data.nasa.gov/ontologies/atmonto/index.html
|
| - https://data.europa.eu/euodp/linked-data
| fauigerzigerk wrote:
| _> The entire field was trumped by Deep Learning, which takes
| as its premise that you can infer relationships from the
| exabytes of human rambling on the internet, rather than having
| to laboriously encode them explicitly_
|
| I don't think machine learning can ever replace data modeling,
| because data modeling is often creative and/or normative. If
| we want to express what data _must_ look like and which
| relationships there _should_ be, then machine learning doesn't
| help, and we have no choice but to laboriously encode our
| designs. And as long as we model data we will have a need for
| data exchange formats.
|
| You could categorise data exchange formats as follows:
|
| a) Ad-hoc formats with ill-defined syntax and ill-defined
| semantics. That would be something like the CSV family of
| formats or the many ad-hoc mini-formats you find in database
| text fields.
|
| b) Well-defined syntax with externally defined, often
| informal, semantics. XML and JSON are examples of that.
|
| c) Well-defined syntax with some well-defined formal
| semantics. That's where I see Semantic Web standards such as
| RDF (in its various notations), RDFS and OWL.
|
| So if the task is to reliably merge, cleanse and interpret
| data from different sources, then we can achieve that with
| less code on the basis of (c)-type data exchange formats.
|
| But it seems we're stuck with (b). I understand some of the
| reasons. The Semantic Web standards are rather complex and at
| the same time not powerful enough to express all the things we
| need. But that is a different issue than what you are talking
| about.
| breck wrote:
| I think you are spot on.
|
| I think what we'll see is Deep Learning/Human Editor "Teams".
|
| DL will do the bulk of the relationship encoding, but human
| domain experts will do "code reviews" on the commits made by DL
| agents.
|
| Over time fewer and fewer commits will need to be reviewed,
| because each one trains the agent a bit more.
| ivan_ah wrote:
| Wow, what a great summary with lots of realism and nuances. I
| agree with the author's conclusions that what is missing is
| consolidation and interoperability between standards (e.g. make
| Protege easier to use and ensure libraries for RDF parsing and
| serializations exist for all languages). No technology will be
| adopted if it requires PhD-level ability to handle jargon and
| complexity... but if there were tutorials and HOWTOs, we could
| see big progress.
|
| Personally, I'm not a big fan of the "fancy" layers of the
| Semantic Web Stack like OWL (see
| https://en.wikipedia.org/wiki/Semantic_Web_Stack ), but the basic
| layers of RDF + SPARQL as a means for structured exchange of data
| seem like a solid foundation to build upon.
|
| It's really simple in the end: we've got databases and
| identifiers. INTERNALLY to any company or organization, you
| can set up a DB of your choosing and ensure data follows a
| given schema, with data linked through internal identifiers.
| When you want to publish data EXTERNALLY, you need "external
| identifiers" for each resource, and URIs are a logical choice
| for this (this is also a core idea of REST APIs of hyperlinked
| resources). Similarly, communicating data using a generic
| schema capable of expressing arbitrary entities and relations,
| like RDF and JSON-LD, is a logical next step, rather than
| each API using its own bespoke data schema...
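|
| For example, a minimal rdflib sketch (network access and
| DBpedia's content negotiation assumed):
|
| ```
| from rdflib import Graph
|
| g = Graph()
| g.parse("http://dbpedia.org/resource/Semantic_Web")  # fetches RDF
| q = """
| PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
| SELECT ?label WHERE {
|   <http://dbpedia.org/resource/Semantic_Web> rdfs:label ?label .
|   FILTER (lang(?label) = "en")
| }"""
| for row in g.query(q):
|     print(row.label)
| ```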
|
| As for making web data machine-readable, the key there is
| KISS: efforts like schema.org, with opt-in, progressive-
| enhancement annotations, are very promising.
|
| For anyone wanting to know more about this domain, there is an
| online course here:
| https://www.youtube.com/playlist?list=PLoOmvuyo5UAeihlKcWpzV...
| The whole course is pretty deep (would take a month to go through
| it all), but you can skip ahead to lectures of specific interest.
| bryanrasmussen wrote:
| As always, one should look at Metacrap
| (http://www.well.com/~doctorow/metacrap.htm) when discussing
| the semantic web:
|
| - Certain kinds of implicit metadata is awfully useful, in fact.
| Google exploits metadata about the structure of the World Wide
| Web: by examining the number of links pointing at a page (and the
| number of links pointing at each linker), Google can derive
| statistics about the number of Web-authors who believe that that
| page is important enough to link to, and hence make extremely
| reliable guesses about how reputable the information on that page
| is.
|
| This sort of observational metadata is far more reliable than the
| stuff that human beings create for the purposes of having their
| documents found. It cuts through the marketing bullshit, the
| self-delusion, and the vocabulary collisions.
|
| in short, engineering triumphs over data entry.
| sammorrowdrums wrote:
| I have found that job postings are an exception. Google picks
| up on them, and has a special API to submit them directly
| (due to slow crawling) and to close them.
|
| So long as you're a good actor, that will get you far. If
| your data is low-quality, wrong, or error-prone, you'll not
| get shown, and you will likely receive manual actions and end
| up in Google's proverbial sin bin.
|
| I have found that incentives align for job postings.
|
| That obviously doesn't prove that metadata is not flawed, just
| that there are areas where it seems to work well.
| actsofthecla wrote:
| I'm only a hobbyist in this area, but I wonder why the review
| doesn't mention some of the graph databases as, at least,
| semantic web adjacent. Their relative success seems to lend
| credence to the overall vision of the semantic web and its
| supporting technologies. For example, are there really more
| than surface syntactic differences between SPARQL and Cypher?
|
| Even though it was over-hyped, I like the semantic web
| because it supports a conception of the future that includes
| something other than neural-network black boxes. However,
| whether the ideas deliver remains to be seen.
|
| If anyone is looking for an introduction, then I think the Linked
| Data book from Manning is worth mentioning--it might be a little
| dated at this point. The author provides a coherent introduction
| and helps, especially, in cutting through the confusing
| proliferation of acronyms that characterizes this field. As
| others have mentioned, reliable software is a major stumbling
| block. It's especially unfortunate that there isn't better
| browser support - for RDFa, for example.
| namedgraph wrote:
| Check out our SPARQL-driven Knowledge Graph management system :)
| https://atomgraph.github.io/LinkedDataHub/
| tammet wrote:
| The whole field has been dominated by research, i.e. the wish
| to make simple things complicated (in order to publish
| papers), as opposed to engineering, i.e. making complicated
| things simple (in order to produce usable software
| efficiently). As a result, the standards are horrendously -
| and needlessly - complicated. The few major practical
| outcomes, like schema.org, JSON-LD and the Google annotation
| system, are results of engineering, not research. Alas,
| JSON-LD has also taken a turn towards hypercomplexity.
| huskyr wrote:
| Yeah, this is an unfortunate consequence of having the whole
| ecosystem mostly within academia, including the lack of
| tutorials and proper documentation (e.g. something other than
| a 500-page standard).
|
| IMO the most interesting place right now for semantic web
| development is Wikidata. It's still pretty difficult for
| newcomers to contribute (as is the case for all Wikimedia
| projects) but at least it has many eyeballs and a very active
| community / ecosystem.
| krallistic wrote:
| Maybe a good indicator that there is only minor (industry)
| need/benefit. The "biggest" knowledge graph is Google's, but
| it is unclear how much of it is actually Semantic Web and how
| much is search, ML, NLP, etc.
|
| They are all nice ideas, but the practical use cases are
| rare. I am skeptical of the often-touted use case in
| medicine/drug interactions. The only time I saw it in
| industry, it was not really used by the lab technicians,
| because all the questions the system could answer were
| trivial. The promise that "the system can infer new
| combinations/interactions" was never fulfilled.
| cheph wrote:
| > The "biggest" Knowledge Graph is Google, but it is unclear,
| how much there is actually Semantic Web and how much search,
| ML, NLP etc..
|
| The second biggest is possibly WikiData, and it is not that
| small.
|
| As to the practical use cases, there are many; for one, it is
| the premier way of encoding metadata for search engines:
| https://schema.org/docs/about.html
|
| And the amount of datasets and ontologies that exist is quite
| vast:
|
| - https://lod-cloud.net/dataset
|
| - http://obofoundry.org/
|
| - https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_
| M...
|
| I would like to understand: what other options would you
| consider better for these datasets, for the metadata and for
| the ontologies?
|
| I mean if not RDF for web metadata then what? If not semantic
| web for UK govt data
| (https://ukparliament.github.io/ontologies/,
| https://opendatacommunities.org/data_home, https://ckan.publi
| shing.service.gov.uk/dataset?res_format=RD..., https://ckan.p
| ublishing.service.gov.uk/dataset?res_format=SP...) then what?
|
| It would be nice to have something even better, but I much
| prefer RDF to a bunch of CSV files.
| namedgraph wrote:
| Explain this Knowledge Graph usage by Fortune 500 companies
| then: http://sparql.club/
| breck wrote:
| I agree, the research is overly complicated.
|
| So it's a lot of extra work to sift through, but I've found a
| lot of gold in there.
|
| If you're looking for a simple, noise-free way to do the
| semantic web, I'm very confident that Tree Notation will enable
| it (https://treenotation.org/).
|
| I've played around a bit with turning Schema.org into a Tree
| Language, and I think that would be a fruitful exercise, but
| there's plenty more on the plate first.
|
| FWIW I've pitched this concept to the W3C for 4 or 5 years to
| no avail yet. I think, though, that if someone can put
| together a decent prototype the idea might start clicking.
|
| Imagine a noise-free way to encode the semantic web with
| natural 3-D positional semantics. Could be cool!
| xkvhs wrote:
| Maybe for its time it seemed like a good idea... like SOAP,
| or manual feature engineering for image classification.
| Today, it's clear that language and knowledge don't really
| work like that, and it's not practical to approach them this
| way. I learned about OWL and SPARQL 12 years ago, and it
| already felt like a very dated idea. But then, who knows...
| Everybody had given up on NNs once too.
| krallistic wrote:
| The comparison to NLP presents a good view of the problems.
|
| It's "easy" to write some logic rules to parse input text for
| a 50% demo. But then you want to improve and scale, and
| suddenly all the nuances bite you. The rules get bigger,
| nested and complicated. Traditional NLP tried that avenue for
| a while, with decent success in small use cases, but without
| success for larger problems. (Compare stuff like BERT & GPT,
| which still have a lot of problems.)
|
| It is similar with knowledge graphs: you can show some nice
| properties of inferring knowledge on small problems, but the
| real world is much more approximate and unclear than some
| (binary) relationships.
|
| Personally, I think we humans lack the mental capacity to
| build large models with complex interactions.
| namedgraph wrote:
| Right... except that Uber, Boeing, JP Morgan Chase, Nike,
| Electronic Arts etc. etc. are looking for SPARQL developers
| right now: http://sparql.club/
| cheph wrote:
| > Today, it's clear that language and knowledge don't really
| work like that, and it's not practical to approach them this
| way.
|
| There are many applications of the Semantic Web that have
| little to do with natural language. If you have a better
| option for all the existing RDF data sets
| (https://lod-cloud.net/, https://www.wikidata.org/) and
| ontologies (http://www.ontobee.org/, https://schema.org/), it
| would be good to be explicit about it.
|
| I would prefer to have more data (e.g. US Federal Reserve
| data, World Bank data) as RDF and accessible via SPARQL
| endpoints than less, because it is much more useful as RDF
| than as CSV, in my opinion.
| mxmilkb wrote:
| Nice, though there is nothing about Turtle or LV2:
|
| https://www.w3.org/2007/02/turtle/primer/
|
| https://github.com/lv2/lv2/wiki
|
| Also, #swig (semantic web interest group) exists on freenode.
| smarx007 wrote:
| I do research in this field, but I was a programmer by
| training before I entered it. I have talked to many academics
| and they agree that industry needs something simpler, more
| approachable, something that solves its problems in a more
| direct way - so it's definitely not an "academic exercise"
| for many researchers.
|
| However, I failed to convince people that we need to implement
| the 2001 SciAm use case (https://www-
| sop.inria.fr/acacia/cours/essi2006/Scientific%20..., see the
| intro before the first section) using 2021 technologies
| (smartphones are here, assistants are here, shared calendars are
| easy, companies have APIs, the only thing missing is a proper
| glue using semantic web tech). This goes to the core thesis
| of this paper: the semantic web is awesome as a set of ideas
| and approaches, but the Semantic Web as the result of all
| this work may look underwhelming or irrelevant today. I like
| to point
| everyone who disagrees with me to the 1994 TimBL presentation at
| CERN (https://videos.cern.ch/record/2671957) where he talks about
| the early vision of semantic web (https://imgur.com/aS2dbf6 or
| around 05:00 in the video), which looks awfully like IoT (many
| years before the term even existed). We simply cannot fault
| someone who envisioned communication technologies for IoT in 1994
| for getting the technology a bit wrong.
|
| Today's technologies simply cannot properly handle the use
| cases for which SemWeb was designed:
|
| 1) The web is still not suitable for machines. Yes, we have
| IoT devices that use APIs, but nobody would say it's truly M2M
| communication at its best. When APIs go down, devices get
| bricked; there is no way to get those devices to talk to any
| other APIs. There is no way for two devices in a house to talk
| to each other unless they were explicitly programmed to do so.
|
| 2) We don't have common definitions for the simplest of
| terms. Schema.org made some progress, but it's very limited
| because it serves search-engine interests, not the IoT
| community. There is no reason something like XML NS or RDF NS
| should not be used across every microservice in a company.
| Using a key (we call them predicates, but that's not
| important here) like "email:mbox" (defined in
| https://www.w3.org/2000/10/swap/ a very long time ago) you
| can globally denote that the value is an email address (see
| the sketch after point 3 below).
| 3) Correctness of data and endpoint definition still matters. We
| threw away XML and WSDL but came back to develop JSON Schema and
| Swagger.
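|
| As a sketch of point 2 above (illustrative only; I use the
| FOAF mbox IRI here rather than the SWAP one):
|
| ```
| import json
|
| doc = {
|     # the context maps a local key to a globally defined
|     # predicate IRI, so any service reading "mbox" knows
|     # exactly what the value means
|     "@context": {"mbox": {"@id": "http://xmlns.com/foaf/0.1/mbox",
|                           "@type": "@id"}},
|     "@id": "https://example.org/people/pete",
|     "mbox": "mailto:pete@example.org",
| }
| print(json.dumps(doc, indent=2))
| ```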
|
| We are trying to get there. JSON Schema, Swagger etc. all
| make efforts in the direction of the problems SemWeb tried to
| address. One of the most "semantic" efforts I have seen
| recently is GraphQL federation, which has been a semantic web
| dream for a long while: being able to get the information you
| need by querying more than one API. This only indicates that
| the problems the semantic web tried to address are still
| real.
|
| If anyone has attempted an OSS reimplementation of the 2001
| "Pete and Lucy" semantic web use case (i.e. as an Android app
| and a bunch of microservices), please point me in the right
| direction.
| Otherwise, if anyone is interested in doing it, I am all ears
| (https://gitter.im/linkeddata/chat is an active place for the
| LOD/EKG/SW discussion).
| namedgraph wrote:
| We wasted 20 years trying to replace one form of brackets
| with another (XML vs. JSON). WHATWG and the browser vendors
| are responsible for this, just as they are for the fact that
| we still don't have a machine-readable web. FAANG crawls the
| structured schema.org metadata like nobody else can and
| profits from it, and the rest of us are left with the HTML5
| and JavaScript crap.
| kmerroll wrote:
| Honestly, it is disconcerting to see mostly negative responses
| in this thread: awful community, overly complicated, research
| focused, academic nitwits gone wild, etc. Pretty sure there's
| some truth here, but I would suggest the deeper argument is
| against the semantic web as an evolution of the World Wide
| Web. Agreed, this isn't likely to happen in my lifetime.
|
| Right up there with the mostly hated JavaScript, I happen to
| think there are good parts of the semantic web technologies,
| and the pivot towards industry adoption of the graph data
| models related to knowledge graphs, ontologies, and SPARQL
| shows there are benefits outside of academic paper mills. I
| don't have a dog in this fight (TerminusDB), but applying
| some reasonable expectations and accepting the limitations of
| the semantic web tools has been very successful on many
| projects. Even more so, innovation and improvements in graph
| data repositories are making triple-stores and graph-based
| models compelling for some use cases. I am not going back to
| CSV hell if there are better alternatives.
___________________________________________________________________
(page generated 2021-01-26 23:02 UTC)