[HN Gopher] Machine learning's crumbling foundations
___________________________________________________________________
Machine learning's crumbling foundations
Author : Anon84
Score : 154 points
Date : 2021-08-22 16:43 UTC (6 hours ago)
(HTM) web link (pluralistic.net)
(TXT) w3m dump (pluralistic.net)
| unbroken wrote:
| There's a nice substack I found that is precisely about this
| problem and wider variations of it, that is, the problem of
| figuring out numbers that actually tell you something about the
| universe:
|
| https://desystemize.substack.com
| 6gvONxR4sf7o wrote:
| > One common failure mode? Treating data that was known to be of
| poor quality as if it was reliable because good data was not
| available... they also use the poor quality data to assess the
| resulting models.
|
| This drives me nuts. Spend $10k getting high quality data and
| throw a simple model at it? Nah, let's spend a month of time from
| someone making $400k/yr for less trustworthy results. And on the
| blogosphere it's even worse. 'This is the best data available so
| here goes' justifies so much worse-than-worthless BS.
|
| And don't even get me started on the 'better than human'
| headlines that result.
| zmmmmm wrote:
| I feel like it's even worse than just resorting to bad data when
| good data isn't available: the field of deep learning has
| cultivated the perception that it's robust to bad data as one
| of its hallmarks.
|
| That is, you can pump relatively raw data into it and it will
| self-select features and then self-regulate their use, so most
| of the initial steps of data cleaning, feature selection, etc.
| are not necessary, or require less expertise. This is now
| spilling over into general ML, so quacks assert that their
| model just magically overcomes these things and people actually
| believe it.
| nerdponx wrote:
| The irony is that there _are_ techniques for dealing with
| noisy/mislabeled/bad data (e.g. gold loss correction [0],
| errors-in-variables models [1]), but that stuff isn't "sexy"
| and not enough practitioners know about it.
|
| 0: https://arxiv.org/abs/1802.05300
|
| 1: https://en.m.wikipedia.org/wiki/Errors-in-variables_models
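|
| To make [0] concrete, here's a minimal sketch of the gold loss
| correction idea, assuming you have a small trusted "gold" set
| alongside the big noisy one (variable names are mine, not the
| paper's):
|
|   import numpy as np
|   from sklearn.linear_model import LogisticRegression
|
|   def estimate_corruption(X_noisy, y_noisy, X_gold, y_gold, k):
|       """C[i, j] ~= P(noisy label j | true label i), from gold data.
|       Assumes classes 0..k-1 all appear in y_noisy and y_gold."""
|       noisy_model = LogisticRegression(max_iter=1000).fit(X_noisy, y_noisy)
|       probs = noisy_model.predict_proba(X_gold)
|       C = np.stack([probs[y_gold == i].mean(axis=0) for i in range(k)])
|       return C / C.sum(axis=1, keepdims=True)
|
|   def glc_fit(X, y_noisy, C, k, lr=0.5, epochs=300):
|       """Softmax regression whose loss is routed through C."""
|       Y = np.eye(k)[y_noisy]                 # one-hot noisy labels
|       W = np.zeros((X.shape[1], k))
|       for _ in range(epochs):
|           z = X @ W
|           p = np.exp(z - z.max(axis=1, keepdims=True))
|           p /= p.sum(axis=1, keepdims=True)  # model of the TRUE label
|           q = p @ C                          # implied noisy-label dist.
|           g = -(Y / np.clip(q, 1e-12, None)) @ C.T   # dLoss/dp
|           dz = p * (g - (g * p).sum(axis=1, keepdims=True))
|           W -= lr * X.T @ dz / len(X)        # cross-entropy gradient step
|       return W  # argmax(X_new @ W) now predicts *true* labels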
| howmayiannoyyou wrote:
| Really depends on the domain and the engine. OpenAI code
| generation is staggering
| (https://www.youtube.com/watch?v=SGUCcjHTmGY&t=1214s), but its
| summarization and classification are still very much a work in
| progress.
| safog wrote:
| I agree with this - when there's an economic incentive to get
| clean data, you get clean data.
|
| For instance, there's a lot of manual clean up work put into
| things like training data sets for speech recognition because
| there has been a lot of investment there. Same with self
| driving I assume because so much $$$ got invested there.
|
| Radiology scans or cough based COVID detectors or medical
| claims on the other hand? I wouldn't expect it. It's just
| researchers trying to get a quick paper without adequate
| funding.
| Orou wrote:
| Most ML requires collecting, cleaning, and transforming
| datasets into something that a model can train on for a
| specific domain. Codex and Copilot aren't good examples of this
| because they are training on terabytes of public code repos -
| meaning that there is no code cleaning step. It's relying on
| the sheer volume of data that is being processed to try and
| filter the 'unclean' data (think buggy code written by a human)
| out of the model.
|
| These are really the exception rather than the rule when it
| comes to collecting data for ML/AI applications.
| thaumasiotes wrote:
| > The disdain for the qualitative expertise of domain experts who
| produce data is a well-understood guilty secret within ML
| circles, embodied in Frederick Jelinek's ironic talk, "Every time
| I fire a linguist, the performance of the speech recognizer goes
| up."
|
| This reminds me of how the chimp sign language studies got much
| better results from hearing evaluators than from deaf ones.
| pjscott wrote:
| Doctorow seems to be missing the meaning of that quote (which
| is also not the title of a talk, ironic or otherwise). It was
| specifically a comment on the usefulness of computer language
| models created manually based on linguistic theories of grammar
| versus ones in the same model family created automatically from
| real-world data -- the latter tended to work better. These days
| I usually hear it quoted more broadly as a warning about the
| danger of encoding too much possibly-wrong domain knowledge in
| an ML system when a more generic model and the training data
| are sufficient to learn the useful parts on their own.
|
| Neither of those translates into disdain for _qualitative
| understanding of the underlying reality_ behind the data set,
| which is one of those things that everyone knows is important.
| The problem is that such understanding is actually hard, and
| easy to mess up even when you're trying.
| taeric wrote:
| This hits so many of the greatest hits for how to speak to
| emotion and play on existing feelings more than following the
| data.
|
| Starts off with an appeal to so-called technical debt: a
| nebulous concept that plays more on debt being bad than it
| shows anything to actually do.
|
| It then moves to comparing to other engineering, with the
| implicit idea that they have it together in ways that we don't.
|
| Oh, and I skipped the part about statistical abuse. Because,
| what? Turns out special cases abound in data-driven efforts.
| Instead of looking for ways out, we are looking to blame those
| that tried? That... doesn't seem productive.
|
| I also don't buy some of the argument. Focusing on voter purging
| as if that is a data science problem seems willfully ignorant.
| That is a blatant power grab that is just hiding behind data
| jargon.
| FridayoLeary wrote:
| This can explain why pollsters get things wrong so often.
| blululu wrote:
| This seems like a re-hashing of Michael Jordan's essay on the
| subject: https://medium.com/@mijordan3/artificial-intelligence-
| the-re...
| dang wrote:
| Discussed here:
|
| _Artificial Intelligence - The Revolution Hasn't Happened Yet
| (2018)_ - https://news.ycombinator.com/item?id=25530178 - Dec
| 2020 (120 comments)
|
| _The AI Revolution Hasn't Happened Yet_ -
| https://news.ycombinator.com/item?id=16873778 - April 2018 (161
| comments)
| thaumasiotes wrote:
| > Ethnic groups whose surnames were assigned in recent history
| for tax-collection purposes (Ashkenazi Jews, Han Chinese,
| Koreans, etc) have a relatively small pool of surnames and a
| slightly larger pool of first names.
|
| This is... not accurate. The reason the Chinese have a small pool
| of surnames is that their surnames are much _less_ recent than
| ours, not _more_ recent.
|
| And I don't think the Ashkenazi surnames are particularly more
| recent than the surnames of the Europeans they lived among.
| Rather, they have concentrated surnames mostly because they were
| occupationally concentrated.
| simonw wrote:
| https://www.familysearch.org/wiki/en/Jewish_Personal_Names has
| more information about compulsory adoption of surnames amongst
| European Jews for taxation purposes in the 18th century.
| taeric wrote:
| I think the assertion was more that that's when everyone was
| forced to take surnames?
| thaumasiotes wrote:
| From https://en.wikipedia.org/wiki/Surname#History :
|
| > By 1400, most English and some Scottish people used
| surnames, but many Scottish and Welsh people did not adopt
| surnames until the 17th century, or later.
|
| > During the modern era, many cultures around the world
| adopted family names, particularly for administrative
| reasons, especially during the age of European expansion
| and particularly since 1600. Notable examples include the
| Netherlands (1795-1811), Japan (1870s), Thailand (1920),
| and Turkey (1934).
|
| So that would put Ashkenazic surnames as healthily older
| than e.g. Dutch surnames.
| ProjectArcturis wrote:
| Also, the purpose of Republican voter purges is not,
| particularly, to find people who have double-registered. It is
| more useful to the GOP to have a ton of false positives. Having
| a huge headline number allows them to claim that voter fraud is
| rampant ("Over a million people double registered!!!"). It also
| allows them to challenge the registration of many voters that
| the GOP doesn't like. Whether they've actually succeeded or not
| in finding double registration, these challenges raise the bar
| of voting difficulty for the other side.
| ilamont wrote:
| Yeah, that was an odd claim about Han Chinese surnames. Many
| have been around for thousands of years
| (https://www.chinadaily.com.cn/ezine/2007-07/20/content_54412...)
| and almost all are single-character surnames based on a limited
| set of possible sounds (~400 in Mandarin, IIRC)
| thaumasiotes wrote:
| There are still double-character surnames, though not as many
| as there used to be.
|
| That's evolution for you. Some surnames are big winners, some
| go extinct.
|
| (It's also worth noting that Chinese surnames are highly
| concentrated in a sampling sense -- there are just a few
| surnames which cover large chunks of the population -- but if
| you made a list of names, as opposed to a list of people, the
| pool of names would look much larger.)
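|
| A tiny made-up illustration of that sampling-sense point (the
| counts are invented; only the shape matters):
|
|   from collections import Counter
|
|   # 3 "winner" surnames plus a long tail of 300 singletons
|   people = ["Wang"] * 500 + ["Li"] * 480 + ["Zhang"] * 450 + \
|            [f"Rare{i}" for i in range(300)]
|   counts = Counter(people)
|   top3 = sum(n for _, n in counts.most_common(3))
|   print(len(counts))         # 303 distinct names: the name list is big
|   print(top3 / len(people))  # ~0.83: three names cover most people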
| mcswell wrote:
| I don't know anything about Chinese surnames, but their
| paucity cannot be due to a limited set of possible sounds.
| First, I would interpret "sounds" as phonemes (including
| tones), and there are far fewer of those than 400. More
| likely what you mean is the number of combinations of
| phonemes into valid Chinese Mandarin monosyllables, of which
| I cannot imagine there being only 400. In any case, there are
| (from what little I've heard) lots of bisyllabic Chinese
| words. Can't they be represented by single (or double)
| characters? There are thousands of commonly used Chinese
| characters, and tens of thousands more uncommonly used
| characters.
| thaumasiotes wrote:
| > More likely what you mean is the number of combinations
| of phonemes into valid Chinese Mandarin monosyllables, of
| which I cannot imagine there being only 400.
|
| That's your problem, not ilamont's. The limited syllable
| inventory of Mandarin Chinese is very well known. No need
| to stretch your imagination over it.
|
| That said, surnames are not limited by the number of
| syllables for the obvious reason that the spelling is part
| of the surname.
| kccqzy wrote:
| You're onto something here by mentioning characters. Indeed
| there are a lot of distinct Chinese surnames written in the
| Chinese script that _become_ identical after romanization,
| especially the romanization in the West where different tones
| are also ignored.
|
| Wikipedia has a nice list of common Chinese surnames at
| https://en.wikipedia.org/wiki/List_of_common_Chinese_surname...
| and one can easily find examples: Xú (徐) and Xǔ (许) both
| become Xu after romanization.
| ErikAugust wrote:
| "Everyone wants to do the model work, not the data work"
| civilized wrote:
| Which is sad because data work can lead to real domain
| knowledge, while fitting a grab bag of generic models teaches
| you nothing by itself (wooo, this thing has 0.0003 higher AUC
| than that thing!)
|
| Fitting generic data science predictive models is such a rote
| task these days that there's a crowd of start-ups begging you
| to pay them to automate it for you.
| mcswell wrote:
| I don't know how it is in other fields; I'm a linguist, who
| made the transition to computational linguistics back when you
| had to be a linguist to be a computational linguist (the
| 1980s). Slow forward to statistical (and now neural) comp ling;
| I find it incredibly boring. But the data work still needs to
| be done, and there are still linguists. And even more than me,
| they find computational linguistics (of whatever type) less
| interesting than "real" linguistics. So they will do data work,
| and willingly.
| simonw wrote:
| URL should be changed to
| https://pluralistic.net/2021/08/19/failure-cascades/ - same
| content on the author's site, without having to navigate around
| the Medium paywall.
| dang wrote:
| Ok, changed from https://doctorow.medium.com/machine-learnings-
| crumbling-foun.... Thanks!
| alecco wrote:
| https://archive.is/SoCQN
| dmix wrote:
| You could use this article's underlying thesis to explain why a
| lot of tech companies fail as well.
|
| Google Health is a good example of failing to appreciate
| specialization and domain-expertise. Trying to draw value from
| broad generic data collection when IRL it requires vertical-
| focused domain-oriented collection and analysis to really draw
| value.
|
| Funnelling everything into a giant pool of data only had so much
| value - it reduced the product to just a proprietary API
| integration platform offered in exchange for valuable data.
|
| This AI analogy extends to healthcare in real life: The job of
| any generalist doctor is largely just triaging you to the
| specialists. You reach the limits of their care pretty quickly
| for anything serious.
|
| AI is much the same way: the generic multipurpose tools tend to
| lose value quickly past the surface-level stuff, before
| requiring heavy specialization. Google's search engine is full
| of custom vertical categorization, where simple PageRank wasn't
| enough.
|
| This is why startups can be very useful to society as they get
| forced to focus on smaller issues early on, out of pure
| practicality, or quickly die off if they try to bite off bigger
| problems than they can chew.
|
| Almost every major multi-faceted business started off with a few
| 'whales' on which they built their business.
|
| Most of the biggest startup flops have been the ones that took VC
| really early before doing the dirty hard work of truly finding
| and understanding the problems they are trying to solve.
| iamstupidsimple wrote:
| > Google Health is a good example of failing to appreciate
| specialization and domain-expertise. Trying to draw value from
| broad generic data collection when IRL it requires vertical-
| focused domain-oriented collection and analysis to really draw
| value.
|
| I'm not sure I agree with this statement. From what I've heard,
| Google Health employed a huge team of doctors and they were
| included through the entire feature development lifecycle,
| similar to how the product org functions in other software
| companies.
| dmix wrote:
| Hiring a broad set of domain experts != a domain/vertical
| focused business. 'Doctors' can cover a massive disparate
| field of study.
|
| My point is they did it backwards, they should have found
| real world healthcare problems to solve then built the common
| ground between them. Building a generic API platform or cloud
| database turned out to not be the problem anyone needed help
| solving. Most companies who _did_ integrate with Health
| did it for marketing, not because it was essential to any
| business value.
|
| How many companies have done "AI" merely for marketing too?
|
| Google search ranked websites better than anyone, they zeroed
| in on that one problem and removed all the cruft, while Yahoo
| and others were jamming as much crap into their 'portals' as
| possible. Google seems to have forgotten that lesson.
|
| Waymo fell for this too. They built an entirely new type of
| car and gambled on a whole new taxi service (among other
| promises) that would entirely disrupt transportation - as the
| starting point. Innovation rarely ever jumps ten steps ahead
| like that. They chose to solve a thousand problems at once
| while the rest of the world, with actual delivered products, is
| struggling to solve even assisted highway driving in high-end
| luxury cars... cars people were going to buy anyway.
| AlbertCory wrote:
| Algorithm:
|
| 1) Decide to take over Domain X.
|
| 2) Hire a bunch of people from Domain X. Don't hire anyone
| who doesn't agree that you _can_ take over X.
|
| 3) Make them report to the people whose idea it was in the
| first place. If they say "Hey, maybe this wasn't such a great
| idea" then push them out, as an example to the rest.
|
| 4) FAIL.
|
| Note that #1 is the key. The decision to do it precedes the
| hiring.
| aledalgrande wrote:
| Seems to me that the foundations are not crumbling, but there
| should be a way to formally determine how good a model is going
| to be in the wild before it is used, especially in certain
| industries. Which I think is where research is focused these
| days? White-box models, Bayesian methods, etc.?
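|
| One concrete version of that pre-deployment check is testing
| calibration on data that resembles deployment rather than
| training. A sketch (synthetic data, sklearn only):
|
|   import numpy as np
|   from sklearn.linear_model import LogisticRegression
|   from sklearn.calibration import calibration_curve
|
|   rng = np.random.default_rng(1)
|   # training distribution vs. a shifted "wild" distribution
|   X_train = rng.normal(0.0, 1.0, (1000, 3))
|   y_train = (X_train.sum(axis=1) > 0).astype(int)
|   X_wild = rng.normal(0.5, 1.5, (1000, 3))
|   y_wild = (X_wild.sum(axis=1) > 0).astype(int)
|
|   model = LogisticRegression().fit(X_train, y_train)
|   frac_pos, mean_pred = calibration_curve(
|       y_wild, model.predict_proba(X_wild)[:, 1], n_bins=10)
|   # large gaps mean the model's confidence can't be trusted there
|   print(np.abs(frac_pos - mean_pred).max())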
| killjoywashere wrote:
| I ask the following as someone who builds and tests models and
| also annotates data as a domain expert. Is labeling really
| undervalued by society? Or just by VCs?
|
| I mean, if society depends more on the labeler (e.g.
| radiologists) why should society reward people for trying to
| replace the radiologists, regardless of the data quality?
|
| From a societal perspective where human factors scientists tell
| us that we need people to actually be employed to achieve a sense
| of self-worth and happiness, shouldn't we punish labelers who
| might otherwise only enrich the capitalists and undermine the
| health of the nation's workforces, and thus the wellbeing of the
| nation as a whole? Did we learn nothing from the underemployed,
| disaffected, demoralized, suicidally depressed Trump electorate?
|
| The Trump presidency may be a hot mess from which the country may
| never recover, but are these not the lessons that we ostensibly
| learned, that were the topic of millions of gallons of ink
| between 2016 and 2018?
| clircle wrote:
| I've always thought it was very sad and unfortunate that core
| data classes like sampling design and experimental design have
| fallen out of academic style.
| exo-pla-net wrote:
| TLDR: Many ML models in production are terrible, because they
| were trained on terrible data. These bad models are being used in
| high stakes situations, such as COVID-19 detection. ML engineers
| need a professional ethos/regulation, analogous to how civil
| engineers seeking to build a bridge don't screw around.
|
| My take: Yep, if the model is used in a high stakes situation,
| this is absolutely the case. The model should be required to
| undergo rigorous testing / peer review before it's released into
| the wild. In a high stakes situation, we have to ensure that a
| model is good before people get their hands on it, because
| people can be reliably depended on to treat the model as an
| oracle.
|
| The metaphor of a "crumbling foundation" is a bad one, though.
| It's just unregulated; models aren't leaning on one another, and
| there isn't a risk of wholesale collapse.
| axelroze wrote:
| It's a structural issue caused by the way wealth creation works
| for the majority of people in tech: job hopping, trendy
| frameworks on the CV, "high-impact" projects done ASAP, etc.
|
| No one wants to do boring, slow-paced work with lots of
| planning, reflection and introspection. And why would they?
| These kinds of jobs are usually the worst paid. We, the
| practitioners, have every economic incentive to go the other
| route.
|
| The problem goes far wider in tech than just ML. And unless
| society collectively learns to appreciate patience and long-term
| thinking as virtues, it won't go away any time soon. What can be
| done is to discourage the use of ML systems wherever an
| explainable deterministic system can be used instead (even one
| developed in a rush) - for example, credit scoring. Rules are
| good while a black-box artificial neural network isn't, even if
| the NN is some % more accurate. If the rules turn out to be bad
| they can be amended, and in special cases customer support could
| also override them based on (hopefully unbiased) human
| judgement.
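|
| A toy sketch of why rules beat a black box here: every decision
| decomposes into named rules that a human can audit, amend, or
| override (the rules and numbers below are invented):
|
|   def credit_score(applicant):
|       score, reasons = 0, []
|       rules = [
|           ("stable_income", applicant["income"] >= 30_000, +40),
|           ("low_debt_ratio",
|            applicant["debt"] < 0.4 * applicant["income"], +30),
|           ("past_default", applicant["defaults"] > 0, -50),
|       ]
|       for name, fired, points in rules:
|           if fired:
|               score += points
|               reasons.append((name, points))
|       return score, reasons  # the reasons ARE the explanation
|
|   print(credit_score({"income": 45_000, "debt": 10_000,
|                       "defaults": 0}))
|   # -> (70, [('stable_income', 40), ('low_debt_ratio', 30)])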
|
| The problem mentioned in the article, COVID-19 detection based
| on radiology scans, is an example of a system which needs ANNs
| due to the nature of image processing (a very difficult problem
| for rule-based AI). While techniques such as SHAP could be
| helpful, a radiologist still needs to check, because ANNs very
| often learn useless noise and the prediction can be nonsensical.
| Here it would be best to use PCR tests, serology or any more
| traditional and "boring" tool, as it works. Luckily that is the
| case, and shit CNN models start and end their lives in some
| useless paper.
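|
| For the curious, that SHAP-style sanity check looks roughly like
| this on toy tabular data (assumes the `shap` package; the same
| logic applies to pixels in a radiology CNN). Only feature 0
| carries real signal, so big attributions anywhere else would
| mean the model learned noise:
|
|   import numpy as np
|   import shap
|   from sklearn.ensemble import RandomForestRegressor
|
|   rng = np.random.default_rng(0)
|   X = rng.normal(size=(300, 5))
|   y = (X[:, 0] > 0).astype(float)  # only feature 0 matters
|   model = RandomForestRegressor(random_state=0).fit(X, y)
|
|   sv = shap.TreeExplainer(model).shap_values(X)  # (n, features)
|   print(np.abs(sv).mean(axis=0))  # should be dominated by column 0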
| TuringNYC wrote:
| I saw a large organization which was the epitome of this --
| Executive Directors would propose ambitious ML projects,
| Directors would create plans and teams, Managers would execute
| on budgets, create more detailed plans, and then...someone
| actually needed to do the work.
|
| Because of the length of the effort, the annual compensation
| would already have been handed out and the EDs, Directors,
| Managers had already "extracted" their compensation for the
| project, but usually had none left for the workers who
| eventually needed to do the actual work.
|
| Not unexpectedly, a rough job was somehow jammed through with
| understaffed, underpaid, and unmotivated low-level workers to
| actually "deliver" on the "AI" projects -- so victory could be
| declared at the top level...and new projects could begin.
|
| This isn't an ML problem; I'm sure the whole cycle has been
| repeated with the technology of the day, generation after
| generation. It has more to do with governance and the
| organizational maturity to measure real impacts.
| bsanr2 wrote:
| Why would anyone care to fix things? The way they are is
| perfectly amenable to the blame- and conclusion-laundering many
| ML clients seek.
| [deleted]
| xyzzy21 wrote:
| Sadly PCR tests for COVID also test positive for flu and half a
| dozen other causes. That's why CDC/FDA are seeking proposals
| for a new test that actually works!
|
| https://www.cdc.gov/csels/dls/locs/2021/07-21-2021-lab-alert...
| maxerickson wrote:
| You've fallen for the internet. Please restart and try again.
|
| https://www.reuters.com/article/factcheck-covid19-pcr-test-i...
| nerdponx wrote:
| Sigh. The deniers and antivaxers will have made up their
| minds already, and this will just be perceived as part of
| the mass media coverup. It's hopeless.
| ramchip wrote:
| You've been repeating this, but it doesn't seem to be true...
|
| https://news.ycombinator.com/item?id=28262833
| data4lyfe wrote:
| The garbage-in-garbage-out cascading failure generally seems to
| crash pretty fast. Given that the U.S. is a capitalistic
| society, the companies / institutions that do this and don't
| achieve their goals through data science should become apparent
| and then fail accordingly.
|
| Am I missing something here?
| wffurr wrote:
| The trail of devastation left by this process, in financial and
| human terms, when medical systems go awry or vendors to state
| judicial systems wrongly convict innocent people.
| mcswell wrote:
| I agree about your latter example, but about your first
| example: isn't it the case that these faulty AI systems for
| medical diagnosis have been rejected? Doctors don't like them
| because they don't want to be replaced or one-upped, and
| because they just don't trust them (rightly so, as it turns
| out). So the systems, which were put out for use on a trial
| basis, don't get used.
| nixpulvis wrote:
| Why is prose in monospace!? This style needs to die.
| simonw wrote:
| My favourite example of bad data in for machine learning is the
| tragic tale of Scots Wikipedia:
| https://www.theguardian.com/uk-news/2020/aug/26/shock-an-aw-...
|
| It turned out an enthusiastic but misguided US teenager who
| didn't actually know the Scots language was responsible for most
| of the entries on it... and a bunch of natural language machine
| learning models had already been trained on it.
| qweqwweqwe-90i wrote:
| Scots pretty much is a dialect of English that is phonetically
| spelt out - it's not surprising that a US teenager could write
| it.
| m-i-l wrote:
| > _" Scots pretty much is a dialect of English that is
| phonetically spelt out - it's not surprising that a US
| teenager could write it."_
|
| No, there are a number of distinct linguistic features of
| Scots, and it has its own regional dialects, e.g. Doric,
| Orcadian, Shetland (which is also in part based on the
| extinct Norn language). See e.g.
| https://dsl.ac.uk/about-scots/history-of-scots/ (and sub-pages
| such as https://dsl.ac.uk/about-scots/history-of-scots/grammar/)
| for further information. Simply doing a dictionary-lookup
| word-replacement completely misses all of this nuance.
| amelius wrote:
| Reminds me of:
| https://en.wikipedia.org/wiki/English_as_She_Is_Spoke
| simonw wrote:
| I've seen this in a corporate setting: a machine learning model
| trained to automatically apply categories to new content based on
| user-selected categories for existing content... that failed to
| take into account that the category list itself was poorly
| chosen, so the user-selected categories didn't have a
| particularly strong relationship to the content they were
| classifying.
| notanzaiiswear wrote:
| It sounds like cherry picking bad examples to me. Likewise you
| could say "programming's foundations are crumbling" by citing all
| sorts of programming projects that use bad or faulty code.
|
| Meanwhile, speech recognition seems to work extremely well by now
| (I am a little bit older, so I remember when it didn't work so
| well).
|
| I am also not aware of any real world cases of AI being used to
| detect Corona, so that seems to be an example in favour of AI.
| People tried to use AI, but it didn't work out. So it isn't being
| used for that purpose.
| wffurr wrote:
| The infosec meltdown sure seems to indicate programming's
| foundations have crumbled. All of the unsafe C library code
| underlying nearly every modern system is unsafe at any speed.
| wnoise wrote:
| > programming's foundations are crumbling
|
| That's also correct, and has been for some time (it got worse
| with each tech boom). This may just be a special case of that.
| notanzaiiswear wrote:
| What do you mean? At least from the point of view of the end
| user, apps seem to become better over time.
| AlbertCory wrote:
| I was tempted to just downvote this, but I thought I'd
| reply instead:
|
| No, they do not. An _existing_ version of an app may get
| better over time, but unfortunately it then gets replaced
| with a different version, which starts from the position of
| extreme bugginess.
|
| In the case of Microsoft Office apps, for instance, one
| could easily argue that they are steadily getting worse as
| more and more features are added.
|
| Google Chrome is pretty clearly getting worse in terms of
| the amount of memory it uses. I could go on.
| notanzaiiswear wrote:
| So why not go back to some old version of it? I don't
| think "memory consumption" is necessarily a good
| indicator, because sometimes using more memory is a sign
| of good optimization.
|
| Also how is the memory consumption if you turn off all
| modern features?
| qayxc wrote:
| > Meanwhile, speech recognition seems to work extremely well by
| now (I am a little bit older, so I remember when it didn't work
| so well).
|
| *provided you speak English or Mandarin, the former preferably
| of a continental US variety
|
| It's astonishing how bad things get again once you mix in an
| accent, local dialect (e.g. Swiss German) or a less frequently
| spoken language (like Croatian).
| notanzaiiswear wrote:
| Nevertheless, the huge jump is from "does not work at all" to
| "it works". It seems likely that the technology that worked
| for English will also work for many other languages.
|
| As for Chinese, it is also pretty amazing that you can visit
| a Chinese website, click "translate" in your browser's menu
| bar, and get a reasonably readable translated version.
|
| I wonder if people just take too many things for granted.
|
| Or internet search - they say the quality of Google searches
| has been declining; nevertheless we had a pretty good run for
| the past 20 years or so of being able to find information on
| the internet. That is AI as well.
| alpaca128 wrote:
| > It seems likely that the technology that worked for
| English will also work for many other languages.
|
| It won't for the foreseeable future. Not for technical
| reasons; it's just that other languages are usually not
| handled correctly because most companies think they can
| just use the exact same approach as in English and they're
| done.
|
| Until they realise that non-English native speakers also
| use English words and abbreviations to some degree, both in
| IT-related contexts but also in everyday life. Now it
| doesn't just need to handle that one language but also
| English with an accent. If they're lucky it'll work
| reasonably well in most cases despite variations depending
| on the region.
|
| Right now even keyboard completion suggestions struggle
| with mixing languages and become completely useless in some
| cases. As English words may be mixed in at any location
| (and in wildly different frequencies depending on the user)
| the software now has to guess the language for every single
| word. The results are not great.
|
| > they say the quality of Google searches has been declining;
| nevertheless we had a pretty good run for the past 20 years
| or so
|
| As long as Google continues with blunders like showing
| wrong pictures of people in infoboxes they'll keep failing
| hard. Their amazing AI shows wrong pictures for serial
| killers, rape victims and more, which already led to
| consequences for those people. What makes it much worse is
| that when someone complains about such a case Google will
| just replace that picture with another wrong portrait - if
| they react at all. It would be helpful if those big tech
| companies would for once trust in human intelligence
| instead of throwing larger models at the problem.
| notanzaiiswear wrote:
| Maybe people using Google should start to apply some
| common sense and not believe everything at face value.
| Nevertheless, the examples you cite are extremes that
| affect only few people. So you would rather have no
| internet search engines at all, so that those problems
| could be avoided?
|
| Isn't that a bit like saying cars are crap because people
| die in accidents? Maybe there are just upsides and
| downsides to most new technologies, and if the upsides
| outweigh the downsides by far, people will go for it?
|
| As for human intelligence, I am not convinced humans
| would necessarily fare better at such tasks. I mean they
| fall for the "same name, same person" fallacy.
| robbedpeter wrote:
| Why would you expect dialects with vastly fewer training
| examples to be on par with the most widely spoken languages?
| It's a simple matter of available data, and the state of the
| art architectures operate on a paradigm that scales quality
| of the model to quantity of training data.
|
| If you want better speech recognition for Swiss German, then
| record and transcribe hundreds of thousands of hours, or
| whatever it takes to reach the level of parity you want in
| recognition.
|
| It's not "astonishing" at all. Models won't generalize well
| unless they have sufficient data, so to achieve multi accent
| functionality, we need lots more high quality data. Or we
| need better architectures, so identifying where models fail
| and engineering a better architecture could be a
| breakthrough. The shortcomings are not surprising or
| mysterious at all, it's simply a function of the nature of
| these algorithms.
| nerdponx wrote:
| > it's simply a function of the nature of these algorithms
|
| Addendum: don't overlook the incentives and biases of the
| people building said algorithms.
___________________________________________________________________