[HN Gopher] My 2024 AI Predictions
___________________________________________________________________
My 2024 AI Predictions
Author : nichochar
Score : 40 points
Date : 2024-01-08 18:15 UTC (4 hours ago)
(HTM) web link (axflow.dev)
(TXT) w3m dump (axflow.dev)
| MichaelMug wrote:
| I am unable to click any of the links in this article when
| reading with Safari on iOS.
| ParacelsusOfEgg wrote:
| Same with both Firefox and Chrome on Android.
| tamimio wrote:
| The whole page gets selected instead. Maybe there's some JS to
| prevent copying, but it's poorly configured.
| oh_sigh wrote:
| On desktop there is a script running which, at the very least,
| tries to mimic custom styling of selecting text...for
| apparently absolutely no reason at all. I bet that script is
| configured for mouse events but not touch events. Very silly.
| nichochar wrote:
| Fixed, flex and z-index issue on my part, sorry
| armchairhacker wrote:
| Some of these seem reasonable but I disagree with this:
|
| > A decent engineer will likely be able to write a slack-like
| application, definitely good enough to cancel the 500k/year
| contract, in a couple of months.
|
| A decent engineer can already crank out a working Slack prototype
| within a couple of months, and there are mature Slack
| alternatives today. There's a reason companies are paying
| $500k/year, and I doubt it's the code: maybe it's the enterprise
| support, the external integrations, or even just the name
| recognition.
|
| Companies getting leaner may be true (it seems like this has
| already been happening the past couple years regardless of AI,
| and companies used to be lean in the 2010s).
| kjuulh wrote:
| While yes, an engineer could build a Slack clone, it would
| probably be quite poor initially and lacking features. If you're
| not in the business of building chat applications, having to
| actually maintain such an application becomes a burden. You may
| save 500k for a few years, but a few years down the road, when
| said engineer leaves, you will end up paying the cost either to
| exit said app or to spend a lot more engineering effort on it.
|
| There definitely is room for such applications, but a chat
| application probably won't set the business apart.
| sosodev wrote:
| Totally agreed. If companies didn't want to pay for Slack they
| would have jumped to the alternatives already.
| resource0x wrote:
| Is the author famous for correctly predicting anything before?
| datameta wrote:
| Does this count as a form of appeal to authority?
| nichochar wrote:
| I predicted that if I made the front page of HN someone might
| be a little toxic about me!
| mjr00 wrote:
| > Unstructured document parsing
|
| If I had to invest in any one area of LLM usage, it would be
| this. There is _so_ much unstructured data in the world, and
| converting things like legal contracts or chatlogs into
| structured, queryable data is absurdly powerful. Nobody wants to
| talk about this usage for LLMs because they're too busy making
| TikToks about how GPT4 actually has a soul or whatever, but this
| will be the lasting legacy of LLMs after the hype around
| generative AI dies out.
|
| > A decent engineer will likely be able to write a slack-like
| application, definitely good enough to cancel the 500k/year
| contract, in a couple of months.
|
| And this is why generative AI is massively overhyped: the people
| hyping it don't understand the true value of the products they
| allegedly replace. Very similar to the crypto/blockchain hype
| where people who understood nothing about banking or logistics
| insisted that blockchain would solve all the problems there. If
| you think a corp is paying Slack $500k/year because it's hard to
| write a piece of software that can send messages between people
| in an organization, you're completely off base. (IRC exists, can
| do this and is free by the way.)
| akudha wrote:
| _There is so much unstructured data in the world_
|
| Can you give a couple of specific examples? There are already
| sites like pdf.ai that let users chat with documents, are you
| thinking of something different?
| mjr00 wrote:
| The point isn't to have users chat with documents, it's to
| automatically parse unstructured data into structured data
| and store that somewhere for later use. As a real example, I
| deal with piles of legal contracts from which I need to
| extract specific information so I can perform later analysis,
| to be able to answer questions like "what specific model of
| widget is under contract here" and "what's the average
| contract value" and "what's the average contract value of
| widget XYZ". All stuff that's very easy to answer in SQL[0],
| once you have the data -- but extracting that data from legal
| documents, many of which are not in English, previously
| required a small army of contractors. Now it's been replaced
| with a local LLM that parses those relevant contract details
| into JSON which gets stored into a database. The accuracy is
| suitable for my use case, though not 100%.
|
| [0] and very hard to answer with an LLM, as they are
| notoriously awful at doing any sort of math.
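A minimal sketch of the extract-then-query pattern described above, using an in-memory SQLite database; the table layout, field names, and sample values are illustrative assumptions, not details from the comment:

```python
import json
import sqlite3

# Hypothetical rows as an LLM might emit them, one JSON object per
# contract (field names are made up for illustration).
extracted = [
    '{"widget": "XYZ", "contract_value": 120000}',
    '{"widget": "XYZ", "contract_value": 80000}',
    '{"widget": "ABC", "contract_value": 50000}',
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (widget TEXT, contract_value REAL)")
for line in extracted:
    row = json.loads(line)
    conn.execute(
        "INSERT INTO contracts VALUES (?, ?)",
        (row["widget"], row["contract_value"]),
    )

# Questions like "average contract value of widget XYZ" become
# one-line SQL aggregates once the data is structured.
avg_xyz = conn.execute(
    "SELECT AVG(contract_value) FROM contracts WHERE widget = 'XYZ'"
).fetchone()[0]
print(avg_xyz)  # 100000.0
```

The point is that the hard part (extraction) happens once at ingest time; every later question is cheap SQL rather than another LLM call.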
| akudha wrote:
| Thank you, I understand your use case better.
|
| If possible, could you talk a bit about your process for
| training your model? It looks like it is specific to legal
| documents, how easy/hard would it be to do the same for
| other types of documents?
|
| Also, what level of accuracy is good enough, for your use
| case?
| mjr00 wrote:
| The beauty of it is there's no model training involved:
| it's quite literally a prompt that reads something like
| "given the following document, output JSON that contains
| the following information: a field name `"widget"` that
| contains widget name, etc..." then including the doc
| right into the prompt. This was written a while ago, so it just
| iterates in a Python script, dividing the source document up and
| aggregating the results to get around context-length limits.
| Extremely simple, but it works great.
|
| I don't have a really specific accuracy target but since
| my main interest is in the aggregated results it's not a
| huge deal if the aggregations are slightly off. (The
| manual human approach is not 100% accurate either, after
| all; it's extremely common for humans to make data entry
| mistakes.)
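The approach described above (a fixed prompt, naive chunking, no training) might be sketched like this; `call_llm` is a placeholder for whatever completion API is used, and the prompt wording, field names, and merge strategy are assumptions of mine:

```python
import json

# Illustrative prompt template; the real one would name every field
# to extract. {chunk} is filled with one slice of the document.
PROMPT = (
    "Given the following document excerpt, output JSON with a field "
    '"widget" containing the widget name and a field "contract_value" '
    "containing the total contract value. Document:\n\n{chunk}"
)

def chunk_document(text, max_chars=4000):
    """Naively split a long document to stay under context limits."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def extract(document, call_llm):
    """Run the extraction prompt per chunk and merge the pieces.

    `call_llm` stands in for a real completion call: it takes a
    prompt string and returns a JSON string.
    """
    merged = {}
    for chunk in chunk_document(document):
        reply = call_llm(PROMPT.format(chunk=chunk))
        # Later chunks only fill in fields earlier chunks missed.
        for key, value in json.loads(reply).items():
            merged.setdefault(key, value)
    return merged
```

With a stub like `lambda p: '{"widget": "XYZ"}'` standing in for the model, `extract` returns the merged dict; the only moving parts are the prompt and the chunk size.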
| bitshiftfaced wrote:
| An example would be if you had a large article of text
| recording the history of a given subject, and you wanted a
| table of years and the number of times a given event happened
| in each year. It's now possible to do in under a minute a task
| that used to take hours.
| jakderrida wrote:
| > If I had to invest in any one area of LLM usage, it would be
| this. There is so much unstructured data in the world, and
| converting things like legal contracts or chatlogs into
| structured, queryable data is absurdly powerful.
|
| I tested poorly OCR'd text from a late 1800s magazine and was
| pretty impressed with the results from both GPT-4 and even
| Bard. In addition to making inferences based on the letters, it
| was able to infer from context the historically accurate terms.
| However, one prompt asking it to correct words didn't work as
| well as two prompts with the first one listing candidates for
| bad OCR'd words.
|
| While that might not seem like what you're talking about, it
| has the benefit of adding to the overall corpus that the future
| models are trained on. Also, with the NYT lawsuit, I'd presume
| fixing OCRs of old magazines and articles would be a pretty
| good way to fill the gap left behind.
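The two-prompt flow described above (first list suspect words, then correct the text with that list in hand) could look roughly like this; the suspect-word heuristic, prompt wording, and `call_llm` placeholder are my assumptions, not details from the comment:

```python
import re

def suspect_words(ocr_text):
    """Crude heuristic: flag tokens where a letter is followed by a
    digit or stray symbol, a common OCR failure mode."""
    return [
        w for w in ocr_text.split()
        if re.search(r"[A-Za-z][^A-Za-z\s'-]", w)
    ]

def correct_ocr(ocr_text, call_llm):
    # Prompt 1: ask the model to list likely OCR-mangled words,
    # seeded with the heuristic pass above.
    hints = ", ".join(suspect_words(ocr_text))
    candidates = call_llm(
        "List the words in this text that look like OCR errors. "
        "Heuristic suspects: " + hints + "\n\nText:\n" + ocr_text
    )
    # Prompt 2: correct the text with the candidate list in hand,
    # which per the comment works better than one combined prompt.
    return call_llm(
        "Correct the OCR errors in this text. Likely bad words: "
        + candidates + "\n\nText:\n" + ocr_text
    )
```

Splitting detection from correction this way keeps each prompt focused, which matches the commenter's observation that two prompts beat one.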
| rgbrgb wrote:
| > I predict non-smartphone AI devices will fail. The AI device of
| the future is likely an iPhone or android phone with a dedicated
| GPU chip for AI.
|
| I go back and forth on this. While I see this being the case for
| data collection wearables like humane or tab, it makes sense to
| have a personal AI computer like bedrock [0], tinybox [1], or a
| mac studio for running background tasks on personal data. If
| you're running agents that do more than chat, you need something
| that's going to be able to handle doing inference for extended
| periods of time without worrying about heat or battery life. You
| likely also want something capable of doing fine-tune level
| training on your personal inputs. A lot of the more interesting
| use-cases are on data you probably don't want to expose to a
| cloud provider. That said, probably Apple is eventually going to
| crush here as well, but maybe there's room for a challenger to
| develop as this niche opens up.
|
| [0]: https://www.bedrock.computer/gal
| [1]: https://tinygrad.org
| datameta wrote:
| I think distributed TinyML (aka AIoT) with multiple Oura-like
| wearables on the body for (near-)total health monitoring is a
| likely contender.
| joshuahedlund wrote:
| Am I the only one who does not immediately see a quality
| difference between the two photos in the embedded tweet?
| bl0b wrote:
| > I personally regularly use the "voice" version of chatGPT to
| brainstorm with it while I walk my dog. We sped past the Turing
| test so fast that no one even beat an eyelash about it
|
| I don't think that the author having a pseudo-conversation with
| ChatGPT, using voice as the interface, means we've passed the
| Turing test.
|
| They don't seem to be actively interrogating ChatGPT to determine
| whether it's a human or not - something that I'd expect would
| still be quite easy to do. And, as I understand it, the Turing
| test could be administered over text.
| rogerclark wrote:
| The truth is that the Turing test turned out to be useless.
| Whether we have passed it or not has no bearing on my life or
| anyone else's. The way I talk to ChatGPT isn't the way I talk
| to a real person, despite it already being capable of
| communicating with human language, teaching me things, and
| helping with my work and daily life. No real person would
| tolerate a turn-by-turn exchange of 2 minute monologues, but
| that's (apparently) what I want from an AI.
|
| And millions of people are fooled into thinking GPT is a real
| person every day, with spam and robocalls and social media
| bots. Maybe it won't fool everyone all the time, but it can
| fool some people a lot of the time. And it's only going to get
| more sophisticated. The only ones concerned about the Turing
| test are 70 year old GOFAI professors -- everyone else is
| dealing with the practical realities of computers suddenly
| having language capabilities.
| xnx wrote:
| > A decent engineer will likely be able to write a slack-like
| application, definitely good enough to cancel the 500k/year
| contract, in a couple of months.
|
| People are rightfully calling out this bit. It still wouldn't
| make sense for a Slack customer to make their own version of
| Slack in-house, but it does lower the bar for a lot of Slack
| competitors to get to feature parity much faster.
| nerdponx wrote:
| Also, if this were true, we'd already see it happening with
| existing platforms like Rocket Chat and Zulip. And likewise we
| should see the velocity of open-source projects skyrocket.
| mrloba wrote:
| Has anyone had any success in code generation? I feel like
| chatgpt usually completely fails to write even a small function
| correctly unless it's a very trivial or well known problem. I
| usually have to go back and forth for a good long while
| explaining all the different bugs to it, and even then it often
| doesn't succeed (but often claims it's fixed the bugs). The types
| of things it gets wrong make it a bit hard to believe it could
| improve enough to really boost dev productivity this year.
| nichochar wrote:
| Hi, author here.
|
| This is a pretty hard problem. And I haven't found anyone
| that's too good at this, but here are some interesting players:
|
| - https://www.phind.com/ is a custom model fine-tuned on code,
| and pretty damn good
|
| - https://codestory.ai is a VSCode fork with an assistant built
| in. One of the things it does for you is write code, but imo
| that's not its biggest strength yet.
|
| - https://sweep.dev have a bot where you create a GitHub comment
| and it writes the PR to fix it. They have between 30% and 70%
| success rate. This is pretty bad, but they're one of the best
| today.
|
| - https://sourcegraph.com is pivoting and building a copilot
| application (named Cody). This is pretty good, since
| sourcegraph is great at understanding your code.
| adoxyz wrote:
| Have you tried Cody (https://cody.dev)? Cody has a deep
| understanding of your codebase and generally does much better
| at code gen than just one-shotting GPT4 without context.
|
| (disclaimer: I work at Sourcegraph)
___________________________________________________________________
(page generated 2024-01-08 23:02 UTC)