[HN Gopher] My 2024 AI Predictions
       ___________________________________________________________________
        
       My 2024 AI Predictions
        
       Author : nichochar
       Score  : 40 points
       Date   : 2024-01-08 18:15 UTC (4 hours ago)
        
 (HTM) web link (axflow.dev)
 (TXT) w3m dump (axflow.dev)
        
       | MichaelMug wrote:
        | I am unable to click any of the links in this article when
        | reading with Safari on iOS.
        
         | ParacelsusOfEgg wrote:
         | Same with both Firefox and Chrome on Android.
        
         | tamimio wrote:
          | The whole page gets selected instead. Maybe some JS meant
          | to prevent copying, but poorly configured.
        
         | oh_sigh wrote:
         | On desktop there is a script running which, at the very least,
         | tries to mimic custom styling of selecting text...for
         | apparently absolutely no reason at all. I bet that script is
         | configured for mouse events but not touch events. Very silly.
        
         | nichochar wrote:
         | Fixed, flex and z-index issue on my part, sorry
        
       | armchairhacker wrote:
       | Some of these seem reasonable but I disagree with this:
       | 
       | > A decent engineer will likely be able to write a slack-like
       | application, definitely good enough to cancel the 500k/year
       | contract, in a couple of months.
       | 
       | A decent engineer can already crank out a working Slack prototype
       | within a couple of months, and there are mature Slack
       | alternatives today. There's a reason companies are paying
       | $500k/year, and I doubt it's the code: maybe it's the enterprise
       | support, the external integrations, or even just the name
       | recognition.
       | 
       | Companies getting leaner may be true (it seems like this has
       | already been happening the past couple years regardless of AI,
       | and companies used to be lean in the 2010s).
        
         | kjuulh wrote:
          | While yes, an engineer could build a Slack clone, it would
          | probably be quite poor and lacking features initially. If
          | you're not in the business of building chat applications,
          | having to actually maintain such an application becomes a
          | burden. You may save 500k for a few years, but down the
          | road, when said engineer leaves, you will end up paying the
          | cost either to exit said app or to spend a lot more
          | engineering effort on it.
         | 
         | There definitely is room for such applications, but a chat
         | application probably won't set the business apart.
        
         | sosodev wrote:
         | Totally agreed. If companies didn't want to pay for Slack they
         | would have jumped to the alternatives already.
        
       | resource0x wrote:
       | Is the author famous for correctly predicting anything before?
        
         | datameta wrote:
         | Does this count as a form of appeal to authority?
        
         | nichochar wrote:
         | I predicted that if I made the front page of HN someone might
         | be a little toxic about me!
        
       | mjr00 wrote:
       | > Unstructured document parsing
       | 
       | If I had to invest in any one area of LLM usage, it would be
       | this. There is _so_ much unstructured data in the world, and
       | converting things like legal contracts or chatlogs into
       | structured, queryable data is absurdly powerful. Nobody wants to
        | talk about this usage for LLMs because they're too busy making
       | TikToks about how GPT4 actually has a soul or whatever, but this
       | will be the lasting legacy of LLMs after the hype around
       | generative AI dies out.
       | 
       | > A decent engineer will likely be able to write a slack-like
       | application, definitely good enough to cancel the 500k/year
       | contract, in a couple of months.
       | 
       | And this is why generative AI is massively overhyped: the people
       | hyping it don't understand the true value of the products they
       | allegedly replace. Very similar to the crypto/blockchain hype
       | where people who understood nothing about banking or logistics
       | insisted that blockchain would solve all the problems there. If
       | you think a corp is paying Slack $500k/year because it's hard to
       | write a piece of software that can send messages between people
        | in an organization, you're completely off base. (IRC exists,
        | can do this, and is free, by the way.)
        
         | akudha wrote:
         | _There is so much unstructured data in the world_
         | 
         | Can you give a couple of specific examples? There are already
         | sites like pdf.ai that let users chat with documents, are you
         | thinking of something different?
        
           | mjr00 wrote:
           | The point isn't to have users chat with documents, it's to
           | automatically parse unstructured data into structured data
           | and store that somewhere for later use. As a real example, I
           | deal with piles of legal contracts from which I need to
           | extract specific information so I can perform later analysis,
           | to be able to answer questions like "what specific model of
           | widget is under contract here" and "what's the average
           | contract value" and "what's the average contract value of
           | widget XYZ". All stuff that's very easy to answer in SQL[0],
           | once you have the data -- but extracting that data from legal
           | documents, many of which are not in English, previously
           | required a small army of contractors. Now it's been replaced
           | with a local LLM that parses those relevant contract details
           | into JSON which gets stored into a database. The accuracy is
           | suitable for my use case, though not 100%.
           | 
           | [0] and very hard to answer with an LLM, as they are
           | notoriously awful at doing any sort of math.
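The extract-then-query pattern described above can be sketched in a few lines; the table layout and field names (`widget`, `contract_value`) are illustrative assumptions, not details taken from the thread:

```python
# Sketch of the "parse once with an LLM, aggregate in SQL" pattern.
# The JSON blobs below stand in for LLM extraction output.
import json
import sqlite3

extracted = [
    '{"widget": "XYZ", "contract_value": 120000}',
    '{"widget": "XYZ", "contract_value": 80000}',
    '{"widget": "ABC", "contract_value": 50000}',
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (widget TEXT, contract_value REAL)")
for blob in extracted:
    row = json.loads(blob)
    conn.execute(
        "INSERT INTO contracts VALUES (?, ?)",
        (row["widget"], row["contract_value"]),
    )

# "What's the average contract value of widget XYZ?" -- trivial in SQL,
# unreliable if you ask the LLM to do the arithmetic itself.
(avg_xyz,) = conn.execute(
    "SELECT AVG(contract_value) FROM contracts WHERE widget = 'XYZ'"
).fetchone()
print(avg_xyz)  # 100000.0
```

The point of the split is that the LLM only ever does the fuzzy step (reading the document); every numeric question runs through the database.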
        
             | akudha wrote:
             | Thank you, I understand your use case better.
             | 
             | If possible, could you talk a bit about your process for
             | training your model? It looks like it is specific to legal
             | documents, how easy/hard would it be to do the same for
             | other types of documents?
             | 
             | Also, what level of accuracy is good enough, for your use
             | case?
        
               | mjr00 wrote:
               | The beauty of it is there's no model training involved:
               | it's quite literally a prompt that reads something like
               | "given the following document, output JSON that contains
               | the following information: a field name `"widget"` that
               | contains widget name, etc..." then including the doc
               | right into the prompt. This was written a while ago so it
               | just iterates in a python script dividing the source
               | document up and aggregating to get around context length
               | limits. Extremely simple, but works great.
               | 
               | I don't have a really specific accuracy target but since
               | my main interest is in the aggregated results it's not a
               | huge deal if the aggregations are slightly off. (The
               | manual human approach is not 100% accurate either, after
               | all; it's extremely common for humans to make data entry
               | mistakes.)
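A minimal sketch of the chunk-and-aggregate loop described above, with `call_llm` as a stand-in for whatever completion API is actually used; the prompt wording, chunk size, and merge strategy are all assumptions, not the commenter's actual script:

```python
# Split a long document to fit a context window, prompt per chunk for
# JSON, and merge the partial results into one record.
import json

PROMPT = (
    'Given the following document, output JSON with a field "widget" '
    "containing the widget name; output an empty JSON object if no "
    "widget is named:\n\n{doc}"
)

def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a local or hosted model.
    return "{}"

def chunk(text: str, max_chars: int):
    # Naive fixed-width split; real code might split on page boundaries.
    for i in range(0, len(text), max_chars):
        yield text[i:i + max_chars]

def extract(document: str, max_chars: int = 4000) -> dict:
    merged: dict = {}
    for part in chunk(document, max_chars):
        reply = call_llm(PROMPT.format(doc=part))
        try:
            merged.update(json.loads(reply))
        except json.JSONDecodeError:
            continue  # models occasionally emit non-JSON; skip that chunk
    return merged
```

Last-chunk-wins merging is the crudest possible aggregation; it works when each field appears in only one part of the document.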
        
           | bitshiftfaced wrote:
           | An example would be if you had a large article of text
           | recording the history of a given subject, and you wanted a
           | table of years and the number of times a given event happened
            | in each year. It's now possible to do in less than a
            | minute a task that used to take hours.
        
         | jakderrida wrote:
         | > If I had to invest in any one area of LLM usage, it would be
         | this. There is so much unstructured data in the world, and
         | converting things like legal contracts or chatlogs into
         | structured, queryable data is absurdly powerful.
         | 
         | I tested poorly OCR'd text from a late 1800s magazine and was
         | pretty impressed with the results from both GPT-4 and even
         | Bard. In addition to making inferences based on the letters, it
         | was able to infer from context the historically accurate terms.
         | However, one prompt asking it to correct words didn't work as
         | well as two prompts with the first one listing candidates for
         | bad OCR'd words.
         | 
         | While that might not seem like what you're talking about, it
         | has the benefit of adding to the overall corpus that the future
         | models are trained on. Also, with the NYT lawsuit, I'd presume
         | fixing OCRs of old magazines and articles would be a pretty
         | good way to fill the gap left behind.
        
       | rgbrgb wrote:
       | > I predict non-smartphone AI devices will fail. The AI device of
       | the future is likely an iPhone or android phone with a dedicated
       | GPU chip for AI.
       | 
       | I go back and forth on this. While I see this being the case for
       | data collection wearables like humane or tab, it makes sense to
       | have a personal AI computer like bedrock [0], tinybox [1], or a
       | mac studio for running background tasks on personal data. If
       | you're running agents that do more than chat, you need something
       | that's going to be able to handle doing inference for extended
       | periods of time without worrying about heat or battery life. You
       | likely also want something capable of doing fine-tune level
       | training on your personal inputs. A lot of the more interesting
       | use-cases are on data you probably don't want to expose to a
       | cloud provider. That said, probably Apple is eventually going to
       | crush here as well, but maybe there's room for a challenger to
       | develop as this niche opens up.
       | 
        | [0]: https://www.bedrock.computer/gal
        | 
        | [1]: https://tinygrad.org
        
         | datameta wrote:
          | I think distributed TinyML (aka AIoT) with multiple
          | Oura-like wearables on the body for (near-)total health
          | monitoring is a likely contender.
        
       | joshuahedlund wrote:
       | Am I the only one who does not immediately see a quality
       | difference between the two photos in the embedded tweet?
        
       | bl0b wrote:
       | > I personally regularly use the "voice" version of chatGPT to
       | brainstorm with it while I walk my dog. We sped past the Turing
       | test so fast that no one even beat an eyelash about it
       | 
       | I don't think that just because the author has a pseudo-
       | conversation with ChatGPT using voice as the interface means
       | we've passed the Turing test.
       | 
       | They don't seem to be actively interrogating ChatGPT to determine
       | whether it's a human or not - something that I'd expect would
       | still be quite easy to do. And, as I understand it, the Turing
       | test could be administered over text.
        
         | rogerclark wrote:
         | The truth is that the Turing test turned out to be useless.
         | Whether we have passed it or not has no bearing on my life or
         | anyone else's. The way I talk to ChatGPT isn't the way I talk
         | to a real person, despite it already being capable of
         | communicating with human language, teaching me things, and
         | helping with my work and daily life. No real person would
         | tolerate a turn-by-turn exchange of 2 minute monologues, but
         | that's (apparently) what I want from an AI.
         | 
         | And millions of people are fooled into thinking GPT is a real
         | person every day, with spam and robocalls and social media
         | bots. Maybe it won't fool everyone all the time, but it can
         | fool some people a lot of the time. And it's only going to get
         | more sophisticated. The only ones concerned about the Turing
         | test are 70 year old GOFAI professors -- everyone else is
         | dealing with the practical realities of computers suddenly
         | having language capabilities.
        
       | xnx wrote:
       | > A decent engineer will likely be able to write a slack-like
       | application, definitely good enough to cancel the 500k/year
       | contract, in a couple of months.
       | 
       | People are rightfully calling out this bit. It still wouldn't
       | make sense for a Slack customer to make their own version of
       | Slack in-house, but it does lower the bar for a lot of Slack
       | competitors to get to feature parity much faster.
        
         | nerdponx wrote:
         | Also if this were true, we'd see this happening with existing
         | platforms like Rocket Chat and Zulip. And likewise we should
         | see velocity of open-source projects skyrocket.
        
       | mrloba wrote:
       | Has anyone had any success in code generation? I feel like
       | chatgpt usually completely fails to write even a small function
       | correctly unless it's a very trivial or well known problem. I
       | usually have to go back and forth for a good long while
       | explaining all the different bugs to it, and even then it often
       | doesn't succeed (but often claims it's fixed the bugs). The types
       | of things it gets wrong makes it a bit hard to believe it could
       | improve enough to really boost dev productivity this year.
        
         | nichochar wrote:
         | Hi, author here.
         | 
         | This is a pretty hard problem. And I haven't found anyone
         | that's too good at this, but here are some interesting players:
         | 
          | - https://www.phind.com/ is a custom model fine-tuned on
          |   code, and pretty damn good
          | - https://codestory.ai is a VSCode fork with an assistant
          |   built in. One of the things it does for you is write
          |   code, but imo that's not its biggest strength yet.
          | - https://sweep.dev have a bot where you create a GitHub
          |   issue and it writes the PR to fix it. They have a success
          |   rate between 30% and 70%. This is pretty bad, but they're
          |   one of the best today.
          | - https://sourcegraph.com is pivoting and building a
          |   copilot application (named Cody). This is pretty good,
          |   since Sourcegraph is great at understanding your code.
        
         | adoxyz wrote:
         | Have you tried Cody (https://cody.dev)? Cody has a deep
         | understanding of your codebase and generally does much better
         | at code gen than just one-shotting GPT4 without context.
         | 
         | (disclaimer: I work at Sourcegraph)
        
       ___________________________________________________________________
       (page generated 2024-01-08 23:02 UTC)