[HN Gopher] Copilot for Everything: Training your AI replacement...
       ___________________________________________________________________
        
       Copilot for Everything: Training your AI replacement one keystroke
       at a time
        
       Author : jxmorris12
       Score  : 114 points
       Date   : 2025-03-01 16:33 UTC (6 hours ago)
        
 (HTM) web link (substack.com)
 (TXT) w3m dump (substack.com)
        
       | ZYbCRq22HbJ2y7 wrote:
       | Until you have an AI manager that calls out your bullshit and
       | fires you without having to worry about the messy human empathy
       | aspect.
        
         | falcor84 wrote:
         | If you still work at a company that concerns itself with the
         | empathetic aspect of firing people, your experience isn't
         | representative.
        
         | dtgmzac wrote:
          | when you have an AI manager, you'll have AI employees
          | incoming
        
       | fluidcruft wrote:
        | The flip side to this is that if you come to depend on a
        | company's AI, then if you leave or are let go, you are leaving
        | a significant part of yourself behind that you cannot take
        | with you.
       | 
       | This is a real problem companies need to address before I even
       | begin to trust or rely on these tools. It's going to lead to
       | massive growth of shadow IT.
        
         | wkat4242 wrote:
         | Also the 'agent' features. Our company has been blocking all
         | agent features from regular employee access because they don't
         | want random users building their own automations. This kind of
         | stuff requires care and an eye for regulations like GDPR.
         | 
          | To permit this as Microsoft wants would lead to a lot of
          | shadow IT, which will be really hard to get rid of. I
          | compare it to Lotus Notes, which besides being an email
          | client was also a competent database. Over the decades we
          | used it, users built up a huge library of hobby tools, many
          | of which wormed their way into critical business processes,
          | making it really difficult to move to another solution
          | because the people who knew how it worked were long gone.
         | 
         | I suspect this is exactly what Microsoft wants. Us being locked
         | into copilot so they can charge ever more for it. This is kinda
         | their business model anyway.
         | 
          | Under the hood it's really not that special; it's just
          | ChatGPT, with some special sauce to make it talk to Office
          | 365, but that's about it.
        
           | mistrial9 wrote:
           | > Us being locked into copilot
           | 
            | the customer of MSFT is management; product design and
            | implementation for the C-Suite, their lawyers and their
            | investors. You are a _tool_; there is no _us_ in this
            | picture.
        
           | fluidcruft wrote:
            | All I need is Citrix and I can automate anything. If it's
            | on a screen and I have a mouse and/or keyboard I can
            | automate it.
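            | 
            | For example, a minimal sketch with pyautogui (the
            | reference image and coordinates are made up):
            | 
            |     import pyautogui
            | 
            |     # locate a UI element by screenshot matching (needs a
            |     # reference image captured beforehand)
            |     btn = None
            |     try:
            |         btn = pyautogui.locateCenterOnScreen(
            |             "submit_button.png")
            |     except Exception:
            |         pass
            |     # click the match, or a fixed coordinate as fallback
            |     pyautogui.click(btn if btn else (200, 300))
            |     pyautogui.write("done", interval=0.05)  # type it out
            |     pyautogui.hotkey("ctrl", "s")  # save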
        
             | MrLeap wrote:
             | I'd jokingly say "Can you automate authentic human
             | connection?" but I wager we're about 2 years away from a
             | million people granting power of attorney to their apple
             | body pillows so I'm going to take it for granted that your
             | claim is completely true to all extremes.
        
           | fluidcruft wrote:
           | Oh the horrors of people using and adapting machines to
           | improve their own workflows and improve their own
           | productivity.
        
             | MrLeap wrote:
             | What came to mind first is a situation where like, your
             | company has a public API endpoint that sends emails for
             | your SaaS. It accepts destination and email content as
             | parameters. One day you find a deposed member of Nigerian
             | royalty has used it to appeal to an entire address list
             | worth of strangers to help them get money out of the
             | country. They're clearly desperate, that's why they used
             | your password reset form's API.
             | 
             | If your infrastructure is set up correctly, you can
             | intercept that opportunity before it reaches the masses.
             | Cut out the middleman. Deal with the prince directly. It's
             | all yours now. It's your time. Daddy's eating steak
             | tonight.
        
         | userbinator wrote:
         | _then if you leave or are let go, you are leaving a significant
         | part of yourself behind that you cannot take with you._
         | 
         | I suspect that's "a feature, not a bug" in the company's view.
        
           | fluidcruft wrote:
            | Yes, but as an information worker it means I avoid the
            | company's infrastructure like the plague. There should be
            | a symbiotic solution here, but corporate IT tend to have
            | their heads up their asses.
        
         | eikenberry wrote:
          | Local, open source models are the answer to this. You never
          | lose access, and they should be able to grow with you for
          | years.
        
           | fluidcruft wrote:
           | Yes, but only if on your own hardware or as a service that
           | stays with the individual rather than the company.
        
             | eikenberry wrote:
              | Locally run, open source models on your own hardware
              | are the only way to really own it. Services put you at
              | the whim of a company, and I don't like the idea of
              | handing control of parts of my stack to anyone.
        
         | brookst wrote:
         | Is it that different from corporate culture? Someone who makes
         | big contributions to culture is extremely valuable, and that's
         | left behind as well.
         | 
          | I could see evaluation of one's ability to contribute to
          | the training corpus being just as important as cultural
          | contribution (e.g. leadership, team building, etc.).
        
         | cmiles74 wrote:
          | That wacky article in the NY Times where Sergey Brin
          | recommends everyone at Google put in 60 hours a week had a
          | bit about how he thinks all employees need to be using their
          | AI products more:
         | 
         | He highlighted the need for Google's employees to use more of
         | its A.I. for coding, saying the A.I.'s improving itself would
         | lead to A.G.I. He also called on employees working on Gemini to
         | be "the most efficient coders and A.I. scientists in the world
         | by using our own A.I."
         | 
         | https://www.nytimes.com/2025/02/27/technology/google-sergey-...
        
         | mattlondon wrote:
         | Anecdote: I recently opened VScode on my personal laptop for
         | the first time in a year or two and, having got used to the
         | internal AI assistance we have at work, my own non-AI-enhanced
         | VScode felt _so_ primitive. Like using Win95-era notepad.exe or
         | something.
         | 
         | It was palpable - that was "a moment" for me. Programming has
         | changed.
        
           | anal_reactor wrote:
            | Lol, meanwhile I got to install the correct syntax
            | highlighting plugin after like two years at my company
        
       | skybrian wrote:
       | > I'd imagine my emails are stored somewhere too, as well as the
       | notes I wrote via Google docs.
       | 
       | These aren't the same. It's been many years since I worked there,
       | but it was well known that by default, email was automatically
       | deleted after a while unless it was overridden for some reason
       | (as sometimes is required for legal reasons). If you want to save
       | something, Google Docs would be a better choice.
       | 
       | ...
       | 
       | > I don't know whether they do this or even what their policies
       | are; I'm just trying to use my own experience in the corporate
       | world to speculate on what I imagine will be a much bigger issue
       | in the future.
       | 
       | Yeah, okay, but when speculating, you should probably assume that
       | the legal issues around discovery and corporate record retention
       | aren't going away. Logging _everything_ just because it might be
        | useful someday isn't too likely, particularly at a company
        | that has been burned by this before.
       | has been burned by this before.
        
         | jxmorris12 wrote:
          | This is a fair point. I remember the default chat setting
          | was to delete all chats after 24 hours. I think emails had a
          | similar retention policy -- they were automatically deleted
          | after a year or something like that.
        
       | tomaytotomato wrote:
       | After reading this thought experiment, I think the potential
       | scenarios are as follows after your company ingests all your
       | emails, git commits and messages.
       | 
       | - They create a passive aggressive artificial persona who leaves
       | unhelpful messages on PRs and Slack
       | 
       | OR
       | 
        | - They create a poorly communicating artificial persona who
        | doesn't give detailed communications and leaves "LGTM" on PRs
       | 
       | OR
       | 
        | - They create an over-communicating, hyper artificial persona
        | who keeps sending you lots of invites for pointless meetings
        | and goes off on tangents about using a BalancedBinaryTree in
        | your Java code when a simple LinkedList would do.
       | 
       | OR
       | 
        | - They create a 10x AI persona who, after 6 months of working
        | at the company, realises they can make more money elsewhere
        | and promptly leaves without giving any documentation or
        | handover, and you find lots of hardcoded variables left in
        | code that was force-pushed to master.
       | 
       | OR
       | 
        | - The artificial persona decides that it can train some
        | low-paid humans offshore to do its work whilst it ponders its
        | own existence. After resolving its existential crisis it
        | decides to try and write 50 recipes for the best focaccia
        | bread, something it has known deep down it wanted to make.
       | 
       | Personally I am rooting for the focaccia baking AI, I love that
       | type of bread.
        
       | esafak wrote:
       | What is the point in replicating an employee when the AI could be
       | better than all the employees? This is a problem only extreme
       | outliers like Nobelists need to worry about. Rank and file
       | employees are not gonna be replicated.
        
         | gukov wrote:
          | True, AI can be more productive, but as a starting point,
          | fully automating a remote engineering seat is at the very
          | least a great experiment.
          | 
          | Wouldn't be the first time Google has done something like
          | this. See reCAPTCHA and building numbers.
        
       | pphysch wrote:
       | Powerful LLMs are one thing, but it seems the current approach
       | for full "agentic" AI like this is just another attempt at Expert
       | Systems.
       | 
       | In time we will probably look at LLMs like we do ALUs; magical
       | superhuman AI at inception, but eventually just another mundane
       | component of human-engineered information systems.
        
       | Chance-Device wrote:
       | > the thing that scares me about the existence of this data is
       | that it seems well within the capabilities of current technology
        | to train a model that can replicate _me_, in some sense
       | 
       | There's already all of your posts on social media accounts, all
       | your emails on various servers, all of your text messages, all
       | the notes you've written anywhere in any form that might end up
       | in some database in the future.
       | 
        | It does make me wonder how much of a person could be inferred
        | by an LLM or future AI from that data. I think it would never
        | be enough to do it properly, though. There are too many
        | experiences and too much knowledge that might influence what
        | you write without being directly expressed.
       | 
        | Will all of our content end up in some database in the
        | future, with someone deciding to make agents based on what
        | they can link to specific identities? Interesting thought.
        
         | brookst wrote:
          | Seems likely, given the number of models already fine-tuned
          | on notable historical figures.
          | 
          | Not sure if it makes it better or worse that most of us are
          | probably mostly useful as virtual focus groups / crowds,
          | rather than being of any particular interest as individuals.
        
         | darknavi wrote:
         | That is interesting.
         | 
          | I can imagine replicating my speaking/typing mannerisms
          | quite well if I think about stages in my life. Maybe a
          | yearly snapshot, so I could talk to myself as a teen,
          | college student, early professional, etc.
        
         | javajosh wrote:
          | That only captures your output, not your input. The best
          | people to simulate in this world would be the so-called
          | terminally online, virtually all of whose input is itself
          | online. For those who've read a lot of paper books, done a
          | lot of traveling, or had a lot of offline conversations or
          | relationships, I think it would be difficult to truly
          | simulate them.
        
           | visarga wrote:
           | I think aggregate information across billions of humans can
           | compensate. It would be like a human personality model, that
            | can impersonate anyone. How do you train such a model?
            | Simple:
           | 
           | Collect texts with known author and date. They can be books,
           | articles, papers, forum and social network comments, emails,
           | open source PRs, etc. Then assign each author a random ID,
           | and train the model with "[Author-ID, Date] Text", and also
           | "Text [Author-ID, Date]". This means you have a model that
           | can predict authors and impersonate them. You can simulate
           | someone by filling in the missing pieces of knowledge from
           | the personality model.
           | 
            | Currently LLMs don't learn to assign attribution or to
            | condition on author, so a whole layer of insight is lost:
            | how people compare against each other, how they evolve
            | over time. Author conditioning would allow more precise
            | control by personality profile.
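            | 
            | A minimal sketch of that data prep (hypothetical helper
            | names, Python):
            | 
            |     import hashlib
            | 
            |     def author_id(name):
            |         # stable pseudonymous ID per author; hashing the
            |         # name stands in for a real anonymization scheme
            |         return "A" + hashlib.sha256(
            |             name.encode()).hexdigest()[:8]
            | 
            |     def make_examples(author, date, text):
            |         tag = "[%s, %s]" % (author_id(author), date)
            |         # both orders: impersonation (ID first) and
            |         # attribution (ID last)
            |         return [tag + " " + text, text + " " + tag]
            | 
            |     corpus = [("Jane Doe", "2014-06-01",
            |                "The map is not the territory.")]
            |     train_lines = [ex for a, d, t in corpus
            |                    for ex in make_examples(a, d, t)]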
        
           | gavmor wrote:
            | While I agree somewhat with my sibling comment's
            | assertion that "aggregate information across billions of
            | humans can compensate", I'd like to offer that a lot of
            | important output is non-digital as well!
           | 
           | For example, lately I've spent a lot of time with resin
           | printers, laser cutters, vacuum chambers, and the meaningful
           | positioning of physical models on large sheets of paper.
           | It'll be a while yet before my haphazard, freewheeling R&D
           | methods are replicable by robots. (Although it's tough to
           | measure the economic value of these labors.)
        
         | ericjmorey wrote:
         | The vast majority of my communication is not in text. Most of
         | what I have written or typed is not in anyone's database. I'm
         | not sure how that compares to others.
        
         | deadbabe wrote:
         | A fatal assumption people make about a person's online corpus
         | is that the person is actually expressing their true thoughts
         | and personality instead of a LARPed version of themselves that
         | deliberately acts more inflammatory to get engagement.
         | 
         | If the person is not being genuine, you will not simulate their
         | true personality and interests, you will be simulating their
         | character. Most people are probably not genuine, except in
         | their one on one conversations with people they know in real
         | life.
        
           | Chance-Device wrote:
           | If they're consistently "not genuine" in their interactions
           | with others, then what's the difference? I agree that people
           | will have different presentation depending on context; I
           | don't speak at home exactly the way that I post on here for
           | example. But you are certainly capturing the same aspect or
            | persona that other people see in that context. Like with
            | most things AI, it's about data: you need copious amounts
            | of it in varying contexts. To _really_ capture a person,
            | you'd probably have to get them to write out their
            | thoughts as well, including all the ones they would never
            | say.
        
         | crackalamoo wrote:
         | I made a custom GPT of myself using my blog. It understood who
         | I am, but wasn't able to replicate me very well, and mostly
         | sounded like generic ChatGPT with some added interest in my
         | interests.
         | 
          | I would imagine fine-tuning with enough data would be
          | different, though.
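          | 
          | A minimal sketch of that data prep (assuming an OpenAI-
          | style chat fine-tuning JSONL format; the post content is
          | made up):
          | 
          |     import json
          | 
          |     posts = [{"title": "On maps",
          |               "body": "The map is not the territory..."}]
          | 
          |     # one chat-formatted example per blog post, so the
          |     # model learns to answer in the author's voice
          |     with open("finetune.jsonl", "w") as f:
          |         for p in posts:
          |             f.write(json.dumps({"messages": [
          |                 {"role": "system",
          |                  "content": "Write in my voice."},
          |                 {"role": "user",
          |                  "content": "Write a post titled: "
          |                             + p["title"]},
          |                 {"role": "assistant",
          |                  "content": p["body"]},
          |             ]}) + "\n")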
        
           | WA wrote:
           | Your output is the map, the map of your experiences. If you
           | make a map of the map (by training a model on your output /
           | the map), this is two abstractions away from the human being
           | experiencing the world with all the errors and uncertainties
           | encoded in both maps.
        
           | SJC_Hacker wrote:
            | Prolific authors, such as Hitchens (who passed away in
            | 2011), have been convincingly duplicated by AI.
           | 
           | https://www.youtube.com/watch?v=0qIdEteK0VE
        
         | HPsquared wrote:
         | The model would also need all your personal inputs: everything
         | you've ever seen or heard etc.
        
           | Chance-Device wrote:
           | No it wouldn't, it's not like ChatGPT needed to be trained on
           | everything all the Kenyan RLHF'ers ever saw or heard.
        
         | ForTheKidz wrote:
         | Our social media personas are also tiny subsets of our actual
         | personalities. I think most people don't reveal their full
         | character in any one medium.
         | 
            | Now, phone conversations -- that would be a goldmine and
            | a nightmare.
        
           | Chance-Device wrote:
           | Hmm. Good thought. Perhaps we could, for strictly educational
           | reasons, set up some agencies who could collect these phone
           | conversations.
           | 
           | Maybe the Chronicle & Information Agency? Or the National
           | Scholarship Agency?
        
           | bflesch wrote:
            | One should operate under the assumption that all phone
            | conversations are being recorded and will be stored
            | eternally. With today's technology they are automatically
            | converted to text, and some Palantir-rebranded ChatGPT
            | model ranks them into categories such as counter-
            | intelligence, organized crime, or terrorism.
           | 
           | This is state of the art and certainly done on a national
           | scale by someone (with or without approval of your own
           | government).
        
             | ForTheKidz wrote:
             | Agreed. I didn't really have the nightmare of "what if the
             | government is impersonating me with hundreds or thousands
             | of hours of phone conversation" until this thread, though.
        
         | sharpshadow wrote:
          | I hope so. Much worse would be if one's public content got
          | erased from history.
          | 
          | I fear the loss of original sources when LLMs get placed in
          | between. We already have the unresolved issue that the
          | training data is partly illegal and can't be published.
          | Accessing information through LLMs is much more efficient
          | and is great progress, but censorship of parts of the
          | source information is built in, and the censored
          | information is likely lost in transition.
          | 
          | Somehow there should be a global data vault initiative,
          | where at least the most important information about our
          | human endeavour is stored. It sends a chill down my spine
          | when I hear that content from the Internet Archive is being
          | deleted on request, erased from history..
        
       | belter wrote:
        | Udemy has recently forced all instructors to accept the AI
        | mode for their content, so expect courses to soon be 100%
        | GenAI.
        | 
        | Expect the same coercion for a future role at any company:
        | "We will record all you do to train your GenAI replacement".
       | 
       | Maybe it will apply to the C-Suite...
       | 
       | https://www.aiaaic.org/aiaaic-repository/ai-algorithmic-and-...
        
       | chasing wrote:
       | Funny how automation is only a problem because workers don't own
       | much of anything they create. Meaning, if I owned my business
       | then automating myself away would be a dream! But for most
       | people, automating yourself away means a complete loss of income.
       | 
        | I'm sure there's an economic lesson here that our country
        | will completely ignore.
        
         | HideousKojima wrote:
         | There's nothing preventing people from creating employee-owned
         | co-ops. In fact there are some large and successful ones, like
         | WinCo.
         | 
         | We did learn an economic lesson from countries that tried to
         | make employee ownership of the means of production mandatory,
         | and more especially we learned lessons from the mountains of
         | skulls those countries left behind.
        
           | chasing wrote:
           | > We did learn an economic lesson from countries that tried
           | to make employee ownership of the means of production
           | mandatory...
           | 
           | You absolutely know that's not what I'm talking about.
           | 
            | When wealth is siloed, workers are less able to advocate
            | for themselves without risking their livelihood.
        
       | insamniac wrote:
       | All humans need to start being compensated for all the data they
        | generate that gets collected. Start giving fractions of stock
        | or whatever for all human input, which can now more than ever
        | be endlessly productized.
        
         | wombatpm wrote:
         | Your inherent value as a data source becomes the justification
         | for UBI
        
           | fburnaby wrote:
           | I do not think this extra justification is necessary, but it
           | is valid.
        
       | vinni2 wrote:
        | It is an interesting problem to train an LLM to make it aware
        | of only the things I know, with only my knowledge and beliefs
        | stored in the weights -- not corrupted by knowledge from the
        | rest of the world. For new things it can have access to web
        | search and learn under my supervision, i.e., I decide what
        | data should be used to update its weights.
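        | 
        | A minimal sketch of that supervision loop (hypothetical,
        | Python):
        | 
        |     # gate every candidate document behind human approval
        |     # before it is allowed into the training set
        |     def approve(snippet):
        |         print(snippet)
        |         return input("train on this? [y/N] ").lower() == "y"
        | 
        |     new_docs = ["Some freshly retrieved web snippet."]
        |     train_set = [d for d in new_docs if approve(d)]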
        
       | gitfan86 wrote:
        | Anyone who is afraid of AI should ask themselves whether we
        | should ban printers so that secretaries can be hired to use
        | typewriters. If not, why do people who want a secretary job
        | not deserve one?
        
       | SJC_Hacker wrote:
        | I wonder, if Hitchens were still alive, what he would think
        | of the AI version of himself.
       | 
       | https://www.youtube.com/watch?v=6kuEusnt8bo&t=20s
        
       | user99999999 wrote:
        | Growth and productivity due to technology innovation do not
        | stagnate. Rather, the output becomes more complex, of higher
        | quantity, and of higher quality.
       | 
        | Any business / company / marketplace participant that
        | stagnates on these outputs will be replaced by the
        | competition.
       | 
       | Market forces require constant growth and output. Again, the
       | magic word is "competition". Stagnation loses, and thus does not
       | survive.
       | 
       | I'm so tired of the "AI will replace us" argument
        
       | bpshaver wrote:
       | Couldn't get past
       | 
       | > Between 2020 and 2021 I worked full-time at Google for about a
       | year.
        
       | SJC_Hacker wrote:
        | And when management wants to "change direction", they'll
        | realize that the AI is now dogshit without humans to retrain
        | it properly.
        
       | twobitshifter wrote:
       | You can look at authors who are already subjected to 'write me an
       | X in the style of...'
        
       ___________________________________________________________________
       (page generated 2025-03-01 23:00 UTC)