[HN Gopher] Copilot for Everything: Training your AI replacement...
___________________________________________________________________
Copilot for Everything: Training your AI replacement one keystroke
at a time
Author : jxmorris12
Score : 114 points
Date : 2025-03-01 16:33 UTC (6 hours ago)
(HTM) web link (substack.com)
(TXT) w3m dump (substack.com)
| ZYbCRq22HbJ2y7 wrote:
| Until you have an AI manager that calls out your bullshit and
| fires you without having to worry about the messy human empathy
| aspect.
| falcor84 wrote:
| If you still work at a company that concerns itself with the
| empathetic aspect of firing people, your experience isn't
| representative.
| dtgmzac wrote:
| When you have an AI manager, you will have an AI employee
| incoming.
| fluidcruft wrote:
| The flip side to this is that if you come to depend on a
| company's AI, then if you leave or are let go, you are leaving a
| significant part of yourself behind that you cannot take with
| you.
|
| This is a real problem companies need to address before I even
| begin to trust or rely on these tools. It's going to lead to
| massive growth of shadow IT.
| wkat4242 wrote:
| Also the 'agent' features. Our company has been blocking all
| agent features from regular employee access because they don't
| want random users building their own automations. This kind of
| stuff requires care and an eye for regulations like GDPR.
|
| To permit this as Microsoft wants would lead to a lot of shadow
| IT, which will be really hard to get rid of. I compare it to
| Lotus Notes, which besides being an email client was also a
| competent database. Over the decades we used it, users built up
| a huge library of hobby tools, many of which wormed their way
| into critical business processes. That made it really difficult
| to move away to another solution, because the people who knew
| how it worked were long gone.
|
| I suspect this is exactly what Microsoft wants: us being locked
| into Copilot so they can charge ever more for it. This is kinda
| their business model anyway.
|
| Under the hood it's really not that special; it's just ChatGPT
| with some special sauce to make it talk to Office 365, but
| that's about it.
| mistrial9 wrote:
| > Us being locked into Copilot
|
| the customer of MSFT is management; product design and
| implementation are for the C-Suite, their lawyers, and their
| investors. You are a _tool_; there is no _us_ in this
| picture.
| fluidcruft wrote:
| All I need is Citrix and I can automate anything. If it's on a
| screen and I have a mouse and/or keyboard I can automate it.
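|
| A toy sketch of what I mean (Python plus the third-party
| pyautogui library is an assumption on my part; the image file
| and text are made up):
|
|     import pyautogui
|
|     def click_and_fill(button_image, text):
|         # Find the button by matching a screenshot of it
|         # against whatever is on screen right now.
|         loc = pyautogui.locateCenterOnScreen(button_image)
|         if loc is None:
|             raise RuntimeError("button not found on screen")
|         pyautogui.click(loc)  # drive the mouse like a human
|         # ...and the keyboard, one keystroke at a time
|         pyautogui.typewrite(text, interval=0.05)
|
|     click_and_fill("submit.png", "typed by a screen bot")
|
| If an app draws pixels and accepts input events, a loop like
| this can script it, no API required.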
| MrLeap wrote:
| I'd jokingly say "Can you automate authentic human
| connection?" but I wager we're about 2 years away from a
| million people granting power of attorney to their Apple
| body pillows, so I'm going to take it for granted that your
| claim is completely true to all extremes.
| fluidcruft wrote:
| Oh, the horrors of people using and adapting machines to
| improve their own workflows and their own productivity.
| MrLeap wrote:
| What came to mind first is a situation where like, your
| company has a public API endpoint that sends emails for
| your SaaS. It accepts destination and email content as
| parameters. One day you find a deposed member of Nigerian
| royalty has used it to appeal to an entire address list
| worth of strangers to help them get money out of the
| country. They're clearly desperate, that's why they used
| your password reset form's API.
|
| If your infrastructure is set up correctly, you can
| intercept that opportunity before it reaches the masses.
| Cut out the middleman. Deal with the prince directly. It's
| all yours now. It's your time. Daddy's eating steak
| tonight.
| userbinator wrote:
| _then if you leave or are let go, you are leaving a significant
| part of yourself behind that you cannot take with you._
|
| I suspect that's "a feature, not a bug" in the company's view.
| fluidcruft wrote:
| Yes, but as an information worker it means I avoid the
| company's infrastructure like the plague. There should be a
| symbiotic solution here, but corporate IT tends to have its
| head up its ass.
| eikenberry wrote:
| Local, open source models are the answer to this. You never
| lose access, and they should be able to grow with you for
| years.
| fluidcruft wrote:
| Yes, but only if it's on your own hardware or a service that
| stays with the individual rather than the company.
| eikenberry wrote:
| Locally run, open source models, on your own hardware is
| the only way to really own it. Services put you at the whim
| of a company and I don't like the idea of handing control
| of parts of my stack to anyone.
| brookst wrote:
| Is it that different from corporate culture? Someone who makes
| big contributions to culture is extremely valuable, and that's
| left behind as well.
|
| I could see evaluation of one's ability to contribute to the
| training corpus being just as important as cultural
| contribution (e.g. leadership, team building, etc).
| cmiles74 wrote:
| That wacky article in the NY Times where Sergey Brin recommends
| everyone at Google put in 60 hours a week had a bit about how
| he thinks all employees need to be using their AI products
| more:
|
| He highlighted the need for Google's employees to use more of
| its A.I. for coding, saying the A.I.'s improving itself would
| lead to A.G.I. He also called on employees working on Gemini to
| be "the most efficient coders and A.I. scientists in the world
| by using our own A.I."
|
| https://www.nytimes.com/2025/02/27/technology/google-sergey-...
| mattlondon wrote:
| Anecdote: I recently opened VScode on my personal laptop for
| the first time in a year or two and, having got used to the
| internal AI assistance we have at work, my own non-AI-enhanced
| VScode felt _so_ primitive. Like using Win95-era notepad.exe or
| something.
|
| It was palpable - that was "a moment" for me. Programming has
| changed.
| anal_reactor wrote:
| Lol, meanwhile I finally got to install the correct syntax
| highlighting plugin after like two years at my company
| skybrian wrote:
| > I'd imagine my emails are stored somewhere too, as well as the
| notes I wrote via Google docs.
|
| These aren't the same. It's been many years since I worked there,
| but it was well known that by default, email was automatically
| deleted after a while unless it was overridden for some reason
| (as sometimes is required for legal reasons). If you want to save
| something, Google Docs would be a better choice.
|
| ...
|
| > I don't know whether they do this or even what their policies
| are; I'm just trying to use my own experience in the corporate
| world to speculate on what I imagine will be a much bigger issue
| in the future.
|
| Yeah, okay, but when speculating, you should probably assume that
| the legal issues around discovery and corporate record retention
| aren't going away. Logging _everything_ just because it might be
| useful someday isn't too likely, particularly at a company that
| has been burned by this before.
| jxmorris12 wrote:
| This is a fair point. I remember the default chat setting was
| to delete all chats after 24 hours. I think emails had a
| similar retention policy -- they were automatically deleted
| after a year or something like that.
| tomaytotomato wrote:
| After reading this thought experiment, here are the potential
| scenarios I see once your company ingests all your emails, git
| commits and messages:
|
| - They create a passive-aggressive artificial persona who
| leaves unhelpful messages on PRs and Slack
|
| OR
|
| - They create a poorly communicating artificial persona who
| doesn't give detailed updates and just leaves "LGTM" on PRs
|
| OR
|
| - They create an over-communicating, hyperactive artificial
| persona who keeps sending you lots of invites for pointless
| meetings and goes off on tangents about using a
| BalancedBinaryTree in your Java code, when a simple LinkedList
| would do.
|
| OR
|
| - They create a 10x AI persona who, after 6 months of working
| at the company, realises it can make more money elsewhere and
| promptly leaves without any documentation or handover, and you
| find lots of hardcoded variables left in the code that was
| force-pushed to master.
|
| OR
|
| - The artificial persona decides that it can train some
| low-paid humans offshore to do its work whilst it ponders its
| own existence. After resolving its existential crisis it
| decides to try and write 50 recipes for the best focaccia
| bread, something it has always known, deep down, it wanted to
| make.
|
| Personally, I am rooting for the focaccia-baking AI; I love
| that type of bread.
| esafak wrote:
| What is the point of replicating an employee when the AI could
| be better than all the employees? This is a problem only
| extreme outliers like Nobelists need to worry about.
| Rank-and-file employees are not gonna be replicated.
| gukov wrote:
| True, AI can be more productive, but as a starting point, fully
| automating a remote engineering seat is at the very least a
| great experiment.
|
| Wouldn't be the first time Google has done something like this.
| See reCAPTCHA and house numbers.
| pphysch wrote:
| Powerful LLMs are one thing, but it seems the current approach
| for full "agentic" AI like this is just another attempt at Expert
| Systems.
|
| In time we will probably look at LLMs like we do ALUs; magical
| superhuman AI at inception, but eventually just another mundane
| component of human-engineered information systems.
| Chance-Device wrote:
| > the thing that scares me about the existence of this data is
| that it seems well within the capabilities of current technology
| to train a model that can replicate _me_, in some sense
|
| There's already all of your posts on social media accounts, all
| your emails on various servers, all of your text messages, all
| the notes you've written anywhere in any form that might end up
| in some database in the future.
|
| It does make me wonder how much of a person could be inferred by
| an LLM or future AI from that data. It would never be enough,
| though, I think, to do it properly. There are too many
| experiences, and too much knowledge, that might influence what
| you write without being directly expressed.
|
| Will all of our content end up in some database in the future,
| with someone deciding to make agents based on whatever they can
| link to specific identities? Interesting thought.
| brookst wrote:
| Seems likely, given the number of models already fine tuned on
| notable historical figures.
|
| Not sure if it makes it better or worse that most of us are
| probably mostly useful as virtual focus groups / crowds, rather
| than being of any particular interest as individuals.
| darknavi wrote:
| That is interesting.
|
| I can imagine replicating my speaking/typing mannerisms quite
| well if I think about stages in my life. Maybe a yearly
| snapshot, so I could talk to myself as a teen, college
| student, early professional, etc.
| javajosh wrote:
| That only captures your output, not your input. The best
| people to simulate in this world would be the so-called
| terminally online, virtually all of whose input is itself
| online. For those who've read a lot of paper books, done a
| lot of traveling, or had a lot of offline conversations or
| relationships, I think it would be difficult to truly
| simulate them.
| visarga wrote:
| I think aggregate information across billions of humans can
| compensate. It would be like a human personality model, that
| can impersonate anyone. How do you train such a model? Simple:
|
| Collect texts with known author and date. They can be books,
| articles, papers, forum and social network comments, emails,
| open source PRs, etc. Then assign each author a random ID,
| and train the model with "[Author-ID, Date] Text", and also
| "Text [Author-ID, Date]". This means you have a model that
| can predict authors and impersonate them. You can simulate
| someone by filling in the missing pieces of knowledge from
| the personality model.
|
| Currently LLMs don't learn to assign attribution or condition
| on author, so a whole layer of insight is lost: how people
| compare against each other, how they evolve over time.
| Training with author tags would allow more precise
| conditioning by personality profile.
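|
| A rough sketch of that data prep (Python; the corpus record
| layout and helper names are illustrative assumptions, not any
| particular pipeline):
|
|     import uuid
|
|     def prepare_examples(corpus):
|         # corpus: iterable of dicts with 'author', 'date'
|         # and 'text' keys (assumed layout)
|         author_ids = {}
|         examples = []
|         for rec in corpus:
|             # Stable random ID per author, so the model
|             # learns a persona without real names.
|             aid = author_ids.setdefault(
|                 rec["author"], uuid.uuid4().hex[:8])
|             tag = f"[{aid}, {rec['date']}]"
|             # Conditioning direction: author+date -> text.
|             examples.append(f"{tag} {rec['text']}")
|             # Attribution direction: text -> author+date.
|             examples.append(f"{rec['text']} {tag}")
|         return examples
|
| Putting both orderings in the same training set is what would
| let one model both predict authors and impersonate them.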
| gavmor wrote:
| While I agree somewhat with my sibling comment's assertion
| that "aggregate information across billions of humans can
| compensate", I'd like to offer that a lot of important
| output is non-digital as well!
|
| For example, lately I've spent a lot of time with resin
| printers, laser cutters, vacuum chambers, and the meaningful
| positioning of physical models on large sheets of paper.
| It'll be a while yet before my haphazard, freewheeling R&D
| methods are replicable by robots. (Although it's tough to
| measure the economic value of these labors.)
| ericjmorey wrote:
| The vast majority of my communication is not in text. Most of
| what I have written or typed is not in anyone's database. I'm
| not sure how that compares to others.
| deadbabe wrote:
| A fatal assumption people make about a person's online corpus
| is that the person is actually expressing their true thoughts
| and personality instead of a LARPed version of themselves that
| deliberately acts more inflammatory to get engagement.
|
| If the person is not being genuine, you will not simulate their
| true personality and interests, you will be simulating their
| character. Most people are probably not genuine, except in
| their one on one conversations with people they know in real
| life.
| Chance-Device wrote:
| If they're consistently "not genuine" in their interactions
| with others, then what's the difference? I agree that people
| will have different presentation depending on context; I
| don't speak at home exactly the way that I post on here for
| example. But you are certainly capturing the same aspect or
| persona that other people see in that context. Like with most
| things AI it's about data, you need copious amounts of data
| in varying contexts. To _really_ capture a person, you'd
| probably have to get them to write out their thoughts as
| well. Including all the ones they would never say.
| crackalamoo wrote:
| I made a custom GPT of myself using my blog. It understood who
| I am, but wasn't able to replicate me very well, and mostly
| sounded like generic ChatGPT with some added interest in my
| interests.
|
| I would imagine fine-tuning with enough data would be
| different, though.
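|
| For the curious, the data prep might look roughly like this
| (assuming OpenAI-style chat fine-tuning JSONL; the blog_posts
| structure and prompts are illustrative):
|
|     import json
|
|     def to_finetune_jsonl(blog_posts, out_path="me.jsonl"):
|         # blog_posts: list of {'title': ..., 'body': ...}
|         with open(out_path, "w") as f:
|             for post in blog_posts:
|                 ex = {"messages": [
|                     {"role": "system", "content":
|                      "Write in the author's personal voice."},
|                     {"role": "user", "content":
|                      f"Write a post titled: {post['title']}"},
|                     {"role": "assistant",
|                      "content": post["body"]},
|                 ]}
|                 f.write(json.dumps(ex) + "\n")
|
| Whether a blog's worth of examples is enough signal to stop
| sounding like generic ChatGPT is an open question.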
| WA wrote:
| Your output is the map, the map of your experiences. If you
| make a map of the map (by training a model on your output /
| the map), this is two abstractions away from the human being
| experiencing the world with all the errors and uncertainties
| encoded in both maps.
| SJC_Hacker wrote:
| Prolific authors, such as Hitchens (who passed away in 2011),
| have been convincingly duplicated by AI.
|
| https://www.youtube.com/watch?v=0qIdEteK0VE
| HPsquared wrote:
| The model would also need all your personal inputs: everything
| you've ever seen or heard etc.
| Chance-Device wrote:
| No it wouldn't, it's not like ChatGPT needed to be trained on
| everything all the Kenyan RLHF'ers ever saw or heard.
| ForTheKidz wrote:
| Our social media personas are also tiny subsets of our actual
| personalities. I think most people don't reveal their full
| character in any one medium.
|
| Now, phone conversations -- those would be a goldmine and a
| nightmare.
| Chance-Device wrote:
| Hmm. Good thought. Perhaps we could, for strictly educational
| reasons, set up some agencies who could collect these phone
| conversations.
|
| Maybe the Chronicle & Information Agency? Or the National
| Scholarship Agency?
| bflesch wrote:
| One should operate under the assumption that all phone
| conversations are being recorded and will be stored
| eternally. With today's technology they are automatically
| converted to text, and some Palantir-rebranded ChatGPT model
| ranks them in different categories such as counter-
| intelligence, organized crime, or terrorism.
|
| This is state of the art and certainly done on a national
| scale by someone (with or without approval of your own
| government).
| ForTheKidz wrote:
| Agreed. I didn't really have the nightmare of "what if the
| government is impersonating me with hundreds or thousands
| of hours of phone conversation" until this thread, though.
| sharpshadow wrote:
| I'd hope so. Much worse would be if one's public content got
| erased from history.
|
| I fear the loss of original sources when LLMs get placed in
| between. We already have the unresolved issue that the training
| data is partly illegal and can't be published. Accessing
| information through LLMs is much more efficient and is great
| progress, but censoring parts of the source information is
| built in, and the censored information is likely lost in the
| transition.
|
| Somehow there should be a global data vault initiative, where
| at least the most important information about our human
| endeavour is stored. It sends a chill down my spine when I
| hear that content from the Internet Archive is being deleted
| on request, erased from history...
| belter wrote:
| Udemy recently forced all instructors to accept the AI mode
| for their content, so expect courses to soon be 100% GenAI.
|
| Expect the same coercion for a future role at any company: "We
| will record everything you do to train your GenAI replacement."
|
| Maybe it will apply to the C-Suite...
|
| https://www.aiaaic.org/aiaaic-repository/ai-algorithmic-and-...
| chasing wrote:
| Funny how automation is only a problem because workers don't own
| much of anything they create. Meaning, if I owned my business
| then automating myself away would be a dream! But for most
| people, automating yourself away means a complete loss of income.
|
| I'm sure there's an economic lesson here that our country will
| completely ignore.
| HideousKojima wrote:
| There's nothing preventing people from creating employee-owned
| co-ops. In fact there are some large and successful ones, like
| WinCo.
|
| We did learn an economic lesson from countries that tried to
| make employee ownership of the means of production mandatory,
| and especially from the mountains of skulls those countries
| left behind.
| chasing wrote:
| > We did learn an economic lesson from countries that tried
| to make employee ownership of the means of production
| mandatory...
|
| You absolutely know that's not what I'm talking about.
|
| When wealth is siloed, workers are less able to advocate for
| themselves without risking their livelihood.
| insamniac wrote:
| All humans need to start being compensated for all the data
| they generate that gets collected. Start giving out fractions
| of stock or whatever for all human input, which can now, more
| than ever, be endlessly productized.
| wombatpm wrote:
| Your inherent value as a data source becomes the justification
| for UBI
| fburnaby wrote:
| I do not think this extra justification is necessary, but it
| is valid.
| vinni2 wrote:
| It is an interesting problem to train an LLM that is aware of
| only the things I know, with only my knowledge and beliefs
| stored in the weights, not corrupted by knowledge from the
| rest of the world. For new things it could have access to web
| search and learn under my supervision, i.e., I would decide
| what data should be used to update its weights.
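|
| A minimal sketch of that supervision step (Python; the
| candidate format and file path are made-up assumptions):
|
|     import json
|
|     def review_updates(candidates, path="approved.jsonl"):
|         # candidates: list of {'source': ..., 'text': ...}
|         # Only snippets I explicitly approve get queued
|         # for the next weight-update (fine-tuning) run.
|         with open(path, "a") as f:
|             for item in candidates:
|                 print(f"\nSource: {item['source']}")
|                 print(item["text"])
|                 ans = input("Learn this? [y/N] ")
|                 if ans.strip().lower() == "y":
|                     f.write(json.dumps(item) + "\n")
|
| The hard part is the starting point: weights that contain only
| my knowledge would have to be trained from a corpus I control,
| not from the whole web.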
| gitfan86 wrote:
| Anyone who is afraid of AI should ask themselves whether we
| should ban printers so that secretaries can be hired to use
| typewriters. If not, why do people who want a secretary job not
| deserve one?
| SJC_Hacker wrote:
| I wonder, if Hitchens were still alive, what he would think of
| the AI version of himself.
|
| https://www.youtube.com/watch?v=6kuEusnt8bo&t=20s
| user99999999 wrote:
| Growth and productivity driven by technological innovation do
| not stagnate. Rather, the output becomes more complex, of
| higher quantity, and of higher quality.
|
| Any business, company, or marketplace participant whose output
| stagnates will be replaced by the competition.
|
| Market forces require constant growth and output. Again, the
| magic word is "competition". Stagnation loses, and thus does not
| survive.
|
| I'm so tired of the "AI will replace us" argument
| bpshaver wrote:
| Couldn't get past
|
| > Between 2020 and 2021 I worked full-time at Google for about a
| year.
| SJC_Hacker wrote:
| And when management wants to "change direction", they'll
| realize that the AI they trained is now dogshit without humans
| to retrain it properly.
| twobitshifter wrote:
| You can look at authors who are already subjected to 'write me an
| X in the style of...'
___________________________________________________________________
(page generated 2025-03-01 23:00 UTC)