[HN Gopher] Claude Skills
___________________________________________________________________
Claude Skills
https://www.anthropic.com/engineering/equipping-agents-for-t...
Author : meetpateltech
Score : 397 points
Date : 2025-10-16 16:05 UTC (6 hours ago)
(HTM) web link (www.anthropic.com)
(TXT) w3m dump (www.anthropic.com)
| j45 wrote:
| I wonder if Claude Skills will help return Claude back to the
| level of performance it had a few months ago.
| bicx wrote:
| Interesting. For Claude Code, this seems to have generous overlap
| with existing practice of having markdown "guides" listed for
| access in the CLAUDE.md. Maybe skills can simply make managing
| such guides more organized and declarative.
| kfarr wrote:
| Yeah my first thought was, oh it sounds like a bunch of
| CLAUDE.md's under the surface :P
| crancher wrote:
| It's interesting (to me) visualizing all of these techniques as
| efforts to replicate A* pathfinding through the model's vector
| space "maze" to find the desired outcome. The potential to "one
| shot" any request is plausible with the right context.
| candiddevmike wrote:
| > The potential to "one shot" any request is plausible with
| the right context.
|
| You too can win a jackpot by spinning the wheel just like
| these other anecdotal winners. Pay no attention to your
| dwindling credits every time you do though.
| NitpickLawyer wrote:
| On the other hand, our industry has always chased the "one
| baby in one month out of 9 mothers" paradigm. While you
| couldn't do that with humans, it's likely you'll soon (tm)
| be able to do it with agents.
| j45 wrote:
| If so, it would be a better way than encapsulating
| functionality in markdown.
|
| I have been using claude code to create some and organize them
  | but they can have diminishing returns.
| guluarte wrote:
| it also may point out that the solution for context rot may not
| be coming in the foreseeable future
| phildougherty wrote:
| getting hard to keep up with skills, plugins, marketplaces,
| connectors, add-ons, yada yada
| prng2021 wrote:
| Yep. Now I need an AI to help me use AI
| consumer451 wrote:
| I mean, that is a very common thing that I do.
| wartywhoa23 wrote:
| That's why the key word for all the AI horror stories that
| have been emerging lately is "recursion".
| consumer451 wrote:
| Does that imply no human in the loop? If so, that's not
| what I meant, or do. Whoever is doing that at this point:
| bless your heart :)
| mikkupikku wrote:
| "Recursion" is a word that shows up a lot in the rants of
| people in AI psychosis (believe they turned the chatbot
        | into god, or believe the chatbot revealed itself to be
        | god.)
| andoando wrote:
| Train AI to setup/train AI on doing tasks. Bam
| josefresco wrote:
    | Joking aside, I ask Claude how to use Claude... all the
    | time! Sometimes I ask ChatGPT about Claude. It actually
    | doesn't work well because they don't imbue these AI tools
    | with any special knowledge about how they work; they seem to
    | rely on public documentation, which usually lags behind the
    | breakneck pace of these feature releases.
| gordonhart wrote:
| Agree -- it's a big downside as a user to have more and more of
| these provider-specific features. More to learn, more to
| configure, more to get locked into.
|
| Of course this is why the model providers keep shipping new
| ones; without them their product is a commodity.
| hansonkd wrote:
    | That's the start of the singularity. The changes will keep
    | accelerating and fewer and fewer people will be able to keep
    | up until only the AIs themselves know how to use them.
| matthewaveryusa wrote:
| Nah, we'll create AI to manage the AI....oh
| skybrian wrote:
| People thought the same in the '90's. The argument that
| technology accelerates and "software eats the world" doesn't
| depend on AI.
|
| It's not exactly wrong, but it leaves out a lot of
| intermediate steps.
| xpe wrote:
        | Yes, and as we rely on AI to help us choose our tools... the
        | phenomenon feels very different, don't you think? Human
        | thinking, writing, talking, etc. is becoming less important
        | in this feedback loop, it seems to me.
| xpe wrote:
| abstractions all the way down: abstraction
| abstraction abstraction abstraction
| ...
| AaronAPU wrote:
| I don't think these are things to keep up with. Those would
| be actual fundamental advances in the transformer
| architecture and core elements around it.
|
| This stuff is like front end devs building fad add-ons which
| call into those core elements and falsely market themselves
| as fundamental advancements.
| marcusestes wrote:
| Agreed, but I think it's actually simple.
|
  | Plugins include:
  |
  | * Commands
  | * MCPs
  | * Subagents
  | * Now, Skills
  |
  | Marketplaces aggregate plugins.
| input_sh wrote:
| It's so simple you didn't even name all of them properly.
| xpe wrote:
| If I were to say "Claude Skills can be seen as a particular
| productization of a system prompt" would I be wrong?
|
| From a technical perspective, it seems like unnecessary
| complexity in a way. Of course I recognize there are lot of
| product decisions that seem to layer on 'unnecessary'
| abstractions but still have utility.
|
| In terms of connecting with customers, it seems sensible, under
| the assumption that Anthropic is triaging customer feedback
  | well _and_ leading to where they want to go (even if they
  | don't know it yet).
|
  | _Update_: a sibling comment just wrote something quite
| similar: "All these things are designed to create lock in for
| companies. They don't really fundamentally add to the
| functionality of LLMs." I think I agree.
| tempusalaria wrote:
| All these things are designed to create lock in for companies.
| They don't really fundamentally add to the functionality of
  | LLMs. Devs should focus on working directly with the model
  | generation APIs and not on using all the decoration.
| tqwhite wrote:
| Me? I love some lock in. Give me the coolest stuff and I'll
| be your customer forever. I do not care about trying to be my
| own AI company. I'd feel the same about OpenAI if they got me
| first... but they didn't. I am team Anthropic.
| dominicq wrote:
| Features will be added until morale improves
| hansmayer wrote:
  | Well, have some understanding: the good folks need to produce
  | _something_, since their main product is not delivering the
  | much yearned-for era of joblessness yet. It's not for you, it's
  | signalling to their investors - see, we're not burning your cash
| paying a bunch of PhDs to tweak the model weights without
| visible results. We are actually building products. With a huge
| and willing A/B testing base.
| hiq wrote:
| IMHO, don't, don't keep up. Just like "best practices in prompt
  | engineering", these are just temporary workarounds for current
| limitations, and they're bound to disappear quickly. Unless you
| really need the extra performance right now, just wait until
| models get you this performance out of the box instead of
| investing into learning something that'll be obsolete in
| months.
| spprashant wrote:
| I agree with this take. Models and the tooling around them
    | are both in flux. I'd rather not spend time learning
    | something in detail for these companies to then pull the plug
    | chasing the next big thing.
| lukev wrote:
| I agree with your conclusion not to sweat all these features
| too much, but only because they're not hard at all to
| understand on demand once you realize that they all boil down
| to a small handful of ways to manipulate model context.
|
    | But context engineering is very much not going anywhere as a
| discipline. Bigger and better models will _by no means_ make
| it obsolete. In fact, raw model capability is pretty clearly
| leveling off into the top of an S-curve, and most real-world
| performance gains over the last year have been precisely
| _because_ of innovations on how to better leverage context.
| vdfs wrote:
  | IMO, these are just marketing or new ways of using function
  | calling; under the hood they all get re-written as tools the
  | model can call
| adidoit wrote:
| All of it is ultimately managing the context for a model. Just
| different methods
| BoredPositron wrote:
  | It is a bit ironic that the better the models get, the more
  | user input they seem to need.
| quintu5 wrote:
| More like they can better react to user input within their
| context window. With older models, the value of that additional
| user input would have been much more limited.
| nozzlegear wrote:
| It superficially reminds me of the old "Alexa Skills" thing (I'm
| not even sure if Alexa still has "Skills"). It might just be the
| name making that connection for me.
| j45 wrote:
| Seems to be a bit more than that.
| phildougherty wrote:
| Alexa skills are 3rd party add-ons/plugins. Want to control
| your hue lights? add the phillips hue skill. I think claude
| skills in an alexa world would be like having to seed alexa
| with a bunch of context for it to remember how to turn my
| lights on and off or it will randomly attempt a bunch of
| incorrect ways of doing it until it gets lucky.
| candiddevmike wrote:
| And how many of those Alexa Skills are still being updated...
|
  | This is where waiting for this stuff to stabilize/standardize,
| and then writing a "skill" based on an actual RFC or standard
| protocol makes more sense, IMO. I've been burned too many times
| building vendor-locked chatbot extensions.
| nozzlegear wrote:
| > And how many of those Alexa Skills are still being
| updated...
|
| Not mine! I made a few when they first opened it up to devs,
| but I was trying to use Azure Logic Apps (something like
| that?) at the time which was supremely slow and finicky with
| F#, and an exercise in frustration.
| joilence wrote:
| If I understand correctly, it looks like a `skill` is an
| instructed usage / pattern of tools, so it saves the LLM agent's
| effort at trial & error in using tools? And it's basically just
| a prompt.
| sshine wrote:
| I love how the promise of free labor motivates everyone to become
| API first, document their practices, and plan ahead in writing
| before coding.
| ebiester wrote:
| It helps that you can have the "free" labor document the
| processes and build the plan.
| skybrian wrote:
| Cheaper, not free. Also, no training to learn a new skill.
|
| Building a new one that works well is a project, but then it
| will scale up as much as you like.
|
| This is bringing some of the advantages of software development
| to office tasks, but you give up some things like reliable,
| deterministic results.
| sshine wrote:
| There is an acquisition cost of researching and developing
| the LLM, but the running cost should not be classified as a
| wage, hence cost of labor is zero.
| maigret wrote:
| It's still opex for finance
| skybrian wrote:
| Don't call it "free labor" at all then? Regardless, running
| an LLM is usually not free.
| _pdp_ wrote:
| At first I wasn't sure what this is. Upon further inspection
| skills are effectively a bunch of markdown files and scripts that
| get unzipped at the right time and used as context. The scripts
| are executed to get deterministic output.
|
| The idea is interesting and something I shall consider for our
| platform as well.
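|
| For anyone curious, the shape seems to be a folder holding a
| SKILL.md plus optional helper files, roughly like this (an
| illustrative sketch; field names are my guess at the format,
| not the official schema):
|
|       pdf-form-filler/
|         SKILL.md
|         fill_form.py
|
|       # SKILL.md
|       ---
|       name: pdf-form-filler
|       description: Fill in PDF form fields. Use when the user
|         provides a PDF form to complete.
|       ---
|       1. Run `python fill_form.py input.pdf fields.json`
|       2. Check that the output renders before returning it.
|
| Only the name/description metadata sits in context up front;
| the body and the script get pulled in when Claude decides the
| skill applies.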
| nperez wrote:
| Seems like a more organized way to do the equivalent of a folder
| full of md files + instructing the LLM to ls that folder and read
| the ones it needs
| j45 wrote:
  | If so it would be most welcome, since LLMs don't always follow
  | the folder full of MD files with the same depth and
  | consistency.
| RamtinJ95 wrote:
| what makes it more likely that claude would read these .md
| files then?
| phildougherty wrote:
| trained to
| j45 wrote:
        | Skills are hopefully put through a deterministic process
        | that is guaranteed to occur, instead of a non-deterministic
        | one that can only be expected to happen most of the time
        | (the way it is now).
| meetpateltech wrote:
| Detailed engineering blog:
|
| "Equipping agents for the real world with Agent Skills"
| https://www.anthropic.com/engineering/equipping-agents-for-t...
| dang wrote:
| Thanks, we'll put that link in the toptext as well
| jampa wrote:
| I think this is great. A problem with huge codebases is that
| CLAUDE.md files become bloated with niche workflows like CI and
| E2E testing. Combined with MCPs, this pollutes the context window
| and eventually degrades performance.
|
| You get the best of both worlds if you can select tokens by
| problem rather than by folder.
|
| The key question is how effective this will be with tool calling.
| crancher wrote:
| Seems like the exact same thing, from front page a few days ago:
| https://github.com/obra/superpowers/tree/main
| Flux159 wrote:
| I wonder how this works with mcpb (renamed from dxt Desktop
| extensions): https://github.com/anthropics/mcpb
|
| Specifically, it looks like skills are a different structure than
| mcp, but overlap in what they provide? Skills seem to be just
| markdown files and scripts (instead of prompts & tool calls
| defined in MCP?).
|
| Question I have is why would I use one over the other?
| rahimnathwani wrote:
  | One difference I see is that with tool calls the LLM doesn't
  | see the actual code. It delegates the task to the tool. With
  | scripts in an agent, I _think_ the agent can see the code being
| run and can decide to run something different. I may be wrong
| about this. The documentation says that assets aren't read into
| context. It doesn't say the same about scripts, which is what
| makes me think the LLM can read them.
| irtemed88 wrote:
| Can someone explain the differences between this and Agents in
| Claude Code? Logically they seem similar. From my perspective it
| seems like Skills are more well-defined in their behavior and
| function?
| j45 wrote:
| Skills might be used by Agents.
|
| Skills can merge together like lego.
|
| Agents might be more separated.
| rahimnathwani wrote:
| Subagents have their own context. Skills do not.
| ryancnelson wrote:
| The uptake on Claude-skills seems to have a lot of momentum
| already! I was fascinated on Tuesday by "Superpowers",
| https://blog.fsck.com/2025/10/09/superpowers/ ... and then
| packaged up all the tool-building I've been working on for a
| while into somewhat tidy skills that I can delegate agents to:
|
| http://github.com/ryancnelson/deli-gator I'd love any feedback
| skinnymuch wrote:
| Delegation is super cool. I can sometimes end up having too
| much Linear issue context coming in. IE frequently I want a
| Linear issue description and last comment retrieved. Linear MCP
| grabs all comments which pollutes the context and fills it up
| too much.
| mousetree wrote:
| I'm perplexed why they would use such a silly example in their
| demo video (rotating an image of a dog upside down and cropping).
| Surely they can find more compelling examples of where these
| skills could be used?
| alansaber wrote:
| Dog photo >> informing the consumer
| Mouvelie wrote:
| You'd think so, eh ?
| https://en.wikipedia.org/wiki/The_purpose_of_a_system_is_wha...
| antiloper wrote:
| The developer page uses a better example, a PDF processing
| skill: https://github.com/anthropics/skills/tree/main/document-
| skil...
|
| I've been emulating this in claude code by manually @tagging
| markdown files containing guides for common tasks in our
| repository. Nice to see that this step is now automatic as
| well.
| mritchie712 wrote:
| this is the best example I found
|
| https://github.com/anthropics/skills/blob/main/document-skil...
|
| I was dealing with 2 issues this morning getting Claude to
| produce a .xlsx that are covered in the doc above
| bgwalter wrote:
| "Skills are repeatable and customizable instructions that Claude
| can follow in any chat."
|
| We used to call that a programming language. Here, they are
| presumably repeatable instructions how to generate stolen code or
| stolen procedures so users have to think even less or not at all.
| azraellzanella wrote:
| "Keep in mind, this feature gives Claude access to execute code.
| While powerful, it means being mindful about which skills you use
| --stick to trusted sources to keep your data safe."
|
| Yes, this can only end well.
| m3kw9 wrote:
| I feel like this is making things more complicated than it needs
| to be. LLMs should automatically do this behind the scenes; you
| won't even see it.
| Imnimo wrote:
| I feel like a danger with this sort of thing is that the
| capability of the system to use the right skill is limited by the
| little blurb you give about what the skill is for. Contrast with
| the way a human learns skills - as we gain experience with a
| skill, we get better at understanding when it's the right tool
| for the job. But Claude is always starting from ground zero and
| skimming your descriptions.
| j45 wrote:
  | LLMs are a probability-based calculation, so they will always
  | skim to some degree, always guess to some degree, and often
  | pick the best choice available to them even though it might not
  | be the right one.
  |
  | For folks to whom this seems elusive, it's worth learning how
  | the internals actually work; it helps a great deal with how to
  | structure things in general, and then over time, as the parent
  | comment said, with individual cases.
| zobzu wrote:
  | IMO this is a context window issue. Humans are pretty good at
  | memorizing super broad context without great accuracy.
| Sometimes our "recall" function doesn't even work right ("How
| do you say 'blah' in German again?"), so the more you
| specialize (say, 10k hours / mastery), the better you are at
| recalling a specific set of "skills", but perhaps not other
| skills.
|
  | On the other hand, LLMs have a programmatic context with
| consistent storage and the ability to have perfect recall, they
| just don't always generate the expected output in practice as
| the cost to go through ALL context is prohibitive in terms of
| power and time.
|
| Skills.. or really just context insertion is simply a way to
| prioritize their output generation manually. LLM "thinking
| mode" is the same, for what it's worth - it really is just
| reprioritizing context - so not "starting from scratch" per se.
|
| When you start thinking about it that way, it makes sense - and
| it helps using these tools more effectively too.
| dwaltrip wrote:
| There are ways to compensate for lack of "continual
| learning", but recognizing that underlying missing piece is
| important.
| ryancnelson wrote:
| I commented here already about deli-gator (
| https://github.com/ryancnelson/deli-gator ) , but your
| summary nailed what I didn't mention here before: Context.
|
| I'd been re-teaching Claude to craft Rest-api calls with curl
| every morning for months before i realized that skills would
| let me delegate that to cheaper models, re-using cached-
| token-queries, and save my context window for my actual
| problem-space CONTEXT.
| dingnuts wrote:
| >I'd been re-teaching Claude to craft Rest-api calls with
| curl every morning for months
|
| what the fuck, there is absolutely no way this was cheaper
| or more productive than just learning to use curl and
| writing curl calls yourself. Curl isn't even hard! And if
| you learn to use it, you get WAY better at working with
| HTTP!
|
| You're kneecapping yourself to expend more effort than it
| would take to just write the calls, helping to train a bot
| to do the job you should be doing
| jmtulloss wrote:
| My interpretation of the parent comment was that they
| were loading specific curl calls into context so that
| Claude could properly exercise the endpoints after making
| changes.
| F7F7F7 wrote:
| He's likely talking about Claude's hook system that
| Anthropic created to provide better control over context.
| ryancnelson wrote:
| _i_ know how to use curl. (I was a contributor before git
| existed) ... watching Claude iterate to re-learn whether
        | to try application/x-form-urlencoded or GET /?foo
| wastes SO MUCH time and fills your context with "how to
| curl" that you re-send over again until your context
| compacts.
|
| You are bad at reading comprehension. My comment meant I
| can tell Claude "update jira with that test outcome in a
| comment" and, Claude can eventually figure that out with
| just a Key and curl, but that's way too low level.
|
| What I linked to literally explains that, with code and a
| blog post.
| mbesto wrote:
| > IMO this is a context window issue.
|
| Not really. It's a consequential issue. No matter how big or
| small the context window is, LLMs simply do not have the
| concept of goals and consequences. Thus, it's difficult for
| them to acquire dynamic and evolving "skills" like humans do.
| seunosewa wrote:
| The blurbs can be improved if they aren't effective. You can
| also invoke skills directly.
|
| The description is equivalent to your short term memory.
|
| The skill is like your long term memory which is retrieved if
| needed.
|
| These should both be considered as part of the AI agent. Not
| external things.
| blackoil wrote:
| Most of the experience is general information not specific to
| project/discussion. LLM starts with all that knowledge. Next it
| needs a memory and lookup system for project specific
| information. Lookup in humans is amazingly fast, but even with
| a slow lookup, LLMs can refer to it in near real-time.
| andruby wrote:
| Would this requirement to start from ground zero in current
| LLMs be an artefact of the requirement to have a "multi-tenant"
| infrastructure?
|
| Of course OpenAI and Anthropic want to be able to reuse the
| same servers/memory for multiple users, otherwise it would be
| too expensive.
|
| Could we have "personal" single-tenant setups? Where the LLM
| incorporates every previous conversation?
| mbesto wrote:
| > Contrast with the way a human learns skills - as we gain
| experience with a skill, we get better at understanding when
| it's the right tool for the job.
|
| Which is precisely why Richard Sutton doesn't think LLMs will
| evolve to AGI[0]. LLMs are based on mimicry, not experience, so
| it's more likely (according to Sutton) that AGI will be based
| on some form of RL (reinforcement learning) and not neural
| networks (LLMs).
|
| More specifically, LLMs don't have goals and consequences of
| actions, which is the foundation for intelligence. So, to your
| point, the idea of a "skill" is more akin to a reference
| manual, than it is a skill building exercise that can be
| applied to developing an instrument, task, solution, etc.
|
| [0] https://www.youtube.com/watch?v=21EYKqUsPfg
| buildbot wrote:
| The industry has been doing RL on many kinds of neural
| networks, including LLMs, for quite some time. Is this person
| saying we RL on some kind of non neural network design? Why
    | is that more likely to bring AGI than an LLM?
|
| > More specifically, LLMs don't have goals and consequences
| of actions, which is the foundation for intelligence.
|
| Citation?
| jfarina wrote:
| Why are you asking them to cite something for that
| statement? Are you questioning whether it's the foundation
| for intelligence or whether LLMS understand goals and
| consequences?
| buildbot wrote:
| Yes, I'm questioning if that's the foundation of
| intelligence. Says who?
| mbesto wrote:
| Richard Sutton. He won a Turing Award. Why ask your
| question above when you can just watch the YouTube link I
| posted?
| anomaloustho wrote:
| Looks like they added the link. But I think it's doing RL
| in realtime vs pre-trained as an LLM is.
|
| And I associate that part to AGI being able to do cutting
| edge research and explore new ideas like humans can. Where,
| when that seems to "happen" with LLMs it's been more
| debatable. (e.g. there was an existing paper that the LLM
| was able to tap into)
|
| I guess another example would be to get an AGI doing RL in
| realtime to get really good at a video game with completely
| different mechanics in the same way a human could. Today,
| that wouldn't really happen unless it was able to pre-train
| on something similar.
| ibejoeb wrote:
| I don't think any of the commercial models are doing RL
| at the consumer. The R is just accepting or rejecting the
| action, right?
| hbarka wrote:
    | For humans, it's not uncommon to have a clever realization by
    | way of serendipity. How do you skill AI to have serendipity?
| mediaman wrote:
| It's a false dichotomy. LLMs are already being trained with
| RL to have goal directedness.
|
| He is right that non-RL'd LLMs are just mimicry, but the
| field already moved beyond that.
| leptons wrote:
| I can't wait to try to convince an LLM/RL/whatever-it-is
| that what it "thinks" is right is actually wrong.
| dingnuts wrote:
| Explain something to me that I've long wondered: how does
| Reinforcement Learning work if you cannot measure your
| distance from the goal? In other words, how can RL be used
| for literally anything qualitative?
| kmacdough wrote:
        | This is one of the known hardest parts of RL. The short
| answer is human feedback.
|
| But this is easier said than done. Current models require
| vastly more learning events than humans, making direct
| supervision infeasable. One strategy is to train models
| on human supervisors, so they can bear the bulk of the
| supervision. This is tricky, but has proven more
| effective than direct supervision.
|
| But, in my experience, AIs don't specifically struggle
| with the "qualitative" side of things per-se. In fact,
| they're great at things like word choice, color theory,
| etc. Rather, they struggle to understand continuity,
| consequence and to combine disparate sources of input.
| They also suck at differentiating fact from fabrication.
        | To speculate wildly, it feels like it's missing the
| RL of living in the "real world". In order to eat, sleep
        | and breathe, you must operate within the bounds of physics
| and society and live forever with the consequences of an
| ever-growing history of choices.
| mbesto wrote:
| This 100%.
|
          | While we might agree that language is foundational to
          | what it is to be human, it's myopic to think it's the only
          | thing. LLMs are based on training sets of language
| (period).
| anomaloustho wrote:
| I wrote elsewhere but I'm more interpreting this
| distinction as "RL in real-time" vs "RL beforehand".
| munchler wrote:
| I agree with this description, but I'm not sure we really
| want our AI agents evolving in real time as they gain
| experience. Having a static model that is thoroughly
| tested before deployment seems much safer.
| mbesto wrote:
| > Having a static model that is thoroughly tested before
| deployment seems much safer.
|
          | While that might be true, it fundamentally means it's not
          | ever going to replicate humans or provide
          | superintelligence.
| baxtr wrote:
| So it's on-the-fly adaptive mimicry?
| OtherShrezzing wrote:
| In the interview transcript, he seems aware that the field
| is doing RL, and he makes a compelling argument that
| bootstrapping isn't as scalable as a purely RL trained AI
| would be.
| mbesto wrote:
| > LLMs are already being trained with RL to have goal
| directedness.
|
| That might be true, but we're talking about the
| fundamentals of the concept. His argument is that you're
| never going to reach AGI/super intelligence on an evolution
| of the current concepts (mimicry) even through fine tuning
      | and adaptations - it'll likely be different (and likely based
      | on some RL technique). At least we have NO history to
      | suggest this will be the case (hence his argument for "the
| bitter lesson").
| samrus wrote:
      | The LLMs don't have RL baked into them. They need that at
| the token prediction level to be able to do the sort of
| things humans can do
| vonneumannstan wrote:
| This is an uninformed take. Much of the improvement in
| performance of LLM based models has been through RLHF and
| other RL techniques.
| mbesto wrote:
| > This is an uninformed take.
|
| You may disagree with this take but its not uninformed.
| Many LLMs use self-supervised pretraining followed by RL-
| based fine-tuning but that's essentially it - it's fine
| tuning.
| skurilyak wrote:
| Besides a "reference manual", Claude Skills is analogous to a
| "toolkit with an instruction manual" in that it includes both
| instructions (manuals) and executable functions (tools/code)
| ChadMoran wrote:
| This is the crux of knowledge/tool enrichment in LLMs. The idea
| that we can have knowledge bases and LLMs will know WHEN to use
| them is a bit of a pipe dream right now.
| fragmede wrote:
| Can you be more specific? The simple case seems to be solved,
| eg if I have an mcp for foo enabled and then ask about a list
| of foo, Claude will go and call the list function on foo.
| corytheboyd wrote:
| > [...] and then ask about a list of foo
|
| Not OP, but this is the part that I take issue with. I want
| to forget what tools are there and have the LLM figure out
| on its own which tool to use. Having to remember to add
| special words to encourage it to use specific tools
| (required a lot of the time, especially with esoteric
| tools) is annoying. I'm not saying this renders the whole
| thing "useless" because it's good to have some idea of what
| you're doing to guide the LLM anyway, but I wish it could
| do better here.
| ChadMoran wrote:
| It doesn't reliably do it. You need to inject context into
| the prompt to instruct the LLM to use tools/kb/etc. It
        | isn't deterministic when/if it will follow through.
| fridder wrote:
| All of these random features are just pushing me further
| towards model-agnostic tools like goose
| xpe wrote:
| Thanks for sharing goose.
|
| This phase of LLM product development feels a bit like the
| Tower of Babel days with Cloud services before wrapper tools
| became popular and more standardization happened.
| cesarvarela wrote:
| I wonder how much this affects the model's performance. I
| imagine Anthropic trains its models to use a generic set of
| tools, but they can also lean on their specific tool
| definitions to save the agent from having to guess which tool
| for what.
| asdev wrote:
| I wonder what the accuracy is for Claude to always follow a Skill
| accurately. I've had trouble getting LLMs to follow specific
| workflows 100% consistently without skipping or missing steps.
| rob wrote:
| Subagents, plugins, skills, hooks, mcp servers, output styles,
| memory, extended thinking... seems like a bunch of stuff you can
| configure in Claude Code that overlap in a lot of areas. Wish
| they could figure out a way to simplify things.
| singularity2001 wrote:
  | Also the post does not contain a single word about how it
  | relates to the very similar agents in Claude Code. Capabilities,
| connectors, tasks, apps, custom-gpts, ... the space needs some
| serious consolidation and standardization!
|
| I noticed the general tendency for overlap also when trying to
| update claude since 3+ methods conflicted with each other
| (brew, curl, npm, bun, vscode).
|
| Might this be the handwriting of AI? ;)
| kordlessagain wrote:
| The post is simply "here's a folder with crap in it I may or
| may not use".
| CuriouslyC wrote:
| My agent has handlebars system prompts that you can pass
| variables at orchestration time. You can cascade imports and
| such, it's really quite powerful; a few variables can result in
  | a radically different system prompt.
| _greim_ wrote:
| > Developers can also easily create, view, and upgrade skill
| versions through the Claude Console.
|
| For coding in particular, it would be super-nice if they could
| just live in a standard location in the repo.
| GregorStocks wrote:
| Looks like they do:
|
| > You can also manually install skills by adding them to
| ~/.claude/skills.
| deeviant wrote:
| Basically just rules/workflows from cursor/windsurf, but with a
| UI.
| pixelpoet wrote:
| Aside: I really love Anthropic's design language, so beautiful
| and functional.
| maigret wrote:
| Yes and fantastically executed, consistently through all their
| products and website - desktop, command line, third parties and
| more.
| lukev wrote:
| I agree 100%, except for the logo, which persistently looks
| like something they... probably did not intend.
| nozzlegear wrote:
| I always thought of it as an ink blot. Until now.
| micromacrofoot wrote:
| a helpful reminder that these things often speak from their
| asses
| jasonthorsness wrote:
| When the skill is used locally in Claude Code does it still run
| in a virtual machine? Like some sort of isolation container with
| the target directory mounted?
| xpe wrote:
| Better when blastin' Skills by Gang Starr (headphones recommended
| if at work):
|
| https://www.youtube.com/watch?v=Lgmy9qlZElc
| 999900000999 wrote:
| Can I just tell it to read the entire Godot source repo as a
| skill ?
|
| Or is there some type of file limit here. Maybe the context
| windows just aren't there yet, but it would be really awesome if
| coding agents would stop trying to make up functions.
| s900mhz wrote:
| Download the godot docs and tell the skill to use them. It
| won't be able to fit the entire docs in the context but that's
| not the point. Depending on the task it will search for what it
| needs
| dearilos wrote:
| We're trying to solve a similar problem at wispbit - this is an
| interesting way to do it!
| CuriouslyC wrote:
| Anything the model chooses to use is going to waste context and
| get utilized poorly. Also, the more skills you have, the worse
| they're going to be. It's subagents v2.
|
| Just use slash commands, they work a lot better.
| just-working wrote:
| I simply do not care about anything AI now. I have a severe
| revulsion to it. I miss the before times.
| sega_sai wrote:
| There seems to be a lot of overlap of this with MCP tools. Also
| presumably if there are a lot of skills, they will be too big for
| the context and one would need some way to find the right one. It
| is unclear how well this approach will scale.
| rahimnathwani wrote:
| Anthropic talks about 'progressive disclosure'.
|
| If you have a large number of skills, you could group them into
| a smaller number of skills each with subskills. That way not
| all the (sub)skill descriptions need to be loaded into context.
|
| For example, instead of having a 'PDF editing' skill, you can
| have a 'file editing' skill that, when loaded into context,
| tells the LLM what type of files it can operate on. And then
| the LLM can ask for the info about how to do stuff with PDF
| files.
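  |
  | A rough sketch of that layered layout (file names invented
  | for illustration):
  |
  |       file-editing/
  |         SKILL.md      <- one-line description, always in
  |                          context
  |         pdf/
  |           GUIDE.md    <- loaded only when Claude decides a
  |                          PDF task needs it
  |           extract.py
  |         xlsx/
  |           GUIDE.md
  |
  | Each level only costs context once the level above decides
  | it is relevant.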
| guluarte wrote:
| great! another set of files the models will completely ignore
| like CLAUDE.md
| simonw wrote:
| I accidentally leaked the existence of these last Friday, glad
| they officially exist now!
| https://simonwillison.net/2025/Oct/10/claude-skills/
| buildbot wrote:
| "So I fired up a fresh Claude instance (fun fact: Code
| Interpreter also works in the Claude iOS app now, which it
| didn't when they first launched) and prompted:
|
| Create a zip file of everything in your /mnt/skills folder"
|
| It's a fun, terrifying world that this kind of "hack" to
| exfiltrate data is possible! I hope it does not have full
| filesystem/bin access, lol. Can it SSH?...
| antiloper wrote:
| What's the hack? Instead of typing `zip -r mnt.zip /mnt` into
| bash, you type `Create a zip file of /mnt` in claude code.
| It's the same thing running as the same user.
| tgtweak wrote:
| Skills run remotely in the llm environment, not locally on
| your system running claude - worth noting.
| skylurk wrote:
| Woah, Jesse's blog has really come alive lately. Thanks for
| highlighting this post.
| sva_ wrote:
| All this AI, and yet it can't render properly on mobile.
| mikkupikku wrote:
| I'd love a Skill for effective use of subagents in Claude Code.
| I'm still struggling with that.
| arjie wrote:
| It's pretty neat that they're adding these things. In my
| projects, I have a `bin/claude` subdirectory where I ask it to
| put scripts etc. that it builds. In the claude.md I then note
| that it should look there for tools. It does a pretty good job of
| this. To be honest, the thing I most need are context-management
| helpers like "start a claude with this set of MCPs, then that
| set, and so on". Instead right now I have separate subdirectories
| that I then treat as projects (which are supported as profiles in
| Claude) which I then launch a `claude` from. The advantage of the
| `bin/claude` in each of these things is that it functions as a
| longer-cycle learning thing. My Claude instantly knows how to
| analyze certain BigQuery datasets and where to find the
| credentials file and so on.
|
| Filesystem as profile manager is not something I thought I'd be
| doing, but here we are.
| tomComb wrote:
| > the thing I most need are context-management helpers like
| "start a claude with this set of MCPs, then that set, and so
| on".
|
| Isn't that sub agents?
| arjie wrote:
| Ah, in my case, I want to just talk to a video-editing
| Claude, and then a sys-admin Claude, and so on. I don't want
| to go through a main Claude who will instantiate these guys.
| I want to talk to the particular Claudes myself. But if sub-
| agents work for this, then maybe I just haven't been using
| them well.
| iyn wrote:
| Does anyone know how skills relate to subagents? Seems that
| subagents have more capabilities (e.g. can access the internet)
| but seems that there's a lot of overlap.
|
| I've asked Claude and this is what it answered:
|
|       Skills = Instructions + resources for the current Claude
|       instance (shared context)
|       Subagents = Separate AI instances with isolated contexts
|       that can work in parallel (different context windows)
|
|       Skills make Claude better at specific tasks. Subagents
|       are like having multiple specialized Claudes working
|       simultaneously on different aspects of a problem.
|
| I imagine we can probably compose them, e.g. invoke subagents (to
| keep separate context) which could use some skills to in the end
| summarize the findings/provide output, without "polluting" the
| main context window.
| lukev wrote:
| How this reads to me is that a skill is "just" a bundle of
| prompts, scripts, and files that can be read into context as a
| unit.
|
  | Having a sub-agent "execute" a skill makes a lot of sense from
  | a context management perspective, but I think the way to think
| about it is that a sub-agent is an "execution-level" construct,
| whereas a skill is a "data-level" construct.
| throwup238 wrote:
| Skills can also contain scripts that can be executed in a VM.
| The Anthropic engineering blog mentions that you can specify
| in the markdown instructions whether the script should be
| executed or read into context. One of their examples is a
| script to extract properties from a PDF file.
| jstummbillig wrote:
| ELI5: How is a skill different from a tool?
| notepad0x90 wrote:
| Just me or is anthropic doing a lot better of a job at marketing
| than openai and google?
| reed1234 wrote:
| It's much more focused on devs I feel like. Less fluff
| lquist wrote:
| lol how is this not optimized for mobile
| emadabdulrahim wrote:
| So skills are basically preset system prompts, assuming different
| roles etc? Or is there more to it.
|
| I'm a little confused.
| imiric wrote:
| Right, that's my interpretation as well.
|
| "AI" companies have reached the end of the road when it comes
| to throwing more data and compute at the problem. The only way
| now for charts to go up and to the right is to deliver value-
| added services.
|
| And, to be fair, there's a potentially long and profitable road
| by doing good engineering work that was needed anyways.
|
| But it should be obvious to anyone within this bubble that this
| is not the road to "superintelligence" or "AGI". I hope that
| the hype and false advertising stops soon, so that we can focus
| on practical applications of this technology, which are
| numerous.
| JyB wrote:
| I'm super confused as well. This seems like exactly that, just
| some default prompt injections to chose from. I guess I kinda
| understand them in the context of their claude chat UI product.
|
  | But I don't understand why it's a thing in Claude Code, though,
  | when we already have Claude.md? You could also just point to
  | any .md file in the prompt as a preamble, but that's not even
  | needed.
| https://www.anthropic.com/engineering/claude-code-best-pract...
|
  | That concept is also already perfectly spec'd in the MCP
| standard right? (Although not super used I think?)
| https://modelcontextprotocol.io/specification/2025-06-18/ser...
| chickensong wrote:
| Claude.md gets read every time and eats context, while it
| sounds like the skills are read as-needed, saving context.
| pollinations wrote:
  | Plus executable code snippets. I think their actual source code
| doesn't use context. But feels like function calling packaged.
| mercurialsolo wrote:
| Sub agents, mcp, skills - wonder how are they supposed to
| interact with each other?
|
| Feels like a fair bit of overlap here. It's ok to proceed in a
| direction where you are upgrading the spec and enabling Claude
| with additional capabilities. But one can pretty much use any of
| these approaches and end up with the same capability for an
| agent.
|
| Right now feels like a ux upgrade from mcp where you need a json
| but instead can use a markdown in a file / folder and provide
| multi-modal inputs.
| JyB wrote:
| Claude Skills just seem to be the same as MCP prompts:
| https://modelcontextprotocol.io/specification/2025-06-18/ser...
|
| I don't really see why they had to create a different concept.
| Maybe makes sense "marketing-wise" for their chat UI, but in
| Claude Code? Especially when CLAUDE.md is a thing?
| datadrivenangel wrote:
| Yeah how is this different from MCP prompts?
| pizza wrote:
    | Narrowly focused semantics/affordances (for both LLMs and
    | users/future package managers/communities), ease of
    | redistribution, and context management:
    |
    | - skills are plain files that are injected contextually,
    | whereas prompts would come with the overhead of live, running
    | code that has to be installed just right into your particular
    | env, to provide a whole MCP server. Tbh prompts also seem to
    | be more about literal prompting, too
|
| - you could have a thousand skills folders for different
| softwares etc but good luck with having more than a few mcp
| servers that are loaded into context w/o it clobbering the
| context
| jjfoooo4 wrote:
| I see this as a lower overhead replacement for MCP. Rather
| than managing a bunch of MCP's, use the directory structure
| to your advantage, leverage the OS's capability to execute
| JyB wrote:
| I think you are right.
| ebonnafoux wrote:
| For me the concept of MCP was to have a client/server
| relation. For skills everything will be local.
| pattobrien wrote:
    | MCP Prompts are meant to be _user triggered_, whereas I
    | believe a Skill is meant to be an LLM-triggered, use-case
    | centric set of instructions for a specific task.
    |
    |       - MCP Prompt: "Please solve GitHub Issue #{issue_id}"
    |       - Skills:
    |         - React Component Development (React best
    |           practices, accessible tools)
    |         - REST API Endpoint Development
    |         - Code Review
    |
    | This will probably result in:
    |
    | - Single "CLAUDE.md" instructions are broken out into
    |   discoverable instructions that the LLM will dynamically
    |   utilize based on the user's prompt
    | - Rather than having direct access to Tools, Claude will
    |   always need to go through Skill instructions first (making
    |   context tighter since it can't use Tools without
    |   understanding *how* to use them to achieve a certain goal)
    | - Clients will be able to add infinite MCP servers / tools,
    |   since the Tools themselves will no longer all be added to
    |   the context window
|
| It's basically a way to decouple User prompts from direct raw
| Tool access, which actually makes a ton of sense when you
| think of it.
| fny wrote:
| I fear the conceptual churn we're going to endure in the coming
| years will rival frontend dev.
|
| Across ChatGPT and Claude we now have tools, functions, skills,
| agents, subagents, commands, and apps, and there's a
| metastasizing complex of vibe frameworks feeding on this mess.
| LPisGood wrote:
| Metastasizing is such an excellent way to describe this
| phenomenon. They grow on top of each other.
| hkt wrote:
| The same thing will happen: skilled people will do one thing
| well. I've zero interest in anything but Claude code in a dev
| container and, while mindful of the lethal trifecta, will give
| Claude as much access to a local dev environment and it's
| associated tooling as I would give to a junior developer.
| mathattack wrote:
| There's so much white space - this is the cost of a brand new
| technology. Similar issues with figuring out what cloud tools
| to use, or what python libraries are most relevant.
|
| This is also why not everyone is an early adopter. There are
| mental costs involved in staying on top of everything.
| benterix wrote:
| > This is also why not everyone is an early adopter.
|
| Usually, there are relatively few adopters of a new
| technology.
|
| But with LLMs, it's quite the opposite: there was a huge
| number of early adopters. Some got extremely excited and run
| hundreds of agents all the time, some got burned and went
| back to the good old ways of doing things, whereas the
| majority is just using LLMs from time to time for various
    | tasks, bigger or smaller.
| a4isms wrote:
| I follow your reasoning. If we just look at businesses, and
| we include every business that pays money for AI and one or
      | more employees use AI to do their jobs, then we're in
| the Early Majority phase, not the Innovator or Early
| Adopter phases.
|
      | https://en.wikipedia.org/wiki/Technology_adoption_life_cycle
| mathattack wrote:
| There's early adoption from individuals. Much less from
| enterprises. (They're buying site licenses, but not re-
| engineering their company processes)
| kbar13 wrote:
| i'm letting the smarter folks figure all this out and just
| picking the tools i like every now and then. i like just using
| claude code with vscode and still doing some things manually
| efields wrote:
| same same
| esafak wrote:
| On the other hand, this complexity represents a new niche that,
| for a while at least, will present job and business
| opportunities.
| Trias11 wrote:
| Right.
|
| I focus on building projects delivering some specific business
| value and pick the tools that gets me there.
|
| There is zero value in spending cycles by engaging in new tools
| hype.
| dalmo3 wrote:
| For Cursor: cursorrules, mdc rules, user rules, team rules.
| catgary wrote:
| These companies are also biased towards solutions that will
| more-or-less trap you in a heavily agent-based workflow.
|
| I'm surprised/disappointed that I haven't seen any papers out
| of the programming languages community about how to integrate
| agentic coding with compilers/type system features/etc. They
| really need to step up, otherwise there's going to be a lot of
| unnecessary CO2 produced by tools like this.
| awb wrote:
| Hopefully there's a similar "don't make me think" mantra that
| comes to AI product design.
|
| I like the trend where the agent decides what models, tooling
| and thought process to use. That seems to me far more powerful
| than asking users to create solutions for each discreet problem
| space.
| kingkongjaffa wrote:
| Where I've seen it be really transformative is giving it
| additive tools that are multiplicative in utility. So like
| giving an LLM 5 primitive tools for a specific domain and the
| agent figuring out how to use them together and chain them
| and run some tools multiple times etc.
| iLoveOncall wrote:
| Except in reality it's ALL marketing terms for 2 things:
| additional prompt sections, and APIs.
| james_marks wrote:
| I more or less agree, but it's surprising what naming a
| concept does for the average user.
|
| You see a text file and understand that it can be anything,
| but end users can't/won't make the jump. They need to see the
| words Note, Reminder, Email, etc.
| butlike wrote:
| Just wait until I can pull in just the concepts I want with
| "GPT Package Manager." I can simply call `gptpm add skills` and
| the LLM package manager will add the Skills package to my GPT.
| What could go wrong?
| libraryofbabel wrote:
| You forgot mcp-everything!
|
| Yes, it's a mess, and there will be a lot of churn, you're not
| wrong, but there are foundational concepts underneath it all
| that you can learn and then it's easy to fit insert-new-feature
| into your mental model. (Or you can just ignore the new
| features, and roll your own tools. Some people here do that
| with a lot of success.)
|
| The foundational mental model to get the hang of is really
| just:
|
| * An LLM
|
| * ...called in a loop
|
| * ...maintaining a history of stuff it's done in the session
| (the "context")
|
| * ...with access to tool calls to do things. Like, read files,
| write files, call bash, etc.
|
| Some people call this "the agentic loop." Call it what you
| want, you can write it in 100 lines of Python. I encourage
| every programmer I talk to who is remotely curious about LLMs
| to try that. It is a lightbulb moment.
|
| Once you've written your own basic agent, if a new tool comes
| along, you can easily demystify it by thinking about how you'd
| implement it yourself. For example, Claude Skills are really
| just:
|
| 1) Skills are just a bunch of files with instructions for the
| LLM in them.
|
| 2) Search for the available "skills" on startup and put all the
| short descriptions into the context so the LLM knows about
| them.
|
| 3) Also tell the LLM how to "use" a skill. Claude just uses the
| `bash` tool for that.
|
| 4) When Claude wants to use a skill, it uses the "call bash"
| tool to read in the skill files, then does the thing described
| in them.
|
| and that's more or less it, glossing over a lot of things that
| are important but not foundational like ensuring granular tool
| permissions, etc.
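  |
  | If you want to see how little code that is, here's a rough
  | sketch of the loop in Python with the Anthropic SDK (no
  | permissions, error handling, or streaming; the model id is a
  | placeholder):
  |
  |     # a minimal "agentic loop": LLM + history + one bash tool
  |     import subprocess, anthropic
  |
  |     client = anthropic.Anthropic()
  |     tools = [{
  |         "name": "bash",
  |         "description": "Run a shell command",
  |         "input_schema": {
  |             "type": "object",
  |             "properties": {"command": {"type": "string"}},
  |             "required": ["command"],
  |         },
  |     }]
  |
  |     history = [{"role": "user", "content": input("> ")}]
  |     while True:
  |         resp = client.messages.create(
  |             model="claude-sonnet-4-5",   # placeholder id
  |             max_tokens=4096,
  |             tools=tools,
  |             messages=history,
  |         )
  |         # keep the assistant turn in the history ("context")
  |         history.append({"role": "assistant",
  |                         "content": resp.content})
  |         if resp.stop_reason != "tool_use":
  |             print(resp.content[0].text)
  |             history.append({"role": "user",
  |                             "content": input("> ")})
  |             continue
  |         # run each requested tool call and feed results back
  |         results = []
  |         for block in resp.content:
  |             if block.type != "tool_use":
  |                 continue
  |             out = subprocess.run(
  |                 block.input["command"], shell=True,
  |                 capture_output=True, text=True)
  |             results.append({
  |                 "type": "tool_result",
  |                 "tool_use_id": block.id,
  |                 "content": out.stdout + out.stderr,
  |             })
  |         history.append({"role": "user", "content": results})
  |
  | Skills don't change this loop at all; they just change what
  | text ends up in `history`.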
| Der_Einzige wrote:
| Tool use is only good with structured/constrained generation
| libraryofbabel wrote:
| You'll need to expand on what you mean, I'm afraid.
| AStrangeMorrow wrote:
| I think, from my experience, what they mean is tool use
| is as good as your model capability to stick to a given
| answer template/grammar. For example if it does tool
| calling using a JSON format it needs to stick to that
| format, not hallucinate extra fields and use the existing
| fields properly. This has worked for a few years and LLMs
| are getting better and better but the more tools you
| have, the more parameters your functions to call can have
| etc the higher the risk of errors. You also have systems
| that constrain the whole inference itself, for example
| with the outlines package, by changing the way tokens are
| sampled (this way you can force a model to stick to a
| template/grammar, but that can also degrade results in
| some other ways)
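        |
        | For a concrete picture, a tool definition is usually
        | just a JSON schema like this (hypothetical names), and
        | "sticking to the grammar" means the emitted call has
        | to match it exactly:
        |
        |     # hypothetical tool schema, JSON-schema style
        |     get_issue = {
        |         "name": "get_issue",
        |         "description": "Fetch one issue by id",
        |         "input_schema": {
        |             "type": "object",
        |             "properties": {
        |                 "issue_id": {"type": "integer"},
        |             },
        |             "required": ["issue_id"],
        |         },
        |     }
        |     # a drifting model might emit {"id": "42"}
        |     # instead of {"issue_id": 42}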
| libraryofbabel wrote:
| I see, thanks for channeling the GP! Yeah, like you say,
| I just don't think getting the tool call template right
| is really a problem anymore, at least with the big-labs
| SotA models that most of us use for coding agents. Claude
| Sonnet, Gemini, GPT-5 and friends have been heavily
| heavily RL-ed into being really good at tool calls, and
| it's all built into the providers' apis now so you never
| even see the magic where the tool call is parsed out of
| the raw response. To be honest, when I first read about
| tools calls with LLMs I thought, "that'll never work
| reliably, it'll mess up the syntax sometimes." But in
| practice, it does work. (Or, to be more precise, if the
| LLM ever does mess up the grammar, you never know because
| it's able to seamlessly retry and correct without it ever
| being visible at the user-facing api layer.) Claude Code
| plugged into Sonnet (or even Haiku) might do hundreds of
| tool calls in an hour of work without missing a beat. One
| of the many surprises of the last few years.
| dlivingston wrote:
| > Call it what you want, you can write it in 100 lines of
| Python. I encourage every programmer I talk to who is
| remotely curious about LLMs to try that. It is a lightbulb
| moment.
|
| Definitely want to try this out. Any resources / etc. on
| getting started?
| libraryofbabel wrote:
| This is the classic blog post, by Thorsten Ball, from way
| back in the AI Stone Age (April this year):
| https://ampcode.com/how-to-build-an-agent
|
| It uses Go, which is more verbose than Python would be, so
| he takes 300 lines to do it. Also, his edit_file tool could
| be a lot simpler (I just make my minimal agent "edit" files
| by overwriting the entire existing file).
|
| I keep meaning to write a similar blog post with Python, as
| I think it makes it even clearer how simple the stripped-
| down essence of a coding agent can be. There is magic, but
| it all lives in the LLM, not the agent software.
| judahmeek wrote:
| > I keep meaning to write a similar blog post with
| Python...
|
| Just have your agent do it.
| libraryofbabel wrote:
| I could, but I'm actually rather snobbish about my
| writing and don't believe in having LLMs write first
| drafts (for proofreading and editing, they're great).
|
| (I am not snobbish about my code. If it works and is
| solid and maintainable I don't care if I wrote it or not.
| Some people seem to feel a sense of loss when an LLM
| writes code for them, because of The Craft or whatever.
| That's not me; I don't have my identity wrapped up in my
| code. Maybe I did when I was more junior, but I've been
| in this game long enough to just let it go.)
| ibejoeb wrote:
| Pretty true, and definitely a good exercise. But if we're
    | going to actually use these things in practice, you need more.
| Things like prompt caching, capabilities/constraints, etc.
| It's pretty dangerous to let an agent go hog wild in an
| unprotected environment.
| libraryofbabel wrote:
| Oh sure! And if I was talking someone through building a
| barebones agent, I'd definitely tag on a warning along the
| lines of "but don't actually use this without XYZ!" That
| said, you can add prompt caching by just setting a couple
| of parameters in the api calls to the LLM. I agree
| constraints is a much more complex topic, although even in
| my 100-line example I am able to fit in a user approval
| step before file write or bash actions.
| apsurd wrote:
| when you say prompt caching, does it mean cache the thing
| you send to the llm or the thing you get back?
|
| sounds like prompt is what you send, and caching is
| important here because what you send is derived from
| previous responses from llm calls earlier?
|
| sorry to sound dense, I struggle to understand where and
| how in the mental model the non-determinism of a response
| is dealt with. is it just that it's all cached?
| libraryofbabel wrote:
| Not dense to ask questions! There are two separate
| concepts in play:
|
| 1) Maintaining the state of the "conversation" history
| with the LLM. LLMs are stateless, so you have to store
| the entire series of interactions on the client side in
| your agent (every user prompt, every LLM response, every
| tool call, every tool call result). You then send the
| entire previous conversation history to the LLM every
| time you call it, so it can "see" what has already
| happened. In a basic agent, it's essentially just a big
| list of strings, and you pass it into the LLM api on
| every LLM call.
|
| 2) "Prompt caching", which is a clever optimization in
| the LLM infrastructure to take advantage of the fact that
| most LLM interactions involve processing a lot of
| unchanging past conversation history, plus a little bit
| of new text at the end. Understanding it requires
| understanding the internals of LLM transformer
| architecture, but the essence of it is that you can save
| a lot of GPU compute time by caching previous result
| states that then become intermediate states for the next
| LLM call. You cache on the entire history: the base
| prompt, the user's messages, the LLM's responses, the
| LLM's tool calls, everything. As a user of an LLM api,
| you don't have to worry about how any of it works under
| the hood, you just have to enable it. The reason to turn
| it on is it dramatically increases response time and
| reduces cost.
|
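          | A rough sketch of both pieces with the Anthropic
          | Python SDK (parameter values are placeholders):
          |
          |     import anthropic
          |     client = anthropic.Anthropic()
          |
          |     SYSTEM = "...big static prompt..."
          |     history = []   # resent in full every call
          |
          |     def ask(text):
          |         history.append(
          |             {"role": "user", "content": text})
          |         resp = client.messages.create(
          |             model="claude-sonnet-4-5",  # placeholder
          |             max_tokens=1024,
          |             # mark the static prefix as cacheable
          |             system=[{
          |                 "type": "text",
          |                 "text": SYSTEM,
          |                 "cache_control":
          |                     {"type": "ephemeral"},
          |             }],
          |             messages=history,
          |         )
          |         history.append({"role": "assistant",
          |                         "content": resp.content})
          |         return resp
          |
          | The cache only skips re-processing the unchanged
          | prefix; the model's answers are still generated
          | fresh each time.
          |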
| Hope that clarifies!
| __loam wrote:
| Langchain was the original sin of thin framework bullshit
| kelvinjps10 wrote:
  | I found that the way Claude now handles tools on my system
  | simplifies stuff; with its CLI usage, I find the Claude skills
  | model better than MCP
| lukev wrote:
| The cool part is that none of any of this is actually that big
| or difficult. You can master it on-demand, or build your own
| substitutes if necessary.
|
| Yeah, if you chase buzzword compliance and try to learn all
| these things outside of a particular use case you're going to
| burn out and have a bad time. So... don't?
| siva7 wrote:
| It feels like every week these companies release some new
| product that feels very similar to what they released a week
| before. Can the employees at Anthropic themselves even tell
| what the difference is?
| amelius wrote:
| These products are all cannibalizing each other, so it's a bad
| strategy.
| zmmmmm wrote:
| Yep, the ecosystem is well on its way to collapsing under its
| own weight.
|
| You have to remember, every system or platform has a total
| complexity budget that sits at the limit of what a broad
| spectrum of people can effectively incorporate into their
| day-to-day working memory. How it gets spent is absolutely
| crucial. When a platform vendor adds a new piece of complexity,
| it comes from the same budget that could have been devoted to
| things built on the platform. But unlike things built on the
| platform, it's there whether developers like it and use it or
| not. It's common these days that providers binge on ecosystem
| complexity because they think it's building differentiation,
| when in fact it's building huge barriers to the exact audience
| they need to attract to scale up their customer base, and
| subtracting from the value of what can actually be built _on_
| their platform.
|
| Here you have a highly overlapping duplicative concept that's
| taking a solid chunk of new complexity budget but not really
| adding a lot of new capability in return. I am sure the people
| who designed it think they are reducing complexity by adding a
| "simple" new feature that does what people would otherwise have
| to learn themselves. It's far more likely that, in doing this,
| they are at best breaking even on how many people they deter
| vs. attract to their platform.
| josefresco wrote:
| I just tested the canvas-design skill and the results were
| pretty awful.
|
| This is the skill description:
|
| Create beautiful visual art in .png and .pdf documents using
| design philosophy. You should use this skill when the user asks
| to create a poster, piece of art, design, or other static piece.
| Create original visual designs, never copying existing artists'
| work to avoid copyright violations.
|
| What it created was an abstract, art-museum-esque poster with
| random shapes and no discernible message. It may have been trying
| to design a playing card, but it just failed miserably, which is
| my experience with most AI image generators.
|
| It certainly spent a lot of time and effort creating the
| poster. It asked initial questions, developed a plan, did
| research, and created tooling - which seems like a waste of
| "tokens" given how simple and lame the resulting image turned out.
|
| Also after testing I still don't know how to "use" one of these
| skills in an actual chat.
| taejavu wrote:
| If you want to generate images, use Midjourney or whatever.
| It's almost like you've deliberately missed the point of the
| feature.
| jedisct1 wrote:
| Too many options, this is getting very confusing.
|
| Roo Code just has "modes", and honestly, this is more than
| enough.
| rohan_ wrote:
| Cursor launched this a while ago with "Cursor Rules"
| radley wrote:
| It will be interesting to see how this is structured. I was
| already doing something similar with Claude Projects &
| Instructions, MCP, and Obsidian. I'm hoping that Skills can
| cascade (from general to specific) and/or be combined between
| projects.
| datadrivenangel wrote:
| So sort of like MCP prompt templates except not prompt templates?
| laurentiurad wrote:
| AGI nowhere near
| skylurk wrote:
| I know I'm replying to a shitpost. But I had a realisation, and
| I'm probably not the only one.
|
| If you can manage to keep structuring slightly intelligent
| tools so that they compound, it seems like AGI is achievable.
|
| That's why the thing everyone is after right now is new ways to
| make those slight intelligences keep compounding.
|
| Just like repeated multiplication of 1.001 grows indefinitely.
| gigatree wrote:
| But how often can you repeat the multiplication when the
| repetitions are unsustainable?
| skylurk wrote:
| Yeah, sometimes it feels like we're just layering
| unintelligent things, with compounding unintelligence...
|
| But starting earlier this year, I've started to see
| glimpses of what seems like intelligence (to me) in the
| tools, so who knows.
| Lionga wrote:
| I know I'm replying to a shitpost. Well enough said.
| robwwilliams wrote:
| Could be helpful. I often edit scientific papers and grant
| applications. Orienting Claude on the frontend of each project
| works but an "Editing Skill" set could be more general and make
| interactions with Claude more clued in to goals instead of
| starting stateless.
| mercurialsolo wrote:
| One sharp contrast I do see between OpenAI and Anthropic is how
| the product extensions are built around their flagship products.
|
| OpenAI ships extensions for ChatGPT that plug into the consumer
| experience. Anthropic ships extensions (made for builders) into
| Claude Code that feel more DX-oriented.
| corytheboyd wrote:
| I'll give it a fair go, but how is it not going to have the same
| problem of _maybe_ using MCP tools? The same problem of trying to
| add to your prompt "only answer if you are 100% correct"? A skill
| just sounds like more markdown that is fed into context, but with
| a cool name that sounds impressive, and some indexing of the
| defined skills on start (same as MCP tools?)
| butlike wrote:
| Great, so now I can script the IDE...err, I mean LLM. I can't
| help but feel like we've been here before, and the magic is
| wearing thin.
| gloosx wrote:
| wow, this news post layout is not fitting the screen on mobile...
| Couldn't these 10x programmers vibecode a proper mobile version?
| thorio wrote:
| How about using some of those skills to make that page
| mobile-ready...
| I_am_tiberius wrote:
| Every release from these companies makes me angry because I know
| they take advantage of all the people who release content to the
| public. They just consume and take the profit. In addition to
| that Anthropic has shown that they don't care about our privacy
| AT ALL.
| mercurialsolo wrote:
| The way this is headed - I also see a burgeoning class of tools
| emerging. MCP servers, Skill managers, Sub-Agent builders. Feels
| like the patterns and protocols need a clearer explanation of how
| they synthesize into a practical dev (extension) toolkit that is
| useful across multiple surfaces, e.g. chat vs coding vs media gen.
| actinium226 wrote:
| It's an interesting idea (among many) to try to address the
| problem of LLMs getting off task, but I notice that there's no
| evaluation in the blog post. Like, ok cool, you've added
| "skills," but is there any evidence that they're useful or are we
| just grasping at straws here?
| titzer wrote:
| While not generally a bad idea, I find it amusing that they are
| reinventing shared libraries where the code format is...English.
| So the obvious next step is "precompiling" skills to a form that
| is better for Claude internally.
|
| ...which would be great if the (likely binary) format of that was
| used internally, but something tells me an architectural screwup
| will lead to leaking the binaries and we'll have a dependency on
| a dumb inscrutable binary format to carry forward...
| tgtweak wrote:
| In time (and not even that far out), LLMs will be able to churn out
| their own "skills" using their sandbox code environments - and
| possibly recycle them through context on a per-user basis.
|
| While I like the flexibility of deploying your own skills to
| claude for use org-wide, this really feels like what MCP should
| be for that use case, or what built-in analysis sandbox should
| be.
|
| We haven't even gone mainstream with MCP and there are already 10
| stand-ins doing roughly the same thing with a different twist.
|
| I would have honestly preferred they called this embedded MCP
| instead of 'skills'.
| _pdp_ wrote:
| I predict there will be some sort of open-source package manager
| project soon. Download skills from some 3rd-party website and run
| them inside Claude. The supply-chain risks will be obvious, but
| nobody will care - at least not in the short term.
| nextworddev wrote:
| What is this, tools for Claude web app?
| XCSme wrote:
| Isn't this just RAG?
| jrh3 wrote:
| The tools I build for Claude Code keep reducing back to just
| using Claude Code and watching Anthropic add what I need. This is
| my tool for brownfield projects with Claude Code. I added skills
| based on https://blog.fsck.com/2025/10/09/superpowers/
|
| https://github.com/RossH3/context-tree - Helps Claude and humans
| understand complex brownfield codebases through maintained
| context trees.
| simonw wrote:
| Just published this about skills: "Claude Skills are awesome,
| maybe a bigger deal than MCP"
|
| https://simonwillison.net/2025/Oct/16/claude-skills/
| pants2 wrote:
| Skills are cool, but to me it's more of a design pattern /
| prompt engineering trick than something in need of a hard spec.
| You can even implement it in an MCP - I've been doing it for a
| while: "Before doing anything, search the skills MCP and read
| any relevant guides."
| manbash wrote:
| I agree with you, but I also want to ask if I understand
| this correctly: there was a paradigm in which we were aiming
| for Small Language Models to perform specific types of tasks,
| orchestrated by the LLM. That is what I perceived the MCP
| architecture came to standardize.
|
| But here, it seems more like a diamond shape of information
| flow: the LLM processes the big task, then prompts are
| customized (not via LLM) with reference to the Skills, and
| then the customized prompt is fed yet again to the LLM.
|
| Is that the case?
| kingkongjaffa wrote:
| when do you need to make a skill vs a project?
| simonw wrote:
| In Claude and ChatGPT a project is really just a custom
| system prompt and an optional bunch of files. Those files are
| both searchable via tools and get made available in the Code
| Interpreter container.
|
| I see skills as something you might use inside of a project.
| You could have a project called "data analyst" with a bunch
| of skills for different aspects of that task - how to run a
| regression, how to export data from MySQL, etc.
|
| They're effectively custom instructions that are unlimited in
| size and that don't cause performance problems by clogging up
| the context - since the whole point of skills is they're only
| read into the context when the LLM needs them.
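|
| Mechanically it can be as simple as: put only the
| name/description metadata in context up front, and read the
| full SKILL.md when it becomes relevant. A rough sketch (the
| directory layout and naive frontmatter handling here are
| illustrative, not the actual implementation):
|
|     from pathlib import Path
|
|     SKILLS_DIR = Path("skills")  # skills/<name>/SKILL.md
|
|     def skill_index():
|         # cheap listing that goes into the system prompt
|         lines = []
|         for md in SKILLS_DIR.glob("*/SKILL.md"):
|             meta = md.read_text().split("---")[1]  # frontmatter
|             lines.append(f"## {md.parent.name}\n{meta.strip()}")
|         return "Available skills:\n" + "\n".join(lines)
|
|     def load_skill(name):
|         # full instructions, read into context only when needed
|         return (SKILLS_DIR / name / "SKILL.md").read_text()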
| timcobb wrote:
| then submit it, you don't need to post here about it
| hu3 wrote:
| I found it useful and constructive to post it here also.
|
| no reason not to.
| hu3 wrote:
| Do you reckon Skills overlap with AGENTS.md?
|
| VSCode recently introduced support for nested AGENTS.md, which,
| albeit less formal, might overlap:
|
| https://code.visualstudio.com/updates/v1_105#_support-for-ne...
| outlore wrote:
| I'm struggling to see how this is different from prepackaged
| prompts. Simon's article talks about skill metadata being used by
| the model to look up the full prompt as a way to save on context
| usage. That is analogous to the model calling --help when it
| needs to use a CLI tool without needing to load up the full man
| pages ahead of time.
|
| But couldn't an MCP server expose a "help" tool?
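|
| Something like this, say (a rough sketch using the Python MCP
| SDK's FastMCP helper; the topics and guide text are made up):
|
|     from mcp.server.fastmcp import FastMCP
|
|     mcp = FastMCP("guides")
|
|     @mcp.tool()
|     def skill_help(topic: str) -> str:
|         """Return a full guide, like calling --help on a CLI."""
|         guides = {
|             "pdf-forms": "To fill PDF forms, use pypdf: ...",
|             "mysql-export": "To export from MySQL, run: ...",
|         }
|         return guides.get(topic, "No guide for " + topic)
|
|     if __name__ == "__main__":
|         mcp.run()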
| throwup238 wrote:
| That's pretty much all it is. If you look at the docs it even
| uses a bash script to read the skill markdown files into the
| context.
|
| I think the big difference is that now you can include scripts
| in these skills that can be executed as part of the skill, in a
| VM on their servers.
| GoatInGrey wrote:
| It's the fact that a collection of files are tied to a specific
| task or action. Prompts are only injected context, whereas
| files can be more selectively loaded into context.
|
| What they're trying to do here is translate MCP servers to
| something more broadly useable by the population. They cannot
| differentiate themselves with model training anymore, so they
| have been focusing more and more on tooling development to grow
| revenue.
| kingkongjaffa wrote:
| What's the difference in use case between a claude-skill and
| making a task specific claude project?
| kristo wrote:
| How is this different from commands? They're automatically
| invoked? How does claude decide when to use a skill? How specific
| do I need to write my skill?
| stego-tech wrote:
| I'm kind of in stitches over this. Claude's "skills" are
| dependent upon developers writing competent documentation _and_
| keeping it up to date...which most seemingly can't even do for
| actual code they write, never mind a brute-force black box like an
| LLM.
|
| For those few who do write competent documentation _and_ have
| well-organized file systems _and_ the risk tolerance to allow
| LLMs to run roughshod over data, sure, there's some potential
| here. Though if you're already that far in, you'd likely be
| better off farming that grunt work out to a junior as a learning
| exercise than to an LLM, especially since you'll have to clean up
| the output anyhow.
|
| With the limited context windows of LLMs, you can never truly get
| this sort of concept to "stick" like you can with a human, and if
| you're training an agent for this specific task anyway, you're
| effectively locking yourself to that specific LLM in perpetuity
| rather than a replaceable or promotable worker.
|
| Just...it makes me giggle, how _optimistic_ they are that stars
| would align at scale like that in an organization.
| yodsanklai wrote:
| I'd like to fast forward to a time where these tools are stable
| and mature so we can focus on coding again
___________________________________________________________________
(page generated 2025-10-16 23:00 UTC)