[HN Gopher] We collected 10k hours of neuro-language data in our...
___________________________________________________________________
We collected 10k hours of neuro-language data in our basement
Author : nee1r
Score : 70 points
Date : 2025-12-08 17:33 UTC (5 hours ago)
(HTM) web link (condu.it)
(TXT) w3m dump (condu.it)
| ArjunPanicksser wrote:
| Makes sense that CL ends up being the best for recruiting first-
| time participants. Curious what other things you tried for
| recruitment and how useful they were?
| n7ck wrote:
| The second most useful by far is Indeed, where we post an
| internship opportunity for participants interested in doing 10
| sessions over 10 weeks. Other things that work pretty well are
| asking professors to send out emails to students at local
| universities, putting up ~300-500 fliers (mostly around
| universities and public transit), and posting on Nextdoor. We
| also just texted a lot of groupchats, posted on LinkedIn, and
| gave out fliers and the signup link to kind of everyone we
| talked to in cafes and similar. We take on some participants as
| ambassadors as well, and pay them to refer their friends.
|
| We tried google/facebook/instagram ads, and we tried paying for
| some video placements. Basically none of the explicit
| advertisement worked at all and it wasn't worth the money.
| Though for what it's worth, none of us are experts in
| advertising, so we might have been going about it wrong -- we
| didn't put loads of effort into iterating once we realized it
| wasn't working.
| mishajw wrote:
| Interesting dataset! I'm curious what kind of results you would
| get with just EEG, compared to multiple modalities? Why do
| multiple modalities end up being important?
| n7ck wrote:
| EEG has very good temporal resolution but quite bad spatial
| resolution, and other modalities have different tradeoffs.
| g413n wrote:
| what's the basis for conversion between hours of neural data to
| number of tokens? is that counting the paired text tokens?
| rio-popper wrote:
| Edit: oops, sorry, misread -- the neural data is tokenised by
| our embedding model. The number of tokens per second of neural
| data varies and depends on the information content.
| n7ck wrote:
| Hey I'm Nick, and I originally came to Conduit as a data
| participant! After my session, I started asking questions about
| the setup to the people working there, and apparently I asked
| good questions, so they hired me.
|
| Since I joined, we've gone from <1k hours to >10k hours, and I've
| been really excited by how much our whole setup has changed. I've
| been implementing lots of improvements to the whole data pipeline
| and the operations side. Now that we train lots of models on the
| data, the model results also inform how we collect data (e.g. we
| care a lot less about noise now that we have more data).
|
| We're definitely still improving the whole system, but at this
| point, we've learned a lot that I wish someone had told us when
| we started, so we thought we'd share it in case any of you are
| doing human data collection. We're all also very curious to get
| any feedback from the community!
| internet_points wrote:
| I thought that kind of career change only happened in The Sims
| :-)
| n7ck wrote:
| hahahah tell me about it!
| Gormisdomai wrote:
| The example sentences generated "only from neural data" at the
| top of this article seem surprisingly accurate to me, like, not
| exact matches but much better than what I would expect even from
| 10k hours:
|
| "the room seemed colder" -> " there was a breeze even a gentle
| gust"
| ninapanickssery wrote:
| Yeah, agreed
| jcims wrote:
| Exactly. And honestly both this example and the one about the
| woman seemed to be what I would actually think/feel vs what I
| say.
|
| Very interesting!
| CobrastanJorji wrote:
| Tangential to your point, if you collect 10,000 hours of brain
| scanning in exactly one damp basement, I wonder if perhaps the
| model would become very, very specialized for all of the
| flavors of "this room seems colder."
| rio-popper wrote:
| For the record, it was two basements -- we moved office in
| the middle -- and a bigger issue was actually overheating.
| But your point is basically right! The model is a lot better
| at certain kinds of ideas than others. Particularly
| concerning was the fact that the first cluster I noticed
| getting good was all the different variations of 'the headset
| is uncomfortable/heavy' etc. But this makes sense -- what
| participants talk about has a lot to do with what kinds of
| ideas the model can pick up, and this was more or less what
| we expected.
| ag8 wrote:
| This is a cool setup, but naively it feels like it would require
| hundreds of thousands of hours of data to train a decent
| generalizable model that would be useful for consumers. Are there
| plans to scale this up, or is there reason to believe that tens
| of thousands of hours are enough?
| n7ck wrote:
| Yeah I think the way we trained the embedding model focused a
| lot on how to make it as efficient as possible, since it is
| such a data-limited regime. So I think based on (early) scaling
| results, it'll be closer to 50-70k hours, which we should be
| able to get in the next few months now that we've already
| scaled up a lot.
|
| That said, the way to 10-20x data collection would be to open a
| couple other data collection centers outside SF, in high-
| population cities. Right now, there's a big advantage in just
| having the data collection totally in-house, because it's so
| much easier to debug and improve while we're this small. But
| now that we've mostly worked out the process, it should be very
| straightforward for us to just replicate the entire ops/data
| pipeline in 3-4 parallel data collection centers.
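The "(early) scaling results" extrapolation mentioned above can be sketched as a simple power-law fit: fit loss against hours of data on a log-log scale, then invert the fit to estimate the hours needed for a target loss. All numbers and names below are made up for illustration; the post doesn't publish Conduit's actual scaling data.

```python
import numpy as np

# Hypothetical early scaling points: (hours of neural data, eval loss).
hours = np.array([1_000, 2_500, 5_000, 10_000], dtype=float)
loss = np.array([1.00, 0.82, 0.70, 0.60])

# Power law loss = a * hours^b is linear in log-log space:
# log(loss) = b * log(hours) + log(a). polyfit returns [slope, intercept].
b, log_a = np.polyfit(np.log(hours), np.log(loss), 1)
a = np.exp(log_a)

def hours_for_loss(target):
    # Invert loss = a * hours^b  =>  hours = (target / a)^(1/b)
    return (target / a) ** (1.0 / b)

print(f"fitted exponent b = {b:.3f}")
print(f"hours to reach loss 0.45 ~ {hours_for_loss(0.45):,.0f}")
```

The exponent comes out negative (loss falls as data grows), and the inversion projects how far the data collection has to scale; with real scaling curves you'd also want uncertainty bands, since log-log extrapolation is sensitive to the last few points.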
| richardfeynman wrote:
| This is an interesting dataset to collect, and I wonder whether
| there will be applications for it beyond what you're currently
| thinking.
|
| A couple of questions: What's the relationship between the number
| of hours of neurodata you collect and the quality of your
| predictions? Does it help to get less data from more people, or
| more data from fewer people?
| n7ck wrote:
| 1. The predictions get better with more data - and we don't
| seem to be anywhere near diminishing returns.
|
| 2. The thing we care about is generalization between people.
| For this, less data from more people is much better.
| richardfeynman wrote:
| I noticed you tracked sessions per person, implying a subset
| of people have many hours of data collected on them. Are
| predictions for this subset better than the median?
|
| For a given amount of data, is it better to have more people
| with less data per person or fewer people with more data per
| person?
| clemvonstengel wrote:
| Yes, the predictions are much better for people with more
| hours of data in the training set. Usually, we just totally
| separate the train and val set, so no individual with any
| sessions in the train set is ever used for evals. When we
| instead evaluate on someone with 10+ hours in the train
| set, predictions get ~20-25% better.
|
| For a given amount of data, whether you want more or less
| data per person really depends on what you're trying to do.
| The thing we want is for it to be good at zero-shot, that
| is, for it to decode well on people who have zero hours in
| the train set. So for that, we want less data per person.
| If instead we wanted to make it do as well as possible on
| one individual, then we'd want way more data from that one
| person. (So, e.g., when we make it into a product at first,
| we'll probably finetune on each user for a while)
| richardfeynman wrote:
| Makes a ton of sense, thanks.
|
| I wonder if there will be medical applications for this
| tech, for example identifying people with brain or
| neurological disorders based on how different their
| "neural imaging" looks from normal.
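The subject-level split described above (no individual with any sessions in the train set is ever used for evals) can be sketched as follows. Function and field names are hypothetical; the post doesn't show Conduit's actual pipeline code.

```python
import random

def split_by_participant(sessions, val_fraction=0.2, seed=0):
    """Split sessions so each participant lands entirely in train or val.

    sessions: list of dicts with at least a 'participant_id' key.
    """
    ids = sorted({s["participant_id"] for s in sessions})
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_val = max(1, int(len(ids) * val_fraction))
    val_ids = set(ids[:n_val])
    train = [s for s in sessions if s["participant_id"] not in val_ids]
    val = [s for s in sessions if s["participant_id"] in val_ids]
    return train, val

# 50 toy sessions across 10 participants.
sessions = [{"participant_id": f"p{i % 10}", "hours": 1.5} for i in range(50)]
train, val = split_by_participant(sessions)

# Zero-shot property: no participant appears on both sides of the split.
train_ids = {s["participant_id"] for s in train}
val_ids = {s["participant_id"] for s in val}
assert train_ids.isdisjoint(val_ids)
```

Grouping by participant rather than by session is what makes the eval "zero-shot"; a plain random split over sessions would leak each eval participant's neural patterns into training and inflate the numbers.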
| wiwillia wrote:
| Really interested in how accuracy improves with the scale of the
| data set. Non-invasive thought-to-action would be a whole new
| interaction paradigm.
| devanshp wrote:
| Cool post! I'm somewhat curious whether the data quality scoring
| has actually translated into better data; do you have numbers on
| how much more of your data is useful for training vs in May?
| rio-popper wrote:
| So the real-time neural-quality checking was the most important
| thing here. Before we rewrote the backend, between 58% and 64%
| of participant hours were actually usable data. Now, it's
| between 90% and 95%.
|
| If you mean the text quality scoring system, then when we added
| that, it improved the amount of text we got per hour of neural
| data by 30-35%. (That includes the fact that we filter
| which participants we have return based on their text quality
| scores)
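A real-time neural-quality check of the kind described could, in principle, flag windows that are implausibly flat (dead electrode) or implausibly large (motion artifact). The thresholds, units, and function names below are assumptions for illustration, not Conduit's actual system.

```python
import numpy as np

def window_is_usable(window, flat_uv=0.5, artifact_uv=150.0):
    """window: 1-D array of EEG samples for one channel, in microvolts."""
    peak_to_peak = window.max() - window.min()
    if peak_to_peak < flat_uv:              # flat line: electrode likely off
        return False
    if np.abs(window).max() > artifact_uv:  # huge swing: motion artifact
        return False
    return True

rng = np.random.default_rng(0)
good = rng.normal(0.0, 20.0, size=256)    # plausible EEG-like noise
dead = np.zeros(256)                      # disconnected channel
jolt = np.concatenate([good, [500.0]])    # motion spike

print(window_is_usable(good), window_is_usable(dead), window_is_usable(jolt))
```

Running a check like this per window during a session (rather than in post-processing) is what lets operators re-seat an electrode on the spot instead of discovering hours of unusable data later.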
| rajlego wrote:
| Did you consider trying to collect data in a much poorer country
| that still has high quality English? e.g. the Philippines
| rio-popper wrote:
| Yeah, we did consider this. For now, there's an advantage to
| having the data collection in the same building as the whole
| eng team, but once we hire a couple more engineers, I expect
| we'll just replicate the collection setup in other countries
| as well.
| estitesc wrote:
| Loved watching this unfold in our basement. : )
| dang wrote:
| [under-the-rug stub]
|
| [see https://news.ycombinator.com/item?id=45988611 for
| explanation]
| ClaireBookworm wrote:
| Yoo this is sick!! Sometimes it might actually just be a data
| game, so huge props to them for actually collecting all that
| high-quality data
| ninapanickssery wrote:
| This is very cool, thanks for writing about your setup in such
| detail! It's impressive that you can predict stuff from this
| noninvasive data. Are there similar existing datasets, or is
| this the first of its kind?
| cpeterson42 wrote:
| Wild world we live in
| titzer wrote:
| I lol'd at the hardware "patch" that kept the software from
| crashing--removing all but the alpha-numeric keys (!?). Holy cow,
| you had time to collect thousands of hours of neurotraces but
| couldn't sanitize the inputs to remove a stray [? That
| sounds...funky.
| NoraCodes wrote:
| Presumably it's more like an errant Ctrl-C.
| clemvonstengel wrote:
| Yup exactly this. Also Ctrl-W, alt tab, etc.
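Doing that sanitization in software rather than by pulling keycaps amounts to an allowlist over key events: drop anything with a modifier held and anything outside the typed-character set before it reaches the capture app. The event shape and names here are hypothetical, not the actual software's API.

```python
# Keys the capture app should ever record: lowercase letters, digits, space.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789 ")

def filter_keystrokes(events):
    """events: list of (key, has_modifier) pairs; returns typed text only."""
    out = []
    for key, has_modifier in events:
        if has_modifier:            # swallow Ctrl-C, Ctrl-W, Alt-Tab, etc.
            continue
        if key.lower() in ALLOWED:  # keep alphanumerics and space
            out.append(key)
    return "".join(out)

events = [("h", False), ("i", False), ("c", True),   # Ctrl-C dropped
          ("[", False), (" ", False), ("5", False)]  # stray [ dropped
print(filter_keystrokes(events))  # prints "hi 5"
```

In a real setup this filtering would live in the event handler of the capture UI; the hardware fix gets the same effect, just less reversibly.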
| in-silico wrote:
| It's interesting that the model generalizes to unseen
| participants. I was under the impression that everyone's brain
| patterns were different enough that the model would need to be
| retrained for new users.
|
| Though, I suppose if the model had LLM-like context where it kept
| track of brain data and speech/typing from earlier in the
| conversation then it could perform in-context learning to adapt
| to the user.
| clemvonstengel wrote:
| Basically correct intuition: the model does much better when we
| give it, e.g., 30 secs of neural data in the leadup instead of
| e.g. 5 secs. My sense is that it's learning in context:
| people's neural patterns are quite different, but there's a
| higher-level generator that lets the model learn in context (or
| probably multiple higher-level patterns, each of which the
| model can learn from in context).
|
| We only got any generalization to new users after we had >500
| individuals in the dataset, fwiw. There are some interesting
| MRI studies finding a similar thing: once you have enough
| individuals in the dataset, you start seeing generalization.
| asgraham wrote:
| Really cool dataset! Love seeing people actually doing the hard
| work of generating data rather than just trying to analyze what
| exists (I say this as someone who's gone out of his way to avoid
| data collection).
|
| Have you played at all with thought-to-voice? Intuitively I'd
| think EEG readout would be more reliable for spoken rather than
| typed words, especially if you're not controlling for keyboard
| fluency.
| clemvonstengel wrote:
| Yeah we do both text and voice (roughly 70% of data collection
| is typed, 30% spoken). Partly this is to make sure the model is
| learning to decode semantic intent (rather than just planned
| motor movements). Right now, it's doing better on the typed
| part, but I expect that's just because we have more data of
| that kind.
|
| It does generalize between typed and spoken, i.e. it does much
| better on spoken decoding if we've also trained on the typing
| data, which is what we were hoping to see.
| asgraham wrote:
| Interesting! I imagine speech-related motor artifacts don't
| help matters either, even if noise starts mattering less at
| scale.
| n7ck wrote:
| Yeah -- we have the participants use chinrests as well,
| which reduces head motion artifacts for typing but less so
| for speaking (because they have to move their heads for
| that, of course). So a lot of the data is with them keeping
| their heads quite still, although the model is becoming
| much more robust to this over time.
| whatshisface wrote:
| What's the plan for after this mind reading helmet works
| reliably?
| brovonov wrote:
| Sell it to an ad agency.
| clemvonstengel wrote:
| We build headsets that let you control your computer directly
| with your mind. Initially I expect we can get increased
| bandwidth / efficiency on common tasks (including coding) - but
| I think it gets really exciting when people start designing new
| software / interaction paradigms with this in mind.
| whatshisface wrote:
| If you want it to be remembered as a revolutionary computer
| interface, you will have to make sure it is not used in
| interrogations.
| xg15 wrote:
| It's an enormously cool project (and also feels like the next
| logical thing to do after all the existing modalities)
|
| But it feels eery to read a detailed story how they built and
| improved their setup and what obstacles they encountered,
| complete with photos - without any mention _who_ is doing the
| things we are reading about. There is no mention of the staff or
| even the founders on the whole website.
|
| I had a hard time judging how large this project even is. The
| homebuilt booths and trial-and-error workflow sound like "three
| people garage startup", but the bookings schedule suggests a
| larger team.
|
| (At least there is an author line on that blog post. I had to
| google the names to get some background on this company.)
|
| You should consider an "about us" page :)
| rio-popper wrote:
| Good point. We're a team of 7 right now (3 engineering, 4
| running data collection across shifts). We've been spending
| ~all our time on the data and model side, so the "About us"
| page lagged behind, but we'll add one this week. Appreciate the
| feedback!
| xg15 wrote:
| No question these are the more important things to spend time
| on. Good luck!
| accrual wrote:
| Very cool project! I had a couple ideas during the read:
|
| * A ceiling-based pulley system could help take the physical load
| off the users and may allow for increased sensor density. Some
| large/public VR setups do this.
|
| * I'm sure you considered it, but a double-conversion UPS might
| reduce the noise floor of your sensors and could potentially
| support multiple booths. Expensive though, and it's already
| mentioned that data quantity > quality at this stage. Maybe a
| future fine-tuning step could leverage this.
|
| Cool write up and hope to see more in the future!
| moffkalast wrote:
| Your engineers were so preoccupied with whether or not they
| could, they didn't stop to think if they should.
___________________________________________________________________
(page generated 2025-12-08 23:00 UTC)