[HN Gopher] California moves to silence Stanford researchers who...
___________________________________________________________________
California moves to silence Stanford researchers who got data to
study education
Author : nradov
Score : 369 points
Date : 2023-07-28 15:19 UTC (7 hours ago)
(HTM) web link (edsource.org)
(TXT) w3m dump (edsource.org)
| troupe wrote:
| It doesn't seem entirely unreasonable that if a school system
| gives a researcher access to data that isn't shared with the
| public, the researcher agrees not to use that information to sue
| the school system. Such agreements would allow the school system
| to be more free to share information.
|
| The issue here seems to be that the school system is saying that
| the researchers aren't allowed to be a witness in any lawsuit
| against the school system regardless of whether it has to do with
| the data that was shared with the researchers.
|
| I think a bigger issue is whether the school system should be
| allowed to keep any information private in the first place. If
| the information can safely be shared with a particular researcher
| then it seems like there is minimal benefit to society in letting
| the school system pick and choose who gets access and who
| doesn't.
| BenGuz wrote:
| What data do you want to see? Most of it exists publicly but is
| very messy. You can get basic financial information here,[1]
| but data on student outcomes and school climate is very siloed
| - if there's a specific school/state you're interested in, I
| could help you find information.
|
| Even if you're a researcher, good quality data rarely exists.
| In NYC, which collects more data than any other school
| district, you're mostly relying on a (publicly available) 100
| question survey sent to every student. The survey author must
| have never talked to a child because the questions are worded
| like a clinical psychology paper. At low income schools the
| survey has a 20-30% response rate.[2]
|
| [1]
| https://nces.ed.gov/ccd/schoolsearch/index.asp?ID=2512750020...
| [2] https://tools.nycenet.edu/snapshot/2022/
| jrochkind1 wrote:
| While not _as_ "entirely unreasonable" as what the state is
| actually doing -- and I think we should be clear, that, as you
| say, the state is doing way worse and trying to prevent
| researchers from testifying on any matters at all...
|
| I'm not totally sure it's actually reasonable for a government
| to withhold data from researchers because they think it might
| be used against them in a lawsuit either. Is that a valid
| reason for a government institution to withhold data?
|
| Perhaps a court case will end up establishing that the broader
| thing is in fact unreasonable under the First Amendment too,
| perhaps this is a good "test case" being even so much more
| egregious, you always want an especially egregious case.
| dragonwriter wrote:
| > The issue here seems to be that the school system is saying
| that the researchers aren't allowed to be a witness in any
| lawsuit against the school system regardless of whether it has
| to do with the data that was shared with the researchers
|
| I think the issue isn't being a witness in the general sense,
| but an expert witness, which is either a paid gig or one for
| which payment is waived because of other alignment of interests.
| Being an expert witness against someone you are in any kind of
| working relationship with is a clear and obvious conflict of
| interest.
|
| > If the information can safely be shared with a particular
| researcher then it seems like there is minimal benefit to
| society in letting the school system pick and choose who gets
| access and who doesn't.
|
| So HIPAA-protected data that meets the standards for research
| sharing should instead be made public? (And if you say, "well,
| it's different, this is the government"--government holds lots
| of data protected by HIPAA.)
| lowbloodsugar wrote:
| >expert witness against someone
|
| The state is not a "someone". The state is in an extremely
| privileged position legally, and as such is bound by the
| First Amendment which you and I are not.
| dragonwriter wrote:
| Sure, the State is bound by the First Amendment, and there
| is a fair debate as to whether the clear conflict of
| interest involved in being an expert witness against the
| state must be tolerated alongside research data sharing for
| that reason, either in general (unlikely, IMO) or at least
| in the specific case where there is no nexus with the
| shared data (more likely).
|
| But it _is_ a clear conflict of interest.
| lowbloodsugar wrote:
| The problem is your claim of conflict. Measurement of the
| government by the people can never be considered a
| conflict. If the data shows that the government, the CDE,
| failed to improve outcomes, that is just data. It is the
| opposite of a conflict. The CDE is required to improve
| outcomes: suppressing information that it failed to do so
| is antithetical to that outcome. The CDE needs this
| information to do its job, regardless of claims by PHBs
| to the contrary.
| heliodor wrote:
| What information does a school system possibly need to keep
| away from the public? This smells.
| [deleted]
| amalcon wrote:
| Most obviously, personal information about students.
| troupe wrote:
| There might be some information that could be combined with
| other data in ways that would violate the privacy of students
| and their families. Obviously discipline records with student
| names shouldn't be public, but what about records without
| names where the student's name could be found by linking it
| with other data.
|
| AOL released a bunch of search queries in 2006 with the idea
| that they were anonymous, but it turned out you could get
| quite a bit of personal information from them by linking
| searches together.
| nend wrote:
| If you're curious, read up on FERPA.
| nonameiguess wrote:
| Presumably all data about student educational outcomes, which
| are protected by FERPA. The school system doesn't have the
| option to just ignore the law and make this information
| public.
| bluGill wrote:
| If you discover evidence of wrongdoing, then you are ethically
| obliged to act on it regardless of any other contract. We have
| whistle blower laws for this reason.
|
| Even if the wrongdoing is not a criminal matter, if you
| discover a reason that someone can be sued, then you have an
| obligation to inform those who could sue and act as a witness
| for them in court. The only exception to this is if you are the
| lawyer for the party whose data you discover - and then you have
| an obligation to inform them they can be sued so here is how to
| fix the problem in good faith (good faith meaning if it is
| discovered you as a lawyer will argue that when the problem was
| discovered they fixed it, and thus court should dismiss the
| problem as an honest mistake that was corrected - the courts
| should in turn if not dismiss the case at least award minimal
| damages)
|
| The above needs to take precedence over all contracts.
| prepend wrote:
| It seems unreasonable to me. Public institutions have a duty to
| the public that should be above any "self preservation" to
| protect itself.
|
| I expect that they actually owe more to people actively suing
| them to prevent any shenanigans.
|
| I think this is different from private institutions who have
| no, or very different, duty to private citizens.
| polygamous_bat wrote:
| > Public institutions have a duty to the public that should
| be above any "self preservation" to protect itself.
|
| Do you hold the same beliefs for publicly traded companies?
| Or do you just have unreasonable bars for government
| institutions only?
| prepend wrote:
| The state exists for the people. They serve me (and you).
|
| Publicly traded corporations are very different.
|
| My tax dollars pay for the government to operate and
| collect data. Not so much for publicly traded companies.
|
| That being said, for publicly traded corporations there are
| regulations on what data they must release, but I think
| it's mainly about financial performance.
|
| So a private education system would not need to release
| anonymized data on its students. But a public education
| system has a legal duty.
| fluoridation wrote:
| How is that relevant? Because both have the word "public"
| in their descriptions?
|
| (Incidentally, I think a lot of ills in the modern world
| exist because of companies that exist only to increase
| their value in stock exchanges, rather than to be useful.)
| indymike wrote:
| > It doesn't seem entirely unreasonable that if a school system
| gives a researcher access to data that isn't shared with the
| public, the researcher agrees not to use that information to
| sue the school system. Such agreements would allow the school
| system to be more free to share information.
|
| That is not what is going on here. The researcher is being asked
| to testify _against_ the school system by someone who is suing
| them.
| jjk166 wrote:
| > The issue here seems to be that the school system is saying
| that the researchers aren't allowed to be a witness in any
| lawsuit against the school system regardless of whether it has
| to do with the data that was shared with the researchers.
|
| While that does seem overbroad, if the restriction were only on
| cases related to the data shared by the researchers, then for
| many cases there would need to be a demonstration that it did
| or didn't relate to the data, and there isn't really a way to
| do that without disclosing the data.
| nickff wrote:
| Wouldn't a more reasonable position be a prohibition on
| researchers acting as _paid_ expert witnesses in cases against
| the school system? I can imagine that might disincentivize
| 'gold-digging' behavior by researchers.
|
| The complete ban on researchers engaging in any litigation
| seems over-broad, and designed to keep potential litigants from
| having access to anyone 'in-the-know'.
| darth_avocado wrote:
| If the institution is public, data should be public as long as
| individual PII is removed. No exceptions. And FOIA requests
| should be able to make this data available to anyone filing for
| an access within a reasonable amount of time. Period.
| tmpz22 wrote:
| > as long as individual PII is removed
|
| The devil is in the details. These records were likely not
| designed to be shared and I'd assume the entire system
| contains vulnerabilities that could create leakage. Leakage
| that could be used to harm individuals in a variety of ways -
| from discriminating future prospects to harassment and much
| much more.
|
| I agree in principle to what you're saying but we need to be
| truthful about what these current systems are capable of.
| maximinus_thrax wrote:
| > data should be public as long as individual PII is removed
|
| This is one of those things when ideology doesn't match the
| real world. If the amount of data is large enough and with
| enough parameters, removing PII doesn't do anything to
| protect privacy.
|
| What about medical records? What about protected classes?
| What about data about vulnerable people or victims?
|
| Student data is protected by another layer of regulation and
| for good reason. Also, the judiciary is a 'public'
| institution in general. Should we not seal records for
| minors? 'No exceptions' - yeah, right.
| prepend wrote:
| This is true but it's quite possible to correct for unique
| or infrequently occurring combinations so privacy is still
| preserved and data are made available.
|
| It's not that hard to design data release to compensate for
| privacy protections and statistically test for a specific
| level of risk. There's a whole body of work on statistical
| disclosure control and there's plenty of open source or
| cheap enough privacy enhancing technology available.
|
| I'm not familiar with CA, but I expect they have someone on
| staff who can produce a "safe" dataset that preserves
| privacy and still allows for this question to be researched
| by low level geography, demographic, and socioeconomic
| factors.
| nradov wrote:
| We're getting a little off topic here, but the Federal
| government has published specific guidance on de-
| identification of medical records. You can construct some
| artificial scenarios where re-identification might be
| theoretically possible through record linkage with other
| data sources but in practice it's unlikely. In principle a
| similar approach could be used for student data, although
| I'm not familiar with the legal issues.
|
| https://www.hhs.gov/hipaa/for-
| professionals/privacy/special-...
|
| But all of this is orthogonal to the core issue of whether
| a state government should be allowed to prevent researchers
| from participating in lawsuits. There is no student privacy
| issue involved there. Witnesses in a civil suit still
| aren't allowed to violate student privacy laws regardless
| of the data they have access to, so it makes no sense to
| conflate those issues.
| dragonwriter wrote:
| > We're getting a little off topic here, but the Federal
| government has published specific guidance on de-
| identification of medical records.
|
| But releases (even without patient consent, with an IRB
| waiver) of non-deidentified PHI data for research is
| allowed, and this is specifically because
| deidentification necessarily destroys elements that would
| often be necessary in research.
|
| > You can construct some artificial scenarios where re-
| identification might be theoretically possible through
| record linkage with other data sources but in practice
| it's unlikely.
|
| It is explicitly part of the HIPAA safe harbor standard
| that, in addition to removing the required identifiers,
| you _cannot_ come up with such a scenario, and if you
| can, the data is not deidentified. (The last criterion of
| the standard is "The covered entity does not have actual
| knowledge that the information could be used alone or in
| combination with other information to identify an
| individual who is a subject of the information".)
| nradov wrote:
| What does any of that have to do with the legal issue of
| whether a state should be able to prohibit participation
| in certain lawsuits as a condition of gaining access to
| research data? Neither party has raised re-identification
| as a concern, nor have there been any allegations of
| privacy law violations.
| dragonwriter wrote:
| > What does any of that have to do with the legal issue
| of whether a state should be able to prohibit
| participation in certain lawsuits as a condition of
| gaining access to research data?
|
| As you yourself noted upthread, you had already taken
| this subthread afield from that topic.
| ke88y wrote:
| I also completely lost the plot here...
|
| AFAICT, the thread went something like this:
|
| The top-level concern is something like this: professors
| use their trusted relationship to schools in order to
| make bank on expert witness fees, which feels a bit
| corrupt and calls into question the researcher's motives.
|
| A rebuttal to this concern is that we can side-step that
| issue entirely because these data sets should be public
| anyways (anonymized, of course!). This obviates the above
| concern, since the researchers won't need to compromise
| themselves in order to get exclusive access to data that
| allows them to be expert witnesses and rake in $$$$.
|
| But the problem with that proposal is re-identification:
| if we can't make the data anonymous, then we all agree
| that it shouldn't be released (implicit in the
| "anonymized, of course!" caveat to "just release all the
| data" proposal).
|
| Then you pointed out that even for more important data
| like healthcare data, FDA apparently has ways of allowing
| release of data that take into account the risk of
| re-identification (I didn't know this; thanks for
| sharing!)
|
| Then dragonwriter and you got deep into the weeds on
| HIPAA stuff.
|
| TBH I have no idea which of you is most correct here. But
| anyways, there are two ways for this conversation to go:
|
| 1. You are correct, good enough anonymization is
| possible: Stanford researchers should not be silenced; it
| is problematic that they have access to data other people
| cannot access, but the correct solution is to negate the
| originally problematic distinction between those
| researchers and the general public by making data public.
| Then there is no reason for the researchers to agree to
| these contract clauses, because they will have access to
| the data.
|
| 2. dragonwriter is correct, good enough anonymization is
| not possible: We can go back up to the top-level concern
| and observe that "just release all the data with
| anonymization" isn't a feasible solution to this problem.
| Or maybe there isn't actually a problem here at all. IDK.
| But in any case, "obviate the problem in the top-level
| post by releasing anonymized data" isn't a workable
| solution.
|
| Again, not following closely enough to have an opinion,
| but that's where we are now.
|
| I think a good compromise position is that we should have
| a law stating that K12 data should be available to
| certain education researchers -- subject to IRB approval
| and so on -- without any other strings attached.
| Including "don't sue me" clauses in releases of public
| data sets does feel like an inappropriate abuse of
| student privacy concerns.
| nradov wrote:
| The researchers don't have a trusted relationship with
| schools. They have a contractual relationship with the
| state government. The fundamental issues underlying the
| lawsuit are First Amendment freedom of expression and
| contract law; expert witness fees and researchers'
| motives are irrelevant.
|
| Whether student data de-identification is good enough or
| not is a total red herring. No one has accused the
| researchers in this case of violating privacy rules. The
| comments here about such privacy issues are largely
| hypothetical and tangential.
|
| If you think that California needs a new law expanding
| research access to educational data then feel free to
| suggest that to your state legislators, or sponsor a
| ballot initiative.
| ke88y wrote:
| LOL what? I'm so confused.
|
| It's not a red herring. It's a side conversation about a
| different but related topic.
|
| Someone proposed just releasing all the data.
|
| Someone else replied with why that wouldn't work.
|
| Ie, a conversation happened and the topic of discussion
| shifted.
|
| FWIW I agree with you on the object level question. No
| idea why you're being so abrasive, especially when you're
| the one who initiated/continued the conversational thread
| about deanonymization and even prefaced with "We're
| getting a little off topic here".
|
| Presumably at that point you understood that the topic of
| conversation had shifted, and people's
| agreement/disagreement didn't necessarily have anything
| to do with the original topic... since you literally said
| so and no one disagreed... so your reaction here is
| pretty odd and off-putting.
| gopher_space wrote:
| Neutral bystander here, I didn't get any of that from op.
| godelski wrote:
| > If the amount of data is large enough and with enough
| parameters, removing PII doesn't do anything to protect
| privacy.
|
| Honestly it doesn't have to be that large. We see this all
| the time with data that websites or apps collect. Sure, you
| remove John Smith's name, but you still have his GPS
| coordinates. For the school, you remove Professor Smith's
| name, but you have a professor who teaches CS 123 and has 4
| graduate students. You bet you can guess who that is.
|
| I really do support open data, especially about public
| institutions, but at the same time we are in an era where
| this information is quite powerful. Seems to make a case
| for something like homomorphic encryption or something,
| but will that even stop these collisions?
| CaptainNegative wrote:
| The appropriate notion here seems to be Differential
| Privacy, which is a mathematical definition informally
| saying "a scrambling of the dataset that is information
| theoretically indistinguishable from that where one
| arbitrary person is added or removed". It's a
| surprisingly deep topic, with entire (very good!)
| textbooks dedicated to it.
|
| PDF (entirely legal):
| https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf
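|
| A minimal sketch of the idea, assuming the query is a simple
| count. It uses the Laplace mechanism, one standard way to get
| epsilon-differential privacy; the function names and the toy
| records are made up for illustration, and floating-point noise
| like this is not production-grade.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is enough.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records: (student_id, was_suspended)
records = [(1, True), (2, False), (3, True), (4, True), (5, False)]

rng = random.Random(0)
noisy = private_count(records, lambda r: r[1], epsilon=0.5, rng=rng)
print(f"true count = 3, released count = {noisy:.2f}")
```

| A smaller epsilon means more noise and stronger privacy; a
| larger epsilon means a more accurate but more revealing answer.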
| karaterobot wrote:
| In theory this makes sense, and I agree with the spirit. In
| practice, you can often re-identify individuals even in an
| anonymized dataset. For example, if you're dealing with a
| very rare disease, or a small minority group, you can usually
| figure out who an anonymized row of data is referring to if you
| really try. So, it's not so simple, and the responsible thing
| to do is not have a blanket policy that takes human judgment
| and accountability out of the loop.
| ke88y wrote:
| I did some consulting on this WAAAAY back in the day.
|
| In the case of education, good enough anonymization often
| isn't really possible. A lot of information about school
| students is public -- yearbooks are gold mines, as are
| results from extra-curricular activities such as sports,
| class notes about whether/where they went to college or first
| jobs after college, etc. Still more can be purchased. And yet
| more can be inferred from public data (eg home address, rough
| estimates at parental income, etc.). This was back in the
| day. I'm sure now it's much worse.
|
| Most of the questions you want to ask about education are
| about treatments and outcomes. If these treatments (eg extra-
| curriculars) and outcomes (eg attended college, graduated HS,
| etc) are public then you can often figure out which student
| corresponds to each supposedly anonymous data-point.
|
| Maybe not perfectly. But way more than you would think. It's
| like the statistics version of those little logic puzzles
| from grade school -- "Four people have red hair. Five are
| girls and two are boys. Billy and Sally are chewing gum. Boy
| gum chewers have brown hair. No one over 5 has red hair.
| Sally is 4. etc. etc. etc. Match each person's name with
| their hair color.". You sort of figure out a small set of
| data points, then look at the results of the paper and
| reverse engineer some statistical calculations, and then a
| surprising amount of the others start falling into place.
|
| We didn't have a name for it when I did this work, but the
| basic point was "if you publish the dataset everyone will
| know little Johnny Table's test scores and GPA". Today this
| is called a reidentification attack.
|
| (I don't really know enough about the topic to have an
| opinion on the article per se, but "just publish everything"
| is definitely not a workable solution :))
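|
| The linkage step described above can be sketched roughly as
| follows. Everything here is hypothetical: the field names, the
| toy "released" rows and the toy "yearbook" rows are invented,
| and a real attack would be much messier.

```python
from collections import defaultdict

# Attributes that appear in both the released data and public sources.
QUASI_IDENTIFIERS = ("grad_year", "activity", "attended_college")


def quasi_key(row):
    return tuple(row[q] for q in QUASI_IDENTIFIERS)


def link_records(released, public):
    """Match released rows to named public rows on quasi-identifiers.

    A match is reported only when the combination is unique on both
    sides; ambiguous combinations are left alone.
    """
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[quasi_key(person)].append(person["name"])

    rows_by_key = defaultdict(list)
    for row in released:
        rows_by_key[quasi_key(row)].append(row)

    reidentified = {}
    for key, rows in rows_by_key.items():
        names = names_by_key.get(key, [])
        if len(rows) == 1 and len(names) == 1:
            reidentified[names[0]] = rows[0]["gpa"]
    return reidentified


# "Anonymized" release: no names, but GPA plus quasi-identifiers.
released = [
    {"grad_year": 2021, "activity": "swimming", "attended_college": True, "gpa": 3.9},
    {"grad_year": 2021, "activity": "soccer", "attended_college": True, "gpa": 3.1},
    {"grad_year": 2021, "activity": "soccer", "attended_college": True, "gpa": 2.7},
    {"grad_year": 2021, "activity": "chess", "attended_college": False, "gpa": 3.4},
]

# Public information, e.g. from yearbooks and class notes.
public = [
    {"name": "Student A", "grad_year": 2021, "activity": "swimming", "attended_college": True},
    {"name": "Student B", "grad_year": 2021, "activity": "soccer", "attended_college": True},
    {"name": "Student C", "grad_year": 2021, "activity": "soccer", "attended_college": True},
    {"name": "Student D", "grad_year": 2021, "activity": "chess", "attended_college": False},
]

print(link_records(released, public))
```

| In this toy data the lone swimmer and the lone chess player are
| re-identified, while the two soccer players stay ambiguous.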
| mrangle wrote:
| Provide proof by de-anonymizing one data set from research
| within the past five years.
| ke88y wrote:
| LOL. I'm not going to deanonymize datasets for an
| internet stranger. Certainly not for free, but also not
| at _any_ hourly rate unless I know that the organization
| employing me either has the original non-anonymized
| dataset or at least has very strict internal controls
| about how deanonymized data will be handled.
|
| (I also haven't done this work in a LONG time, and
| there's now a whole lot of academic work on the topic
| that didn't exist back then, so there are probably much
| better consultants for legitimate organizations looking
| to hire for this work.)
|
| If you want proof, you can use google to find LOTS of
| papers along these lines analyzing real datasets. I think
| https://arxiv.org/pdf/cs/0610105.pdf is a fairly typical
| example.
| tmpX7dMeXU wrote:
| I work with large private K-12 datasets regularly. Even
| as a passer-by, we had one for a big government-issued
| dataset in Australia. It might've been census data? It
| might've been just over five years ago? I actually don't
| care. The principles of effective data anonymisation,
| esp. in education, are understood by people that actually
| work in this area. It seems odd to be so demanding.
| malfist wrote:
| Another thing to consider about deanonymization:
|
| HIPAA says that 4 or more digits of a zipcode is PII. The
| people who protect your healthcare think most of a zipcode
| is too cardinal to reveal.
|
| How many schools serve more than one zipcode? More than a
| few?
| tensor wrote:
| I very very strongly disagree. It's important that
| researchers get access to diverse data and industry
| collaboration is often crucial for this. If companies are
| required to make all their data public they will be far less
| willing to collaborate with research. It's already hard
| enough as is to convince corporations that it's worth their
| time.
| jeroenhd wrote:
| Removing PII would probably also involve removing individual
| grades and other information that's necessary for any
| research to be effective. Thanks to predatory data collection
| practices on the internet, we know how little information you
| actually need to deanonymize someone. The problem only
| becomes worse when we're talking about kids.
|
| That said, research that cannot be reproduced is useless.
| There's a balance to be struck here, and it's somewhere
| between "make all data public" and "lock the data in a
| vault".
| dragonwriter wrote:
| > If the institution is public, data should be public as long
| as individual PII is removed.
|
| PII is much broader than most people understand because
| reidentification of what amateurs would see as deidentified
| data is easy (often trivial), and, as a consequence, to be
| useful for research data is often _not_ fully deidentified.
|
| EDIT: As an example, the HIPAA safe harbor deidentification
| standard requires removing 18 kinds of identifiers,
| including, as one of them:
|
| _All geographic subdivisions smaller than a state, including
| street address, city, county, precinct, ZIP code, and their
| equivalent geocodes, except for the initial three digits of
| the ZIP code if, according to the current publicly available
| data from the Bureau of the Census: (1) The geographic unit
| formed by combining all ZIP codes with the same three initial
| digits contains more than 20,000 people; and (2) The initial
| three digits of a ZIP code for all such geographic units
| containing 20,000 or fewer people is changed to 000_
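|
| As a rough illustration of the ZIP code rule quoted above, here
| is a small sketch. The population table is invented (the real
| figures would come from Census data), and the function name is
| made up; this covers only the ZIP part of one of the 18
| identifier categories.

```python
from typing import Mapping

POPULATION_THRESHOLD = 20_000


def generalize_zip(zip_code: str, zip3_population: Mapping[str, int]) -> str:
    """Reduce a ZIP code to the form described in the quoted rule.

    Keep the first three digits only if the area formed by all ZIP
    codes sharing them has more than 20,000 people; otherwise
    return "000". Unknown prefixes are treated as too small.
    """
    prefix = zip_code[:3]
    if zip3_population.get(prefix, 0) > POPULATION_THRESHOLD:
        return prefix
    return "000"


# Hypothetical populations per three-digit prefix (not real data).
zip3_population = {"123": 450_000, "999": 12_000}

for zip_code in ("12345", "99901", "55555"):
    print(zip_code, "->", generalize_zip(zip_code, zip3_population))
```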
| darth_avocado wrote:
| The intent of the comment was not to say the process is
| trivial or that removing PII is sufficient. However, it is
| not as impossible as people are making it out to be. I've
| worked on datasets at social media companies where
| literally thousands of columns were considered PII but
| realistically removing/scrambling just a subset of columns
| would make it impossible to identify individuals.
| dragonwriter wrote:
| > I've worked on datasets at social media companies where
| literally thousands of columns were considered PII but
| realistically removing/scrambling just a subset of
| columns would make it impossible to identify individuals.
|
| Maybe, though I doubt it was that easy against any but
| the most trivial reidentification efforts, but since most
| privately held PII isn't regulated (in the US at least),
| there's little consequence for a social media company
| getting it wrong other than PR.
| prepend wrote:
| HIPAA also allows for "expert determination" [0] for
| deidentification that differs from safe harbor and can
| allow for all sorts of things since there's no definition
| of what an expert is.
|
| And reidentification risk can be as high as even 1% and
| still be acceptable for HIPAA. In a dataset of a million
| people that's 10,000 people identified, and it would still be
| "acceptable."
|
| But HIPAA doesn't apply to these CA data, it's just the
| clearest example of deidentification regulations I know of.
|
| But it's totally possible to deidentify data suitable for
| release to these researchers. It's just what CA considers
| deidentified and if it's still useful enough to these
| researchers. For the topic they are researching it should
| be pretty straightforward to remove PII enough to protect
| individuals and only remove some really unique
| characteristics (ie, only a single 20 year old of a
| particular race and ethnicity).
|
| But I'm guessing age groups by race and gender and
| socioeconomic status are possible to preserve without tying back
| to an individual. I'd go so far as to say it would be
| non-trivial, but pretty easy, for CA to produce this for the
| researchers, if not for the general public.
|
| [0] https://www.hhs.gov/hipaa/for-
| professionals/privacy/special-...
| Zetice wrote:
| To add to this, PII isn't always even clear. Different
| jurisdictions identify PII differently, there isn't One
| Master Definition that you pass a unit of data through,
| upon which an authoritative "THIS IS PII" or "THIS ISNT
| PII" is returned.
| orzig wrote:
| +1
|
| We had a multi-month project to get a subset of our data
| considered 'clean', and it required a consultant, a stats
| PhD and many dev hours. It was healthcare, so on the high
| end of paranoia (justifiably) but nowhere is it as simple
| as dropping the "name" column.
| kayodelycaon wrote:
| Not when there is an existing law limiting it. FERPA
| specifically protects student records.
| galangalalgol wrote:
| Even anonymous records? That would seem to preclude
| studying the effectiveness of the education system. And if
| they weren't anonymous, what possible conclusions could you
| draw compared to anonymous records that would warrant that
| access?
| rovr138 wrote:
| > as long as individual PII is removed
|
| https://www2.ed.gov/policy/gen/guid/fpco/ferpa/index.html,
| "The Family Educational Rights and Privacy Act (FERPA) (20
| U.S.C. § 1232g; 34 CFR Part 99) is a Federal law that
| protects the privacy of student education records"
|
| Now the question is if aggregated data with no PII is
| protected.
| jasonlotito wrote:
| Data aggregation does not guarantee privacy.
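|
| One way to see why is a differencing attack: two aggregate
| releases that differ by one person can reveal that person's
| value. A toy sketch, with invented names and scores:

```python
def differencing_attack(mean_all, n_all, mean_without, n_without):
    """Recover the total of the values present in only one release."""
    return mean_all * n_all - mean_without * n_without


# Hypothetical test scores.
scores = {"Student A": 71, "Student B": 88, "Student C": 93, "Student D": 64}

# Release 1: class average. Release 2: average excluding one student
# (say, a report published before Student D transferred in).
everyone = list(scores.values())
without_d = [score for name, score in scores.items() if name != "Student D"]

mean_all = sum(everyone) / len(everyone)
mean_without = sum(without_d) / len(without_d)

recovered = differencing_attack(
    mean_all, len(everyone), mean_without, len(without_d)
)
print(f"recovered score for Student D: {recovered:.0f}")
```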
| prepend wrote:
| Of course not, but there are tests that can be applied to
| determine if privacy is protected.
|
| It's not possible to just aggregate and be done. But it
| is possible to set some privacy threshold and then ensure
| that all records conform to that acceptable risk level.
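|
| A minimal sketch of one such threshold rule, assuming the
| release is a table of group means: any cell with fewer than a
| chosen number of students is suppressed. The field names, data
| and threshold are hypothetical, and a minimum cell size on its
| own does not rule out every attack.

```python
from collections import defaultdict


def aggregate_with_suppression(rows, group_keys, value_key, min_cell_size):
    """Compute per-group counts and means, suppressing small cells.

    Groups with fewer than min_cell_size rows map to None instead
    of a statistic.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in group_keys)].append(row[value_key])

    table = {}
    for key, values in groups.items():
        if len(values) >= min_cell_size:
            table[key] = {"n": len(values), "mean": sum(values) / len(values)}
        else:
            table[key] = None  # suppressed: too few students
    return table


# Hypothetical student rows.
rows = [
    {"school": "North", "grade": 9, "score": 70},
    {"school": "North", "grade": 9, "score": 80},
    {"school": "North", "grade": 9, "score": 90},
    {"school": "South", "grade": 9, "score": 85},
]

table = aggregate_with_suppression(rows, ("school", "grade"), "score", min_cell_size=3)
for key, cell in table.items():
    print(key, cell)
```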
| jrochkind1 wrote:
| If FERPA makes it illegal to share the data with
| researchers, then certainly it shouldn't have been shared.
|
| If FERPA allows sharing the data with researchers, is it
| right/proper/legal to share only on the condition it can't
| be used to harm the schools in court? Presumably that part
| is not in FERPA.
|
| (And to be clear, California here didn't just say they
| couldn't use the data in court, they said the researchers
| could not testify in any court case at all against the
| state. But we're talking hypothetically)
| nineplay wrote:
| You'd have to remove so much PII as to make any examination
| worthless: "A student of age <redacted> and gender <redacted>
| at school <redacted> has a GPA of <redacted>". As little
| information as "a 16 year old black male at Main Street High
| School" can be narrowed down to a handful of possible
| candidates at a lot of CA schools.
| prepend wrote:
| I disagree as it depends on how many 16 year old black
| males there are in that high school. It's pretty simple to
| apply k-anonymity to control for an acceptable risk level.
| And add in generalization of age into groups and many
| questions can be answered.
|
| I think you could definitely answer race x gender x grade but
| it will be harder when you factor in more unique
| characteristics like household income or vaccination
| status, etc.
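|
| A rough sketch of that approach, assuming a simple
| suppress-small-groups form of k-anonymity plus generalization
| of age into bands. The records, field names and band width are
| invented for illustration.

```python
from collections import Counter


def with_age_band(row, width=5):
    """Replace the exact age with a band such as '15-19'."""
    low = (row["age"] // width) * width
    generalized = {k: v for k, v in row.items() if k != "age"}
    generalized["age_band"] = f"{low}-{low + width - 1}"
    return generalized


def enforce_k_anonymity(rows, quasi_identifiers, k):
    """Keep only rows whose quasi-identifier combination occurs >= k times."""
    def key(row):
        return tuple(row[q] for q in quasi_identifiers)

    counts = Counter(key(row) for row in rows)
    return [row for row in rows if counts[key(row)] >= k]


# Hypothetical student records.
students = [
    {"age": 15, "gender": "M", "gpa": 3.2},
    {"age": 16, "gender": "M", "gpa": 2.9},
    {"age": 17, "gender": "M", "gpa": 3.6},
    {"age": 16, "gender": "F", "gpa": 3.8},
    {"age": 16, "gender": "F", "gpa": 3.1},
    {"age": 14, "gender": "F", "gpa": 3.5},
]

exact = enforce_k_anonymity(students, ("age", "gender"), k=2)
banded = enforce_k_anonymity(
    [with_age_band(s) for s in students], ("age_band", "gender"), k=2
)
print(f"rows kept with exact ages: {len(exact)}")
print(f"rows kept with age bands:  {len(banded)}")
```

| With exact ages most rows are unique and get dropped; with age
| bands more rows survive at the same k, at the cost of coarser
| data. Adding further attributes shrinks the groups again.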
| next_xibalba wrote:
| > The issue here seems to be that the school system is saying
| that the researchers aren't allowed to be a witness in any
| lawsuit against the school system
|
| Exactly. With this bit being particularly outrageous:
|
| > "Also, be aware," wrote Cindy Kazanis, the director of CDE's
| Analysis, Measurement, and Accountability Reporting Division,
| "that your actions have adversely impacted your working
| relationship with CDE, and your response to this letter is
| critically important to existing and future collaborations
| between us."
| asdajksah2123 wrote:
| I think you've described the salient issues here very well.
| chaps wrote:
| I think a bigger issue is whether the school system should be
| allowed to keep any information private in the first place.
|
| Are you genuinely suggesting that the public should have access
| to all attendance records, grades, test scores, etc etc of all
| students everywhere? That's the sort of information these
| researchers have.
| goatlover wrote:
| For researchers who follow the guidelines, yes it's necessary
| to do the kind of independent studies needed to assess
| educational development.
| nvy wrote:
| So long as the data cannot identify specific students (i.e.
| it's sufficiently anonymized) what's the issue?
| chaps wrote:
| .....do you genuinely think that the data can be
| sufficiently anonymized to protect the privacy of minors?
| mrangle wrote:
| Data has been sufficiently anonymized for decades.
| eimrine wrote:
| I believe that the data can be sufficiently faked to be
| anonymized.
| pjc50 wrote:
| That also renders it useless!
| IX-103 wrote:
| That's not strictly true. There's some recent work (as
| fascinating as it is incomprehensible) on generating
| datasets that share most aggregate properties with the
| actual dataset (measured through joint probability
| distributions), but do not reveal more than some epsilon
| of information about any individual contained in the
| original data set.
|
| These have the potential to revolutionize private
| computation and analysis, as they provide provable hard
| (theoretical) limits on the amount of information you can
| learn about individuals regardless of the type of
| analysis performed on the proxy dataset.
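|
| For a rough feel of the "epsilon" guarantee, here is a sketch of the
| basic Laplace mechanism on a histogram. This is the textbook building
| block, not the recent synthetic-data work mentioned above; the
| function names, bins, and sample values are hypothetical:

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_histogram(values, bins, epsilon, rng):
    """Release per-bin counts with Laplace noise of scale 1/epsilon.

    Adding or removing one record changes one bin by 1, so the
    sensitivity is 1. The bin list is fixed in advance so that the set
    of released bins does not itself depend on the data.
    """
    counts = Counter(values)
    scale = 1.0 / epsilon
    # Clamping at zero is post-processing and does not weaken the guarantee.
    return {b: max(0.0, counts[b] + laplace_noise(scale, rng)) for b in bins}

grades = ["A", "B", "B", "C", "C", "C"]
bins = ["A", "B", "C", "D"]

# Small epsilon: strong privacy, noisy counts.
private = noisy_histogram(grades, bins, epsilon=0.5, rng=random.Random(0))

# Huge epsilon: almost no privacy, counts are nearly exact.
loose = noisy_histogram(grades, bins, epsilon=1e6, rng=random.Random(0))
```

| Synthetic rows could then be sampled from the noisy counts instead of
| from the real records, which is the general shape of the approach.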
| mrguyorama wrote:
| Two datasets that share many aggregate statistics are not
| interchangeable.
| doctorpangloss wrote:
| I've actually gone through this process with the CDE and
| I was denied access. The privacy issue is a huge red
| herring, used to co-opt well meaning people like you.
|
| I requested data about STAR, the California standardized
| test used for Lowell's admissions. I wanted rows of the
| form (randomized student ID, STAR question ID, answered
| correctly), however they were recorded, and literally
| nothing more.
|
| They rejected the request because (1) they claimed such
| records didn't exist, which makes no sense because how
| exactly did they administer the test then; and (2)
| because standardized testing is carved out, in their
| opinion, from the related sunshine law.
|
| Why did I want these records? I wanted to show that
| scoring well on tests and using them to gate admissions
| doesn't mean what people think it means. Specifically,
| that if you administered the test Lowell used (STAR) by
| hardest question first, then terminated the test after
| the student gets N (close to 1) questions wrong, you
| would select nearly the same list of students. Only
| asking the vast majority of students only e.g. 1
| question, which they all get wrong, can't possibly
| measure how much they study, how comprehensive their
| knowledge is, etc. But these claims are routinely made in
| defense of the test and its purpose in selecting a class.
| This is coming from someone who wants test based
| admissions.
|
| So clearly political, right? I had to carefully word my
| request around all these conclusions. If you read the
| CDE's requirements, they really have specific political
| goals. You either align with them or you don't. And I
| tried to work around that, and I still failed. They
| just looked at the _absence_ of a political bent, and
| correctly concluded that it wasn't _evidence_ of
| absence.
|
| If you want to do good, politically impactful educational
| research: run your own school. That's what the CDE wants
| you to do. It's not about discovering how to improve
| public schools.
| Dylan16807 wrote:
| If going until the student gets one or two of the hardest
| questions wrong is highly predictive of whether they get
| selected, that implies that students near the selection
| threshold are getting very few questions wrong, right?
|
| > Only asking the vast majority of students only e.g. 1
| question, which they all get wrong, can't possibly
| measure how much they study, how comprehensive their
| knowledge is, etc.
|
| This seems like a strawman?
|
| Yes, a single question can't measure those things _to a
| high degree of certainty_.
|
| But if you have students that do poorly on all the hard
| questions, and students that do well on all the hard
| questions, then asking them a single hard question might
| be 80% predictive of what group they're in.
|
| Why is it bad for that percentage to be high?
|
| The reason the test has lots of questions is specifically
| to increase the predictive quality. Being able to loosely
| predict from a small subset of questions seems reasonable
| to me. It doesn't mean the test is failing to measure the
| student's knowledge.
| mrguyorama wrote:
| >I had to carefully word my request around all these
| conclusions.
|
| Aren't you clearly saying you already had a desired
| outcome and were just fishing for the data to confirm it?
| I mean, I wouldn't give you any data in that case either.
| It's a strong signal that you are motivated by something
| other than what the data shows.
| Dylan16807 wrote:
| "I wanted to show X" sounds like a normal hypothesis to
| me.
|
| What's wrong with how they want to use the data? Sort the
| questions, run the algorithm, see how well the scores
| match the real scores.
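|
| The experiment could be sketched like this, assuming per-student,
| per-question correctness rows were available. The data below is
| invented and deliberately idealized, and all function names are
| hypothetical:

```python
def question_order_hardest_first(responses):
    """Rank question IDs by share of students answering correctly, lowest first."""
    questions = sorted(next(iter(responses.values())))

    def pct_correct(q):
        return sum(r[q] for r in responses.values()) / len(responses)

    return sorted(questions, key=pct_correct)

def truncated_score(answers, order, max_wrong):
    """Count correct answers, stopping once max_wrong questions are missed."""
    score = wrong = 0
    for q in order:
        if answers[q]:
            score += 1
        else:
            wrong += 1
            if wrong >= max_wrong:
                break
    return score

def top_k(scores, k):
    """Pick the k highest scorers, breaking ties by student ID."""
    return set(sorted(scores, key=lambda s: (-scores[s], s))[:k])

# Invented responses: student ID -> {question ID: answered correctly}.
responses = {
    "a": {"q1": True, "q2": True, "q3": True, "q4": True},
    "b": {"q1": True, "q2": True, "q3": True, "q4": False},
    "c": {"q1": True, "q2": True, "q3": False, "q4": False},
    "d": {"q1": True, "q2": False, "q3": False, "q4": False},
    "e": {"q1": False, "q2": False, "q3": False, "q4": False},
}

order = question_order_hardest_first(responses)
full = {s: sum(r.values()) for s, r in responses.items()}
truncated = {s: truncated_score(r, order, max_wrong=2) for s, r in responses.items()}

# How many of the top 2 students are picked by both scoring methods.
overlap = len(top_k(full, 2) & top_k(truncated, 2))
```

| With real data the interesting number is how the overlap changes as
| max_wrong shrinks toward 1; this toy set only shows the mechanics.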
| chaps wrote:
| Not sure it's fair to say it's a red herring, or that I'm
| "co-opted" like you suggest. Transparency is kind of my
| main dig -- I get it. Like, I recently helped a small
| team of researchers with some FOIA requests to get access
| to similar information you were denied.
|
| But at the end of the day it's fundamentally important to
| understand at what point transparency and privacy
| intersect.
| doctorpangloss wrote:
| I appreciate you're sincere about these issues.
|
| > But at the end of the day it's fundamentally important
| to understand at what point transparency and privacy
| intersect.
|
| "At the end of the day," these conversations about
| privacy are like 15 minutes long at private schools.
| People still keep sending their kids to private schools.
| I just don't know how much it matters.
|
| They surely care about privacy in their internal research
| and metrics, but they don't employ a full time Privacist.
| They might employ someone who checks the right boxes for
| them and deals with FERPA shit. But because they are
| aligned with the parents in delivering the best
| educations, for the most part, they are trusted to do
| with data what they want, and that sometimes includes
| inviting outside collaborators to look at it, without
| anywhere near the same faff as the CDE.
|
| If you're a journalist and you want to help a private
| school make a better education, out of the thousands of
| private schools, one of them will both let you write
| about it and also tell them something they don't wanna
| hear. Some might use privacy or whatever as the reason
| they don't want to collaborate with you, but on average
| it will be about trust.
|
| The CDE is never going to do that. There's only 1 CDE,
| and they are there to preserve the status quo.
| chaps wrote:
| For sure :)
|
| Very similar things happen when investigating criminal
| cases. There's possibly hundreds or thousands of
| instances of some type of misconduct or improper
| arrest... but none of the defense attorneys with those
| sorts of cases will talk about it with the press because
| of the very real potential harms of talking with the
| press. Or the ones that do talk are too high level.. or
| they might have some ulterior motive like self-promotion.
| It's really hard to express how many issues are a direct
| result of lawyers understandably, but systematically not
| raising any public awareness about truly awful things.
|
| Have you tried getting your data through CPRA requests?
| I'm out in Illinois and our law is pretty decent and I'm not
| super familiar with CA's public records nuance, but it's
| really worth a try. What I know though is that California
| CPRA officers get away with a strange amount of abuse of
| the law. But even with that, you might be surprised what
| records are available. So if you do submit some requests,
| don't exactly expect it to be easy or immediate. Expect
| to be stonewalled, and need to sue at some point though.
| But IME public record suits are pretty hands-off (except
| when they're not..). And most of the lawyers I've worked
| with are upfront about what they will and won't litigate
| over.
|
| One thing you'll find is that.. basically nobody is
| looking into most of the awful things you'd expect would
| have eyes. It's very likely you'll be the only one doing
| those requests, or incrementally identifying how to get
| what you want through multiple requests over months. But
| each step breaks new ground and turns into feedback loops
| if you can build a community around it.
| jasonlotito wrote:
| Data aggregation does not guarantee privacy.
|
| Unless you can guarantee privacy, which "sufficiently
| anonymized" does not, then no.
| giovannibonetti wrote:
| The issue is the "sufficiently anonymized" part. Given a
| large enough number of dimensions, you may be able to
| identify students well enough.
|
| For example, if you take all students that took course A at
| time X, course B at time Y, course C at time Z and so on,
| eventually you might be able to narrow it down to a very
| small group, perhaps to even a single student.
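|
| A small sketch of that narrowing effect, using made-up enrollment
| data where each label stands for a course at a given time slot. The
| function name and the data are hypothetical:

```python
from collections import Counter
from itertools import combinations

def unique_fraction(enrollments, n_courses):
    """Share of students pinned down by at least one n-course combination."""
    combo_counts = Counter()
    for courses in enrollments.values():
        combo_counts.update(combinations(sorted(courses), n_courses))
    identified = sum(
        any(combo_counts[c] == 1 for c in combinations(sorted(courses), n_courses))
        for courses in enrollments.values()
    )
    return identified / len(enrollments)

# Invented data: student ID -> set of "course-period" labels.
enrollments = {
    "s1": {"algebra-p1", "bio-p2", "art-p3"},
    "s2": {"algebra-p1", "bio-p2", "band-p3"},
    "s3": {"algebra-p1", "chem-p2", "art-p3"},
    "s4": {"geometry-p1", "chem-p2", "band-p3"},
}

one_course = unique_fraction(enrollments, 1)   # only s4 is unique
two_courses = unique_fraction(enrollments, 2)  # every student is unique
```

| Here one known course identifies a quarter of the students, and any
| two known courses identify all of them, even though no names are in
| the data.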
| godelski wrote:
| This will also probably follow a power series too. So it
| isn't unlikely that you could deanonymize someone given
| just 2 courses. Not much information is needed to encode
| a lot of things.
| chaps wrote:
| Correct. I've done a fair amount of de-obfuscation work
| and it's _frighteningly trivial_ sometimes.
| arcticfox wrote:
| How can the researchers simultaneously publish research
| and not be allowed to testify to their conclusions in
| litigation though? It seems clear that this is not a
| privacy concern and is rather a protective measure.
| chaps wrote:
| On that we both agree.
| mrangle wrote:
| You're attempting to make an argument against any
| anonymized data being used in research. You'd have to do
| better than a hypothetical to make headway with it.
|
| Moreover, the logic would have to carry over to the very
| common practice of anonymizing data in professional
| communications (like training). Which would have HIPAA
| implications for some students.
|
| The common anonymizing practices have been utilized for
| decades without privacy breaches of note. That track
| record is also what your argument would have to defeat.
| chaps wrote:
| What's being discussed is the release of information
| directly to the public, not strictly to researchers.
| mrangle wrote:
| What I mentioned applies to safeguards against de-
| anonymization in the event of public access. Ie: a
| published research paper or professional notes left
| behind on a bus.
| chaps wrote:
| You're a little unclear there, friend. Can you please
| articulate your point a bit more clearly?
| mlyle wrote:
| Anonymizing a huge data set like this is impossible.
|
| Also, the burden of proof is on those that say that the
| data has no privacy implications, not on those who are like
| "ehhh, it's probably safe to release this."
| prepend wrote:
| > Anonymizing a huge data set like this is impossible.
|
| That depends. For the entire dataset, yes, of course, because it's
| everyone's student records. But you can probably subset
| it to the extent that it's still useful and perturb
| enough to protect individuals and be statistically
| equivalent.
|
| And you could also generate a bunch of aggregate results
| that do stuff like identify average grade differences
| before and after periods while correcting for other
| differences without including individual identifiers.
| chaps wrote:
| You're moving goalposts, friend. What you're suggesting
| and what OP is suggesting are in two completely
| different categories of disclosure extent. I don't think
| anybody here is suggesting that _no_ data should be
| available to the public.
| prepend wrote:
| I don't think so. OP was saying the data should be
| released. CA is saying it can't at all for privacy
| reasons.
|
| I'm saying that the data OP wants can be released with
| perturbations made to protect privacy and the data are
| still useful.
|
| I don't think anyone is calling for the raw data to be
| dumped. But for as much data as possible to be released.
| mlyle wrote:
| I agree with the person's criticism of your comment.
|
| Yes, obviously there is a level of aggregation where
| privacy concerns no longer hold.
|
| But there is no trivial transformation that allows
| education researchers the data they need but preserves
| anonymity. Education researchers want to aggregate and
| statistically sample the data in new ways; pre-
| aggregating it removes most ability to do so. If you want
| to do a principal component analysis of a few variables--
| good luck with aggregate data.
|
| If you provide nearly any data at the student-level,
| there's a pretty high chance that it can be deanonymized.
|
| At the same time, the state's position of attempting to
| prevent education researchers from participating in
| litigation (when using only public, non-restricted data)
| is egregious.
| dragonwriter wrote:
| > So long as the data cannot identify specific students
|
| It almost certainly can, even if it does not explicitly do
| so.
| amalcon wrote:
| The word "anonymized" needs to be excised from our
| collective vocabulary. "Anonymization" is not a thing that
| can be meaningfully done to a dataset about individuals.
| Coarse aggregation is possible, and the only practical way
| to achieve this end, but this has its own drawbacks in a
| research context.
| roody15 wrote:
| I am a bit confused by the case that wants to use the
| researchers' data.
|
| So there was measurable learning loss from remote learning and
| during the pandemic.
|
| Ok this is known in education.
|
| The state has only relied on individual districts to make up the
| learning loss.
|
| Ok so that makes sense. There is no magic bullet on fixing the
| learning loss issue. The state relying on individual districts
| taking a multi approach to learning loss .. seems reasonable.
|
| I don't understand the merits of the lawsuit. The state of
| California is already aware of learning loss and is looking at
| ways to address it.
|
| To be sued because the state of California didn't do x,y,z by the
| paintings seems incredibly short-sighted and unrealistic. We are
| still learning how to best address learning loss from 2020.
|
| Just my two cents
| s1artibartfast wrote:
| This article doesn't talk about the case itself, so we would
| have to find a different source to discuss that.
|
| I don't find it implausible that the state could have been
| negligent or knowingly inequitable in its learning deficit
| response.
|
| A simple example would be if it suppressed internal reports
| about the impacts and needs, or ignored them when structuring
| its response.
| jdkee wrote:
| "Sunlight is said to be the best of disinfectants; . . . "
|
| -Louis Brandeis
| [deleted]
| themitigating wrote:
| That's an oversimplification of the real world as a metaphor
| and if taken literally, also not true
| s1artibartfast wrote:
| If we're being literal and pedantic, sunshine probably does
| more disinfection than every man-made method put together.
| adamnaga wrote:
| [dead]
| AlbertCory wrote:
| There are a lot of comments on my answer about expert witnesses,
| so I'll collect it all here:
|
| Martin Rinard is a star. They pay him $850/hour because he
| testifies well, and he's done it before. He's got the credentials
| from MIT so juries tend to listen to him. I remember this
| exchange:
|
| Apple lawyer: So that was a lot of money! Martin: It was a lot of
| work.
|
| People seem to be doing some object inheritance from an ancestor
| post's "one month" but I didn't say that. His work would have
| gone over many months.
|
| They interview him, and he writes something. Then the lawyers
| rewrite it. Then they all go over it, line by line. It's
| excruciatingly boring. I sat in on two days of the review of a
| different expert witness' 300-page declaration, and they had
| another day planned after me! They probably have a mock trial,
| where he practices his testimony (I'm not sure how prevalent that
| is).
|
| I didn't work on Apple v. Samsung; I was just a spectator.
|
| I don't know what an expert witness would get in this Stanford
| thing, but it doesn't seem to me like the spending would be quite
| so wild.
| mcpackieh wrote:
| The truth does not fear investigation.
| asdajksah2123 wrote:
| There's a long history of nearly every major freedom supporter
| or civil rights supporter being investigated and wrongly
| imprisoned and even killed across the world. And that's just
| the famous ones we've heard of. There are an order of magnitude
| more who were done away with well before they became historically
| famous and no one even knows about them.
|
| This bumper sticker quote doesn't really track in the real
| world.
| mcpackieh wrote:
| So are you meaning to say that California is afraid these
| Stanford researchers are going to imprison people, wrongly or
| otherwise? Come on. This isn't the police, these are academic
| researchers.
| LexiMax wrote:
| I think the point is that reality is a complex and messy
| place, and shallow platitudes don't really add much to the
| conversation.
| pseg134 wrote:
| You are right. The world is way too messy for morality
| and honesty.
| IX-103 wrote:
| The world is too messy for _naive_ morality and honesty.
| People are too easily swayed by anecdotes or irrelevant
| facts.
|
| In your "moral" world of brutal honesty: the children of
| serial killers would never find work, people who were
| caught cheating on a test in kindergarten would never be
| allowed in positions of power, and people with non-
| mainstream interests would be sidelined in favor of those
| that people more closely identify with.
|
| Is it right to hide things from people who would use that
| information incorrectly and to society's detriment? I
| think so, and that's why I believe people should have a
| right to privacy.
| mcpackieh wrote:
| The "shallow platitudes" cut through the BS. The
| government is trying to gag researchers because they want
| to hide their own failure. The narrative that the
| government is afraid of PII being revealed during a trial
| is straight horse shit. The courts themselves will decide
| what is or isn't appropriate information for a witness to
| share on the stand.
|
| Furthermore, am I to believe these researchers _are_
| trusted not to share student PII when doing their normal
| academic research, but as soon as they become witnesses
| against the state that trust is no longer warranted?
| Bullshit. If protecting PII were the motivation they
| would not allow researchers to access that PII and
| publish their findings. What they're actually doing is
| preventing those researchers from testifying against the
| state. They're not protecting students, they're
| protecting the state's interests.
| nemo44x wrote:
| Yes, but narratives do.
| [deleted]
| hex4def6 wrote:
| "The truth does not fear investigation."
|
| I don't think you'd feel the same if you were the defendant in
| a lawsuit, even if you had a rock solid case.
|
| You might be completely vindicated, but bankrupted. Or, perhaps
| your lawyer is a dud, and fumbled the ball. Or perhaps the jury
| were idiots. Or perhaps the law has some unknown (to you)
| technicality that you end up hanging for. Or perhaps during the
| investigation you honestly misremember something or misspeak
| and the police / investigators become convinced you're guilty
| and spend all their time and resources trying to pin it on you.
| Or maybe they're just lazy, and you end up being an easy
| target. Don't worry, if you plead guilty you'll avoid a lengthy
| court battle that you can ill afford, and potential prison time
| if found guilty (are you that confident in your lawyer, your
| finances, the jury, and the legal system?). If you plead no-
| contest, you avoid jail, weeks or months of time off work
| defending yourself, and just do probation. But wait, I thought
| you had the Truth on your side?
| stainablesteel wrote:
| these schools are horrendously insidious, nothing should be kept
| secret, and all this [tax-payer funded] data should be public to
| begin with
| 1024core wrote:
| Years ago, there was a site which had photos of people, and asked
| you to guess: murderer or software engineer? (I'm going from
| memory here, so let's not get sidetracked by the details).
|
| In a similar vein, we need a site that lists actions taken by a
| state government and asks: was this in Ron DeSantis' Florida or
| California?
| ilikehurdles wrote:
| Or similarly, one listing low-rank in performance and quality
| of social programs and asking: was this Oregon or Mississippi?
| jjtheblunt wrote:
| or chicago, in general (where i'm originally from)
| zeroCalories wrote:
| Which one is the swe, and which one is the murderer?
| rufus_foreman wrote:
| Programming language inventor or serial killer?
| https://vole.wtf/coder-serial-killer-quiz/
| downWidOutaFite wrote:
| DeSantis was doing the same thing in Florida, preventing
| professors from testifying in a voting rights case against the
| state. https://apnews.com/article/lawsuits-florida-ron-desantis-
| vot...
| appplication wrote:
| If I could wear a tin foil hat for a minute: it could be
| plausible that CA could fight this to allow it to escalate to
| the Supreme Court and establish a judicial standard for these
| types of cases.
|
| I don't really understand why, of all the CA government
| institutions, the CDE finds this to be an appropriate stance
| though. An educational office should absolutely be held to a
| much higher standard than this, and should at its core value
| openness of information and freedom of speech. The fact that
| this lawsuit exists at all is an indication of deeply
| problematic internal values within CDE that are completely
| misaligned with its mission and governmental function.
| jeremyjh wrote:
| CDE is trying to protect itself from the consequences of poor
| policy decisions made at the highest levels of the department
| and state government.
| justrealist wrote:
| That's also questionable, but it's not the same thing.
| PartiallyTyped wrote:
| Is that not a first amendment violation?
| mcpackieh wrote:
| Yes, both are and one does not justify the other.
| PartiallyTyped wrote:
| I am glad you stated that because it's good to have it in
| writing.. we can't take anything for granted these days.
| jimbob45 wrote:
| Florida has the Sunshine laws which would likely preclude this
| from being a problem in the first place.
|
| Florida bad though upvotes to the left.
| ironmagma wrote:
| Sunshine laws have nothing to do with court testimony.
| jimbob45 wrote:
| _Observers say the dispute has the potential to limit who
| conducts education research in California and what they are
| able to study because CDE controls the sharing of data that
| is not available to the public._
|
| All data in Florida from public institutions are public.
| There would never have been controversy in the first place.
| But yeah, you're right - the Sunshine laws have nothing to
| do with testimony.
| ke88y wrote:
| _> We're getting a little off topic here_
|
| LOL what? I can request students' grades and disciplinary
| records via an open records request?
| LexiMax wrote:
| Previously on HN:
|
| https://news.ycombinator.com/item?id=29065265
| gnicholas wrote:
| Professor Dee was one of the authors of some excellent research
| regarding SFUSD's math detracking experiment:
| https://www.edweek.org/teaching-learning/san-francisco-insis...
| TheMagicHorsey wrote:
| One of the weird things about America is that we all know Asian
| kids are better at math than other kids on average. It's pretty
| obvious to anyone that's been in a class with Asians or taught
| Asians. I've done both.
|
| But nobody can actually say this. Instead we have to pretend
| like this isn't the case. Just look at math Olympiad teams. I
| coached one years ago. My entire team was Asian except for two
| alternates. One who was Russian, and the other Indian.
|
| Yes, environment can change outcomes ... but maybe it can't
| change outcomes to a point where everyone is going to perform
| the same. Are we going to try to get everyone's 100M sprint
| into the same range too? People are different.
|
| We should give every individual the same shot at opportunities
| but I don't think we are ever going to make Asian kids perform
| at the level of other kids in math or vice versa. It's not
| environment. Every one of us that has taught an engineering or
| math course knows this. Even if we don't talk about it.
| dragonwriter wrote:
| > One of the weird things about America is that we all know
| Asian kids are better at math than other kids on average. It's
| pretty obvious to anyone that's been in a class with Asians
| or taught Asians. I've done both.
|
| > But nobody can actually say this.
|
| "Asian kids in the US are, on average, better at math and,
| furthermore, this effect is stronger the fewer generations
| removed from immigration they are, and is in large part due
| to well-established general familial impacts on performance
| and the selective filter of immigration."
| tmpX7dMeXU wrote:
| This is absurd cable news pundit-level commentary. It doesn't
| sound like you've actually looked into this. More that you've
| taken some snippets of your life experience and explained it
| using your preconceived worldview. Nothing empirical about
| it. Nothing scientific. And the cherry on top is the
| implication that you're "saying what we are all thinking".
| Your experience teaching engineering or maths courses doesn't
| qualify your baseless intuition as to causality, especially
| when the stakes are so high as to typecast such large groups
| of people.
|
| This is a classic case of a misplaced assumption of
| transferable expertise.
| moosey wrote:
| To sum up another comment: it's cultural, not biological.
|
| Race is not a useful scientific guideline for any kind of
| scientific study. For example: there is as much biological
| diversity in sub Saharan Africa as the rest of the world, but
| racially, the best we can do is "Black", or "African". It's a
| useless, dated concept that we, as a species, find it difficult
| to work past because our brains are categorical engines.
|
| I'm as politically "leftist" as anyone you'll ever meet, but
| we have to be able to do better than "Asians are good at
| math" to make effective decisions about education, amongst
| other problems. This is of course impossible with the current
| world and thinking. Even though I know race isn't real, I
| still see it. It still has an impact on my day to day
| actions, because my stupid brain is all too happy to
| categorize people on how they appear.
|
| Taking another route: to say that Asians are good at math is
| categorical error. The word "Asians" represents something
| abstract, and abstract things cannot take action. Categorical
| error is basically the starting point for the various "isms"
| like misogyny, misandry, racism, etc.
| krapp wrote:
| This isn't a problem we have as a species. It's not
| biological, it _is_ cultural. The racial categories we use
| today were created in the 17th century to justify the white
| supremacist apparatus of slavery and colonialism - prior to
| that, people tended to categorize humanity by tribe,
| ethnicity or religion rather than superficial physical
| traits. Asian people, for instance, didn't see each other
| as the same "race" until white people came along and
| assigned them that categorization.
|
| You, I and everyone else are stuck in this way of thinking
| because we've been so thoroughly indoctrinated into a
| system of white supremacy which permeates the entirety of
| Western culture, it isn't even noticeable, like we're in
| the Matrix. It persists because it's useful for keeping the
| power centers that benefit from it entrenched, and everyone
| else divided.
|
| We can move on from it, but I think the first thing we need
| to do is recognize that it isn't inevitable.
| Toast_ wrote:
| >Race is not a useful scientific guideline for any kind of
| scientific study.
|
| Tell that to prostate cancer researchers.
|
| To say race "isn't important" is completely ignorant.
| bhickey wrote:
| Hi. I did my master's in computational biology focusing
| on androgen independent prostate cancer. After that I
| worked in an autoimmunology lab. My projects included
| rheumatoid arthritis GWAS and b-cell phylogeny. To
| demonstrate that we did case-control matching correctly,
| I looked at how well self-reported ancestry corresponds
| to hapmap populations. The mapping is very noisy. "Race"
| is a social classification, sure it's correlated with
| biological markers but there are better measures. So,
| yeah, "race" as such isn't important.
| avierax wrote:
| I don't believe it's cultural only, the same way I don't
| believe ethiopians or kenyans excelling at marathons and
| long distance runs to be a cultural thing. Genetics play a
| factor, why can't math skills be influenced by genetics as
| well?
| bhk wrote:
| "Ah, it looks like you have a nice academic career going here.
| It'd be a shame if something were to happen to it..."
| say_it_as_it_is wrote:
| This has everything to do with demographics and science that
| presents findings contrary to ideology/politics. The same kinds
| of people pressure police to omit demographics data in police
| reports.
| anon84873628 wrote:
| Interestingly, the article states that the data sharing
| agreements do not limit what the researchers can publish. They
| can share results/conclusions critical of the state, which
| could then serve as a basis for litigation.
|
| What's weird is that they are being prevented from voluntary
| testimony on cases unrelated to the specific shared data, thus
| unnecessarily removing many experts from the pool.
| timcavel wrote:
| California has the strictest Science Denial laws on Earth, so HN
| must censor everyone reading this for 180 days.
| yttribium wrote:
| There's a distinction between a fact witness and an "expert
| witness". A private agreement can't prevent a court from
| subpoenaing a fact witness to testify. "Expert witnesses" are
| overwhelmingly hired guns paid to come in and voluntarily spin a
| narrative, and I'm not sure why they shouldn't be able to make
| that a provision of a contract just like any other commercial
| arrangement.
| phpisthebest wrote:
| Mainly because it is a government agency, and government agencies
| do and should have lots of restrictions on them that are not
| like "any other commercial arrangement"
|
| One of the biggest things I disagree with Republicans on is that
| "government should be run like a business" no... it should not
| themitigating wrote:
| Seems pretty obvious because the main/sole purpose of
| business is to make money. Why would you want a government to
| act like that?
| phpisthebest wrote:
| To steel man it, it is because running a for profit
| business also requires efficient use of limited resources
| and driving out waste from the processes
|
| So "running government like a business" is a way to ensure
| tax money is being spent effectively and efficiently
|
| to be clear I don't think that is the best way to accomplish
| that goal, thus why I disagree with it, but it is not
| "profit" that drives that statement
|
| It came about because far far far far far too often
| government programs and spending are judged by their
| intentions, not their actual results.
| themitigating wrote:
| Only one aspect of a business is efficiency, not all
| businesses are efficient, and finally your view of
| government programs is based on right wing propaganda and
| not facts
| phpisthebest wrote:
| >>finally your view of government programs is based on
| right wing propaganda and not facts
|
| No it is not. That is reality today. Almost no government
| programs or spending is measured on their results.
|
| I would love for you to prove me wrong, and show me a
| government program where the resolution for any failure
| of that program was not "we need more money"
| mcpackieh wrote:
| > _I'm not sure why they shouldn't be able to make that a
| provision of a contract just like any other commercial
| arrangement._
|
| Because we're talking about part of the government.
| sproketboy wrote:
| Leftists always do that. SCUMS.
| kmeisthax wrote:
| A huge part of the civil litigation metagame is just finding ways
| to legally exempt yourself from being sued. It used to be the
| case that only sovereign states could declare themselves immune
| from litigation, but now that power has been delegated to anyone
| who can convince someone else to sign a binding contract. Which
| is literally everyone because almost every business relationship
| requires contracts. And now we're going to "you can't testify
| against us because you had an NDA" which seems even more
| abusable.
|
| By $NEAR_FUTURE_YEAR the only people who wind up in civil court
| will be victims of extortion.
| tomohawk wrote:
| California has over 900 school districts. That's over 900 highly
| paid executive staffs plus associated bureaucracies.
|
| That's a lot to protect.
|
| By way of comparison, Florida has 69 school districts, and does
| measurably better across the board in providing education.
| onionisafruit wrote:
| Has anybody found a link to the contract in question or a quote
| from the relevant part of it? I'm curious how it seemed ok for
| the researchers to sign a contract with this provision.
| anon84873628 wrote:
| Probably because they didn't have any other choice if they
| wanted to do the research. Redlining won't get you anywhere, so
| you need to wait for a situation like this to argue the
| unconstitutionality.
| themitigating wrote:
| So they did have a choice. I don't like the notion that if
| you want something then you don't have a choice on the
| actions you take to acquire it
| spamizbad wrote:
| > At issue is a restriction that CDE requires researchers to sign
| as a condition for their gaining access to nonpublic K-12 data.
| The clause, which CDE is interpreting broadly, prohibits the
| researcher from participating in any litigation against the
| department, even in cases unrelated to the research they were
| doing through CDE.
|
| That's an unreasonable restriction and I expect the ACLU to win
| this.
| ww520 wrote:
| Why are the K-12 data non-public? Aren't they from publicly
| funded institutes?
| spamizbad wrote:
| I'm guessing detailed information on individual students,
| anonymized - that's not something that any edu department
| will make public.
| kayodelycaon wrote:
| There's a law protecting individual student records:
| https://en.m.wikipedia.org/wiki/Family_Educational_Rights_an...
|
| It also goes beyond K-12. Quite a few parents try to get
| their children's college grades and run smack into this law.
| edgyquant wrote:
| Things involving minors tend to have stricter regulations.
| jimhefferon wrote:
| Might it bump against privacy? If you do a search on third
| graders receiving speech services in towns of pop less than
| 3000 in county... at some point you have private information.
| tmpX7dMeXU wrote:
| How would you feel if data pertaining to your interactions
| with government were made public just because the government
| is taxpayer-funded?
|
| How about data pertaining to your kids?
|
| Any Joe Taxpayer doesn't have a right to walk in and demand
| any data they want from a government department. That's
| entirely reasonable. Anonymising data isn't nearly
| as easy as a passer-by with "faker.fake_name()" may think it
| is.
| ke88y wrote:
| Totally unreasonable, but that doesn't mean it's not legally
| enforceable :(
|
| Why do you expect the ACLU to win?
|
| (not arguing. Genuinely curious.)
| karaterobot wrote:
| I'm not the person you're responding to, but I think there's
| a case for optimism based on this being something which seems
| unreasonable, and is not thoroughly established by precedent.
| Especially since the organization challenging it exists to
| make exactly this kind of argument, and has a decent track
| record of doing it in the past. /shrug
| ke88y wrote:
| I've been screwed by enough unreasonable contracts that I
| have little faith. But, yeah, I suppose "ACLU's lawyers
| think they have a case" is as good a reason for optimism as
| any.
| hodgesrm wrote:
| > That's an unreasonable restriction and I expect the ACLU to
| win this.
|
| Looking at the details, it seems that this cannot be a blanket
| restriction, since a judge could compel you to provide
| testimony. [0] At that point it would not matter what the
| contract said.
|
| [0] https://www.law.cornell.edu/cfr/text/43/30.224
| pacbard wrote:
| When you work with state level education data, you do so under
| a research agreement. That means you outline your research
| agenda and the state agrees to provide data to you to answer
| your research questions.
|
| You can't pitch a research project and then go rogue and do
| whatever with the data.
|
| It looks like the state is interpreting that use of student
| data as part of the lawsuit to be outside the scope of the
| prior approvals, therefore they are preventing Sean and Tom
| from using the data during their testimony.
|
| Nothing prevents the defense from subpoenaing the same data and
| having them use it for their testimony.
| arcticfox wrote:
| As I understand it, this is the government saying that data
| it provided cannot be used as the basis for supporting
| litigation against the government.
|
| I am not a crazy disciple of the 1A but that seems pretty
| clearly to be something the government should not be able to
| do. Couldn't the government just slip that language into any
| of their FOIA agreements etc?
|
| It would be a very different situation for a non-government
| actor to have the clause.
|
| Very scummy by government.
| remote_phone wrote:
| How can data collected by the government be private? That should
| all be available to the public since it was gathered with public
| funds. Has no one issued a freedom of information request?
| eesmith wrote:
| Just about everyone accepts that it's reasonable for some
| government collected information to be kept private. FOIA
| requests exclude "personnel and medical files and similar files
| the disclosure of which would constitute a clearly unwarranted
| invasion of personal privacy".
| https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A...
|
| In this case it was for "student-level data that detail the
| demographic information and the performance records over time
| of California's 5.8 million students but without any names or
| identifying information. That data is the gold standard for
| accurate research. A partnership contract details the
| department's commitments and researchers' responsibilities,
| including strong assurances they will have security protections
| in place to protect students' privacy and anonymity."
|
| The thing about this sort of data is, removing PII from the
| dataset doesn't make it fully or even sufficiently anonymous.
| If there's only one Pacific Islander student in the Shasta
| Union High School District then it's easy to figure out who
| that is by combining it with other public data.
|
| Quoting https://en.wikipedia.org/wiki/Differential_privacy :
|
| ] Statistical organizations have long collected information
| under a promise of confidentiality that the information
| provided will be used for statistical purposes, but that the
| publications will not produce information that can be traced
| back to a specific individual or establishment. To accomplish
| this goal, statistical organizations have long suppressed
| information in their publications. For example, in a table
| presenting the sales of each business in a town grouped by
| business category, a cell that has information from only one
| company might be suppressed, in order to maintain the
| confidentiality of that company's specific sales.
|
| The clear justification for keeping this information private is
| that the government won't get sufficiently useful data without
| this promise. The United States Census Bureau released
| "confidential" information about draft evaders and Japanese-
| Americans; if you think they might do that again, perhaps
| you'll lie about some of the questions.
|
| People who receive this sort of information are required to
| take special care to maintain the needed level of anonymity.
|
| There's of course no reason why this should be used to muzzle
| researchers for completely unrelated fields.
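|
| The cell suppression described in the quoted passage can be
| sketched roughly as below. This is a minimal illustration, not
| any agency's actual procedure: the threshold MIN_CELL_SIZE, the
| helper suppressed_counts, and the records are all made up for
| the example.

```python
from collections import Counter

# Hypothetical minimum cell size; real thresholds vary by agency.
MIN_CELL_SIZE = 10


def suppressed_counts(records, keys, min_cell_size=MIN_CELL_SIZE):
    """Count records per group and replace small cells with None."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Made-up records for illustration only.
records = (
    [{"district": "A", "group": "X"}] * 25
    + [{"district": "A", "group": "Y"}] * 1
    + [{"district": "B", "group": "X"}] * 12
)

# The single ("A", "Y") record is hidden; larger cells are kept.
table = suppressed_counts(records, ["district", "group"])
```

| This only covers primary suppression. If row or column totals
| are also published, a hidden cell can often be recovered by
| subtraction, so more cells usually have to be withheld too.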
| IronWolve wrote:
| IMHO, making the public pay for records, at high expense in a
| digital age, is how the government limits information. Police
| arrest/crime data, Court data, Zoning Data, Meeting
| transcripts, Budget Data, etc, and yes, Education data.
|
| Society shouldn't accept that this data should be behind paywalls or
| accept high costs to access it. Or paper only releases to stop
| release restrictions for costs and size.
| codyb wrote:
| Zoning data and meeting transcripts generally are public? At
| least in NY that's been my experience.
|
| A lot of the rest I'd rather was private. Although it'd be
| nice to get aggregated data for certain crimes which
| currently are tracked at each individual department level and
| not in any sort of national manner.
| civilitty wrote:
| _> How can data collected by the government be private? That
| should all be available to the public since it was gathered
| with public funds. Has no one issued a freedom of information
| request?_
|
| Agreed. What gives the government the right to reject my FOIA
| requests for the exact specification and design files for
| gaseous centrifuges, implosion devices, and nerve gas?
|
| Extreme natsec examples aside, there are a thousand reasons to
| keep government data private, not the least of which is
| constituent privacy. Deanonymizing data is far easier than
| preparing it for release and the data schools keep on students
| is particularly sensitive (I'm not claiming that that's the
| case with this data, just making a general observation).
| MisterBastahrd wrote:
| Please provide us with your contact information, date of birth,
| social security number, height, weight, hair color, and eye
| color.
| themitigating wrote:
| Tax information is also collected by the government, should
| that be public? What about publicly funded hospital records?
| kayodelycaon wrote:
| Student records are protected by federal law.
| https://en.m.wikipedia.org/wiki/Family_Educational_Rights_an...
|
| Personally, I think an individual's privacy should take
| precedence here.
| deathanatos wrote:
| > _Personally, I think an individual's privacy should take
| precedence here._
|
| There's no individual's privacy even at stake here. None of
| the data that's non-public is even material or relevant to
| the dispute here, beyond that the professors in question
| signed an agreement to access the data for unrelated matters.
| verteu wrote:
| Requiring more education/outcome data to be public would help
| prevent this. If education researchers are forced to get data
| from California's Department of Education, there's tacit pressure
| to find results that make DoE look good.
| GCA10 wrote:
| There's a lot more going on here than the initial story reports.
|
| For more than a few academics, making big $$$ as an expert
| witness is a magnificent source of side income. (Fees of
| $1,000/hour, including lots of open-ended prep time, can be
| found.) That begs the question: Did the research lead to the
| desire to be an expert witness? Or did the desire to be an expert
| witness define the nature of the research project?
|
| We'd need to know a lot more about the origins of this project
| before being able to referee this one. But if the state of
| California is worried about litigants using "researchers" to find
| and filter data that ordinarily would be available only through
| legal discovery processes, that's not a crazy worry.
| [deleted]
| ChurchillsLlama wrote:
| The main point of the article is that the CDE is preventing
| those who partner with them from testifying about anything,
| even what's unrelated to the data CDE provides - 'Viewpoint
| discrimination'.
|
| > That begs the question: Did the research lead to the desire
| to be an expert witness? Or did the desire to be an expert
| witness define the nature of the research project?
|
| I don't think these questions are productive. You can't truly
| know why someone does what they do. And making the suggestion
| that the researchers tainted their research because of the
| money is purely speculative and unfair.
| anon84873628 wrote:
| Is there a reason not to take TFA at its word, which says that
| the litigation in progress (for which expert testimony was
| requested) does not relate to the research those experts were
| conducting through agreements signed with CDE?
|
| The whole problem here is that as soon as a researcher signs
| the contract, they are barred from participating in _any_
| litigation against the department even if it doesn't involve
| the private data they were working with. So you have a large
| population of experts removed from the pool, because all the
| experts are likely to be involved in some type of research.
| AlbertCory wrote:
| I went to the Apple v. Samsung trial in 2016 or so, and the
| highest paid expert witness that day was $850. The other two
| were $450 and $350. Where are you getting this number?
|
| The prep time is included in your hours. The $850 guy said he'd
| put in 900 hours.
|
| (btw, it IS excruciatingly boring work. But of course, the
| money.)
| vasco wrote:
| One month of work for $765k. I was expecting one or two
| orders of magnitude lower payouts for a single expert witness
| in a single case. Who can afford to pay this?
| [deleted]
| neurocline wrote:
| I'd love to hire people that can work 900 hours in a single
| month. Just tell me where to find them. Or, wait, maybe
| they work in higher dimensions. Drat.
| roamerz wrote:
| AI
| AlbertCory wrote:
| Where are you getting this "single month" stuff?
| olddustytrail wrote:
| The person they're replying to who said "one month".
|
| One is singular.
| AlbertCory wrote:
| I didn't say that, though.
| Retric wrote:
| What you can work and what you can bill are two different
| things. I know of a few people that charge their rate
| from the minute they leave on a trip which is basically
| the min they put down their phone after accepting the
| contract to the minute they get back. However, they are
| all doing emergency, the company is losing tens of
| thousands per hour on the low end until this is fixed,
| kind of things.
|
| In practice it's equivalent to charging a higher hourly
| rate, but it makes billing simpler for these kinds of
| contracts.
| Jtsummers wrote:
| > One month of work for $765k.
|
| 900 hours is a bit more than one month. Even if he _only_
| worked 24x7 that's over 5 weeks. Assuming 10 hour days and
| 5 days a week that's 18 weeks, just shy of 13 weeks if 10
| hours a day and 7 days a week.
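|
| The arithmetic above can be checked with a short sketch. The
| 900 hours and $850/hour come from the thread; the helper name
| weeks_needed and the three schedules are just labels chosen
| for this example.

```python
HOURS = 900  # hours the expert said he put in, per the thread
RATE = 850   # dollars per hour, per the thread


def weeks_needed(total_hours, hours_per_day, days_per_week):
    """Weeks required to log total_hours on a given schedule."""
    return total_hours / (hours_per_day * days_per_week)


total_fee = HOURS * RATE                       # 765,000 dollars
around_the_clock = weeks_needed(HOURS, 24, 7)  # a bit over 5 weeks
ten_by_five = weeks_needed(HOURS, 10, 5)       # 18 weeks
ten_by_seven = weeks_needed(HOURS, 10, 7)      # just under 13 weeks
```

| Even the impossible nonstop schedule runs past a calendar
| month, so the fee cannot be for one month of work.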
| tornato7 wrote:
| Apple and/or Samsung
| AlbertCory wrote:
| Who said "one month"?
| drc500free wrote:
| Specialized attorneys with subject matter expertise are
| billing them $1500/hr, so $850 is kind of a bargain.
| AlbertCory wrote:
| He was an MIT professor of CS.
| HWR_14 wrote:
| There aren't even 900 hours in a month. That's 765k for 900
| billed hours, and you have to imagine that a good chunk of
| unbilled hours also occurred. So maybe that's for the
| equivalent of 8 months of boring work. Not continuous 8
| months either, you have to schedule other things between
| prepping. A lot of money. But not for a billion dollar
| lawsuit.
|
| As for "who can afford this": it's a company worth tens of
| billions suing over a major product line vs a trillion
| dollar company.
| dragonwriter wrote:
| > and you have to imagine that a good chunk of unbilled
| hours also occurred
|
| No, I don't have to imagine that expert witnesses do an
| hourly-billed contract gig but are bad at billing hours.
| HWR_14 wrote:
| Once they get the contract. I guarantee that the
| recruitment process and negotiation process was more
| involved than a phone call. And there could be work
| specifically excluded as "billable hours" that is still
| work. For instance, is the time to fly out compensated?
| gamblor956 wrote:
| All work that an expert does for a case is billable,
| including travel time. However, experts will frequently
| provide discounted or even unbilled work for individuals
| in certain circumstances (like criminal cases where the
| expert is testing in a forensic capacity to counter
| improper forensic analysis presented by a prosecution
| expert).
| HWR_14 wrote:
| Some experts charge for their time at a reduced rate
| (e.g. 50%) for travel, some a predetermined amount
| (taking the risk of delays on themselves), some only for
| the cost of the tickets, hotels, meals, etc.
|
| There is, AFAIK and based on what I can Google, no
| universal answer.
| Dylan16807 wrote:
| > I guarantee that the recruitment process and
| negotiation process was more involved than a phone call.
|
| I suppose, but do you think that was more than 9 hours?
| Because 9 hours is only 1% of their time.
| HWR_14 wrote:
| It could easily be more than 9 hours. How many people
| spend longer than that interviewing for a technical
| position across five rounds? And this is for one of the
| few experts Samsung will put up to defend a $xxx million
| suit.
|
| Plus, he works for MIT. He probably needs to clear his
| consulting work, which could be quick or not. MIT might
| have wanted a percentage. And if he wanted to use a grad
| student to assist him in prep work, negotiating that can
| add up too.
|
| There are other ways to add to the precontract numbers,
| but that should be enough.
| Dylan16807 wrote:
| Nine hours per round times five rounds plus another nine
| hours to get clearance is still only 6%
|
| And that's a huge number of hours for each round.
|
| It's really hard to reach "a good chunk" when compared to
| 900 hours.
|
| If MIT wants a cut that's a different issue, and I doubt
| negotiating that will take particularly long.
| HWR_14 wrote:
| The witness you are talking about was Martin Rinard.
| coding123 wrote:
| not sure if $850 is an hourly, but $850 after 2016->2023
| inflation is $1095 per hour
| foolfoolz wrote:
| a friend of mine is a full time expert witness. he went to
| school for an engineering degree and did 1 year of industry
| work. he now provides expert testimony on technical cases all
| over the country. they fly him out to nice hotels with a
| generous per diem. he gets paid very well. they give him the
| materials to present in court. it's a very well paying
| position
| johndhi wrote:
| I worked on that trial! Maybe I saw you there :)
| divbzero wrote:
| What was your role in the trial? Was there anything from
| your experience that stood out as particularly surprising
| or interesting?
| AlbertCory wrote:
| spectator. I added a top-level comment about this.
| peyton wrote:
| Inflation easily puts $850 over the $1k mark today.
| AlbertCory wrote:
| So you're ignoring two of the numbers and just taking the
| third?
|
| That guy was a star.
| Fezzik wrote:
| I litigated mesothelioma cases and our experts were paid
| $600-$1,100/hr, depending on the expert. $1,000 is high but
| not unheard of. What's really wild is, in addition to prep
| time, they get paid that from the second they cross the
| threshold of their front door through when they return; many
| of our experts were flown in from the middle of the country
| to Oregon so they sure pocketed hefty sums.
| prepend wrote:
| I think there are cost maximizing lawsuits (like
| mesothelioma) and then lawsuits that aren't seeking to
| recover damages. And they pay their expert witnesses very
| differently.
|
| I also think there are many academics unwilling to serve as
| expert witnesses for tort lawsuits and they are different
| from "professional" expert witnesses.
| AlbertCory wrote:
| Good point. I doubt there's much money involved in this
| Stanford thing.
| angrais wrote:
| So the $850 guy got $850x900? So $765k? How many months were
| the 900 hours split over? This sounds absolutely ridiculous
| onionisafruit wrote:
| It's great pay if you can get it, and I'm sure it wasn't
| enough to be noticed in the legal costs of that case.
| nonethewiser wrote:
| Just for frame of reference, there are 2080 work hours in a
| year assuming 40hrs/week. So imagine making 765k for like 5
| months of work.
| viscanti wrote:
| They need to be able to have 5 months where they can
| clear the calendars and just work on that. It's still a
| lot for 5 months, but I imagine there's a lot of downtime
| too. Are they getting 5+ months every year?
| nonethewiser wrote:
| I suppose it's actually spread out over a long period of
| time
| mrguyorama wrote:
| >They need to be able to have 5 months where they can
| clear the calendars and just work on that
|
| Otherwise known as "having a job"
| viscanti wrote:
| Yeah. Having a job seems like it could keep you from
| regularly being able to stop everything for 5 months of
| high paid work. Maybe the money is enough from the few
| months that they're fine with it (and maybe it's easy for
| them to get a new job after or go back somewhere they've
| worked before). I'm genuinely curious. It seems like a
| lot to make for 5 months, but what do their earnings look
| like over a 5 or 10 year period?
| psunavy03 wrote:
| Go look at what partners at the biggest white-shoe law
| firms make. Over $1,000/hr.
| grogenaut wrote:
| Is that what they make or what they BILL. IT, Admin
| staff, paralegal, Jr lawyers, building, pro Bono and
| other marketing activities etc. It's paid for somehow.
| teachrdan wrote:
| I believe paralegals and junior lawyers bill for their
| time, too, also at eye-watering rates.
| AlbertCory wrote:
| The witness actually got that $850 an hour. The other
| stuff you mentioned was absorbed by the law firm and
| billed to the client(s).
| johndhi wrote:
| believe it or not, they make more! they make money off of
| what the associates and junior partners bill, too. but
| yes, that figure is what they bill.
| [deleted]
| rrix2 wrote:
| are they being called in as expert witnesses?
| psunavy03 wrote:
| The discussion was whether billing over $800/hr was
| "ridiculous." It's actually common for credentialed
| professionals who are at the very top of very specific
| fields.
| AlbertCory wrote:
| Not for Samsung. Or Google.
|
| Apple's damages expert got paid $2 million.
| l33t233372 wrote:
| 850 in 2016 dollars is almost 1100 in 2023 dollars.
| Zigurd wrote:
| It's not a "crazy worry" but defendants in civil suits have all
| kinds of worries. Regarding impugning Stanford researchers
| (N.b. no scare quotes) as being motivated by a consulting fee,
| that's what those fees are for: to get the best possible expert
| witnesses.
|
| I don't begrudge a good defense attempting to block a
| litigant's experts, either. However, everyone is better off for
| expert witnesses being motivated by fees to provide the best
| expert testimony. If there was something untoward about their
| motivation, it would be Stanford's problem.
| pocketsand wrote:
| Just for context. I'm a PhD trained in education research who has
| met Sean Reardon a handful of times, had a meal with him, gone
| through methods training with him. He sits at the top of the
| field and has the unconditional respect of nearly everyone for
| his methodological rigor.
|
| This is not a guy who shoots from the hip.
___________________________________________________________________
(page generated 2023-07-28 23:02 UTC)