Privacy, Genetics, and Data Governance
In April 2018, police in California arrested a 72-year-old man named Joseph James DeAngelo. He was ultimately charged with 13 murders and linked to more than 50 rapes committed in the 1970s and 1980s, crimes that had gone unsolved for decades despite being among the most extensively investigated cold cases in American history. Until then, the public had known him only by nicknames: the Golden State Killer, the East Area Rapist, the Visalia Ransacker.
The break in the case came from genealogy.
Investigators had crime scene DNA from the unknown attacker. They uploaded it to GEDmatch, a public genealogy database where people had voluntarily shared their genetic data to find biological relatives. They found partial matches — distant cousins of the killer. From those matches, they built a family tree. Eventually, they narrowed the suspect pool to DeAngelo.
He had never submitted his own DNA. His relatives, decades earlier and for entirely unrelated reasons, had. That was enough.
This case opened a new era of forensic genetic genealogy. It also opened a set of fundamental questions that US law still hasn't fully answered: who owns genetic information, what rights do you have over genetic data you didn't share, and what should the legal framework look like for a technology that didn't exist when most privacy law was written?
---
GINA and Its Gaps
The major federal genetic privacy law in the United States is the Genetic Information Nondiscrimination Act of 2008 — usually called GINA.
GINA prohibits discrimination based on genetic information in two specific contexts:
Title I prohibits health insurers from using genetic information to deny coverage, set premiums, or determine eligibility.
Title II prohibits employers from using genetic information in hiring, firing, promotion, or other employment decisions.
These were significant protections, and GINA represented a genuine legislative achievement when it passed. But the law has substantial gaps:
It only covers health insurance and employment. GINA does not cover:
- Life insurance
- Long-term care insurance
- Disability insurance (including long-term disability)
This means a person can legally be denied life insurance based on genetic test results showing increased risk of heritable disease. They can be denied long-term care insurance based on genetic markers for Alzheimer's. The most consequential financial protections, those against being penalized for genes you can't change, don't extend beyond health insurance and employment.
It doesn't cover non-employment, non-insurance discrimination. Schools, housing, and most other contexts aren't covered.
It applies primarily to information about disease risk — genes associated with disease. It doesn't clearly cover ancestry information, behavioral genetic markers, or genetic information used for non-health purposes.
It exempts small employers. Title II only applies to employers with 15 or more employees.
Enforcement is limited. GINA violations are enforced through the EEOC (employment) and standard health insurance regulators. Damages are capped, and the burden of proving genetic discrimination is high.
The result: GINA created an important baseline but left major gaps. Multiple states have extended protections to life insurance and other contexts, but the landscape is fragmented. Federal action to close these gaps has been proposed repeatedly but has stalled in Congress.
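The coverage rules summarized in this section can be written down as a small lookup. The sketch below is a simplification for illustration only: it encodes just the distinctions made above (Title I for health insurance, Title II for employers with 15 or more employees, nothing else covered), and the names `Context` and `gina_coverage` are hypothetical, not part of any official tool. It ignores state laws, which may add protections.

```python
from enum import Enum
from typing import Optional


class Context(Enum):
    """Contexts discussed in this section (illustrative, not exhaustive)."""
    HEALTH_INSURANCE = "health insurance"
    EMPLOYMENT = "employment"
    LIFE_INSURANCE = "life insurance"
    LONG_TERM_CARE_INSURANCE = "long-term care insurance"
    DISABILITY_INSURANCE = "disability insurance"
    HOUSING = "housing"
    EDUCATION = "education"


# Title II employee threshold as stated in this section.
MIN_EMPLOYEES_TITLE_II = 15


def gina_coverage(context: Context, employer_size: Optional[int] = None) -> str:
    """Return the GINA title that applies, or "not covered".

    Encodes only the federal rules as summarized in the text above.
    """
    if context is Context.HEALTH_INSURANCE:
        return "Title I"
    if context is Context.EMPLOYMENT:
        if employer_size is None:
            raise ValueError("employer_size is required for employment contexts")
        if employer_size >= MIN_EMPLOYEES_TITLE_II:
            return "Title II"
        return "not covered"
    # Life, long-term care, and disability insurance, housing, education, etc.
    return "not covered"


if __name__ == "__main__":
    for ctx in Context:
        size = 40 if ctx is Context.EMPLOYMENT else None
        print(f"{ctx.value}: {gina_coverage(ctx, size)}")
```

Running it over every context makes the pattern visible at a glance: only two of the listed contexts return a title, and one of those depends on employer size.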
---
HIPAA and Its Limits for Biotech Data
Beyond GINA, the major federal medical privacy law is the Health Insurance Portability and Accountability Act (HIPAA), specifically its Privacy Rule (2003) and Security Rule (2005).
HIPAA covers Protected Health Information (PHI) held by covered entities — healthcare providers, health plans, and healthcare clearinghouses — and their business associates. PHI includes genetic information when held by covered entities.
The HIPAA protections include:
- Limits on use and disclosure of PHI without patient authorization
- Patient rights to access and amend their PHI
- Required security safeguards for electronic PHI
- Breach notification requirements
- Enforcement through HHS Office for Civil Rights
These protections matter enormously for traditional healthcare. They do not, however, fully cover modern biotech contexts.
HIPAA's covered entity limitation is critical. Many entities handling genetic data are not covered entities:
- Direct-to-consumer (DTC) genetic testing companies (23andMe, AncestryDNA, MyHeritage) are not HIPAA covered entities. They are subject to their own terms of service and various state laws, but not HIPAA.
- Health and fitness apps that collect biometric or genetic-adjacent data may or may not be HIPAA-covered depending on their function.
- Research databases sometimes operate outside HIPAA, especially when based on de-identified data.
- Insurance products beyond health insurance (life, disability, long-term care) are not covered.
- Law enforcement access to medical data is subject to specific HIPAA exceptions that allow disclosure in many investigative contexts.
The HIPAA framework was designed for a healthcare data ecosystem that's much narrower than the actual data ecosystem in which genetic and biotech information now circulates. The result is significant gaps where genetic data flows outside HIPAA's protections.
---
Direct-to-Consumer Genetic Testing
The direct-to-consumer (DTC) genetic testing industry has grown explosively over the past two decades. Companies like 23andMe and AncestryDNA have sequenced or genotyped tens of millions of people worldwide. The genetic databases these companies have built are the largest human genetic resources in history — and they exist primarily outside the medical regulatory framework.
The regulatory landscape:
FDA oversight is limited. FDA has authority over DTC genetic tests that make health-related claims, but most ancestry-only tests are unregulated. The FDA's regulation of 23andMe's health reporting has been complicated — in 2013, FDA ordered 23andMe to stop offering health interpretations; the company has gradually regained authorization for specific reports.
FTC oversight focuses on consumer protection. The FTC can act against false advertising and unfair business practices, but doesn't comprehensively regulate the industry.
State laws vary. Some states have specific DTC genetic testing laws (California, Illinois). Most do not.
Privacy is governed largely by company terms of service. Users agree to terms when they purchase tests. These terms can change. Companies can be acquired, with new owners adopting new policies.
Several specific issues have emerged:
Research sharing. Many DTC companies share user data with research partners — pharmaceutical companies, academic researchers — sometimes for substantial payments. 23andMe has partnerships with major pharmaceutical companies. Users typically consent to research sharing in terms of service, but the consent is often poorly understood.
Law enforcement access. As the Golden State Killer case demonstrated, genetic data uploaded to genealogy databases can be accessed by law enforcement. Different companies have different policies — some require warrants, some have opted out of law enforcement matching, others have allowed it explicitly. The legal status of law enforcement access to commercial genetic databases is still evolving.
Data breaches. In 2023, 23andMe experienced a major data breach affecting approximately 6.9 million users, about half of its customer base. Affected users' ancestry information, family tree data, and other profile information were exposed. The breach showed how fragile the privacy of any centralized genetic database is.
Corporate ownership transitions. When DTC genetic testing companies are acquired or go bankrupt, what happens to their genetic databases? This is a real question. 23andMe filed for bankruptcy protection in early 2025, raising urgent questions about who would gain access to its genetic database and under what terms.
The DTC genetic testing industry has created the largest, most diverse human genetic databases ever assembled — and the regulatory framework governing those databases is incomplete, fragmented, and barely adapted to the technology.
---
Newborn Screening, Research Biobanks, and Specialized Data Issues
Several other genetic data contexts deserve specific attention:
Newborn screening retention. Every state in the US requires newborn screening — blood samples taken shortly after birth to test for various genetic and metabolic conditions. The bloodspot samples are then often retained, sometimes for decades, in state newborn screening laboratories. The retained samples can be used for research, quality control, or other purposes.
Multiple controversies have arisen:
- Texas retained millions of newborn screening samples without parental consent and used them for research. A lawsuit settled in 2009 required the state to destroy millions of stored samples.
- Michigan maintains a large biobank of newborn screening samples (the BioTrust) that uses samples for research with various consent frameworks.
- Parental consent practices vary dramatically by state — some require opt-in for retention, some allow opt-out, some don't disclose retention at all.
The combination of mandatory newborn screening with long-term retention creates one of the largest population genetic databases in the country — and one with the weakest informed consent foundations.
Research biobanks. Large research biobanks like the NIH All of Us Research Program (designed to enroll a million or more participants for precision medicine research) operate under specific consent frameworks but face ongoing questions about long-term data governance, secondary uses, and equity. The UK Biobank is a parallel project that has been even more extensively studied, and it has served as a model for ongoing US efforts.
Forensic genetic genealogy. The use of consumer genetic databases for criminal investigations has expanded dramatically since the Golden State Killer case. Standards have been developed (the Department of Justice released guidelines in 2019) but vary by jurisdiction. The legal questions — whether genealogical searches constitute a Fourth Amendment search, what consent your relatives can give for your genetic exposure, what investigative uses are permissible — remain partially unresolved.
Polygenic risk scores and embryo selection. Several companies now offer polygenic risk scores — multi-gene risk predictions — for IVF embryos, allowing parents to select embryos based on predicted health, height, intelligence, or other traits. The accuracy of these predictions is highly contested. The ethical and policy framework is essentially nonexistent in the US. Other countries (notably the UK) have specific regulatory bodies (the HFEA — Human Fertilisation and Embryology Authority) that govern these uses; the US has nothing comparable.
The picture across all of these contexts: significant genetic data accumulation, fragmented regulatory frameworks, and major unresolved policy questions.
---
Wait, Actually...
There's a deep asymmetry in genetic privacy that current law barely acknowledges: your genetic data is also your family's genetic data.
When you share your DNA with a consumer genetic testing company, you're sharing information about your siblings, parents, children, cousins, and more distant relatives. They didn't consent. They may not even know you did it.
This is why forensic genetic genealogy works at all. The Golden State Killer's relatives — not the killer himself — uploaded data that ultimately exposed him. The relatives consented to a genealogy service; they did not consent to participating in a criminal investigation that would identify their kin. Yet that's what happened.
This asymmetry has no clean legal solution. Strict consent rules would prevent legitimate uses (research, criminal investigation, family medical history). Permissive rules accept that your privacy decisions are made by your relatives. Most other countries' privacy frameworks don't really address this either; the EU's GDPR has somewhat stronger family-protective provisions but doesn't fully resolve the issue.
The deeper problem: genetic information is inherently shared. The legal frameworks of individual consent, individual ownership, and individual privacy, which work for credit card numbers and email addresses, don't fit a category of information that is biologically shared, in part, with people who never agreed to anything.
Whatever the next generation of genetic privacy law looks like, it has to grapple with this. The current frameworks largely don't.
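How much of your genome a relative's upload reveals can be put in rough numbers. The sketch below uses the standard pedigree expectation for full cousins: two people separated by 2(d + 1) generational steps through two shared ancestors are expected to share 2 × (1/2)^(2(d + 1)) of their autosomal DNA, where d is the cousin degree. The function name is hypothetical, and these are averages only. Actual sharing varies from pair to pair, and genealogy services work from measured matching segments, not from this formula.

```python
def expected_shared_fraction(cousin_degree: int) -> float:
    """Expected fraction of autosomal DNA shared by full cousins.

    cousin_degree: 1 for first cousins, 2 for second cousins, and so on.
    Each cousin sits (cousin_degree + 1) generations below the shared
    ancestral couple, so a path through one ancestor has
    2 * (cousin_degree + 1) steps, and there are two such ancestors.
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    steps = 2 * (cousin_degree + 1)
    return 2 * 0.5 ** steps


if __name__ == "__main__":
    for degree in range(1, 5):
        pct = 100 * expected_shared_fraction(degree)
        print(f"cousin degree {degree}: about {pct:.2f}% shared on average")
```

First cousins come out at 12.5% and third cousins at under 1%, yet matches at that distance can still be enough to start building a family tree, which is what makes a relative's decision to upload so consequential.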
---
What protections does GINA (the Genetic Information Nondiscrimination Act) provide?
Are direct-to-consumer genetic testing companies (like 23andMe) covered by HIPAA?
What major case demonstrated the use of consumer genetic databases for criminal investigations?
What is the fundamental privacy issue with genetic data that traditional privacy frameworks struggle to address?
Audit Your Own Genetic Privacy Exposure
Even if you've never taken a DTC genetic test, you may have genetic data exposure through relatives, healthcare, or other sources. Work through the following:
- Document your exposure. Have you (or close family members) taken a DTC genetic test? Submitted to a newborn screening? Participated in genetic research? Received genetic testing for medical reasons? Used a genealogy service that integrated DNA?
- For each exposure, identify:
  - What entity holds the data
  - What the relevant terms of service or consent forms say (read at least one fully)
  - What protections apply (HIPAA, GINA, state law, just contract)
  - What rights you have to access, correct, or delete the data
- Identify your specific exposure to:
  - Health insurance discrimination risk (GINA protects)
  - Life insurance discrimination risk (no federal protection)
  - Law enforcement matching (depends on company policy)
  - Research use by third parties
  - Corporate breach (any centralized database)
- Identify three specific actions you could take to reduce your exposure if you wanted to. (Not all of these are options for everyone, but documenting what they would be is the exercise.)
- Write one policy recommendation — based on what you found, what specific federal or state policy change would meaningfully improve genetic privacy protection?
This is essentially the same work a privacy auditor would do for an organization, applied to yourself. The discipline of taking your own privacy seriously is unusual, especially for genetic data, and the policy lens it gives you is one of the most useful contributions you can make to ongoing debates.
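If you prefer to keep the audit in a structured form, something like the following could work. It is one possible layout, not a standard: the `Exposure` record, the `STATUTORY` label set, and the `contract_only` helper are all hypothetical names, and the sample entries are invented placeholders. The protection labels mirror the ones used in the exercise (HIPAA, GINA, state law, contract).

```python
from dataclasses import dataclass, field
from typing import List

# Labels from the exercise that indicate a statute applies.
# "contract" (terms of service only) is deliberately not in this set.
STATUTORY = {"HIPAA", "GINA", "state law"}


@dataclass
class Exposure:
    """One row of the personal audit (hypothetical structure)."""
    source: str                      # how the data came to exist
    holder: str                      # entity that holds the data
    protections: List[str] = field(default_factory=list)
    can_delete: bool = False         # whether you found a deletion right


def contract_only(exposures: List[Exposure]) -> List[Exposure]:
    """Return exposures with no statutory protection recorded."""
    return [e for e in exposures if not STATUTORY.intersection(e.protections)]


if __name__ == "__main__":
    # Placeholder entries for illustration only.
    audit = [
        Exposure("clinical genetic test", "hospital lab", ["HIPAA", "GINA"]),
        Exposure("DTC ancestry test taken by a sibling", "DTC company", ["contract"]),
    ]
    for e in contract_only(audit):
        print(f"{e.source} (held by {e.holder}): contract terms only")
```

Listing the contract-only rows first is a reasonable way to prioritize, since those are the exposures where terms can change without any statutory floor.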
---