Genomics | Module 10 of 13 | 45 min | Modules 1–9

The ethics and governance of genomics — who owns your genome?

Start here

Every module so far has been about what genomics can do. This module is about what we should do — and who gets to decide.

Genomic data is unlike any other kind of medical data. It is permanent: your genome doesn't change over your lifetime. It is identifying: it can be used to identify you even from an anonymous sample. It is familial: sequencing one person reveals information about their relatives who never consented. And it is predictive: it encodes probabilities of future diseases, traits, and even behaviors — information that insurers, employers, and governments have strong financial incentives to access.

These properties create ethical and governance challenges that don't have clean solutions. This module doesn't pretend they do. Instead it gives you the frameworks, the legal landscape, the key controversies, and the unresolved debates — so you can engage with them seriously.

By the end of this module you should be able to answer:

What makes genomic data ethically distinct from other medical data?
What is informed consent in genomics, and what are its limits?
What does GINA protect — and what does it not?
What are the major ethical issues with direct-to-consumer genetic testing?
What happened with the Havasupai tribe, and what does it reveal about research ethics?
What is data sovereignty and why do indigenous communities assert it?
Who owns your genomic data, legally and ethically?

---

What makes genomic data different

Medical data is generally considered private. Your blood pressure readings, your MRI scans, your psychiatric history — these are protected under HIPAA (Health Insurance Portability and Accountability Act) in the United States, and under equivalent frameworks in other countries.

Genomic data has all of those privacy concerns plus four additional properties that make it categorically more sensitive:

Permanence. A cholesterol reading can change with diet and medication. Your genome cannot. A genomic data breach is not recoverable — you cannot change your DNA the way you can change a password.

Identifiability. Genomic data can identify individuals even when stripped of names. In 2013, Latanya Sweeney and colleagues re-identified participants in the Personal Genome Project by name by linking the demographic details in their supposedly anonymous profiles (ZIP code, birth date, sex) to publicly available records. A 2018 study in Science estimated that about 60% of Americans of European descent already had a third cousin or closer relative in a consumer genealogy database, a match close enough to begin narrowing down their identity, and that a database covering only about 2% of a population would provide such a match for nearly everyone in it.
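Why a database holding a small share of the population can reach so many people follows from simple probability. The sketch below is a deliberate simplification, not the model used in the 2018 study: it assumes each of a person's relatives appears in a database independently, with probability equal to the database's coverage of the population. The function name and the relative counts are hypothetical, chosen only for illustration.

```python
def prob_at_least_one_match(database_coverage: float, n_relatives: int) -> float:
    """Chance that at least one of n_relatives is in the database.

    Assumes (for illustration only) that each relative is included
    independently with probability database_coverage.
    """
    if not 0.0 <= database_coverage <= 1.0:
        raise ValueError("database_coverage must be between 0 and 1")
    if n_relatives < 0:
        raise ValueError("n_relatives must be non-negative")
    # Complement of "none of the relatives is in the database".
    return 1.0 - (1.0 - database_coverage) ** n_relatives


if __name__ == "__main__":
    # Hypothetical counts: a few close relatives versus many distant cousins.
    for n_relatives in (2, 20, 200):
        for coverage in (0.005, 0.02):
            p = prob_at_least_one_match(coverage, n_relatives)
            print(f"relatives={n_relatives:>3}  coverage={coverage:.1%}  P(match)={p:.2f}")
```

The point of the sketch is the shape of the curve: because distant cousins are numerous, even low coverage makes at least one match likely, whether or not the person being sought ever took a test.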

Familial reach. When you share your genomic data, you implicitly share information about every biological relative — parents, siblings, children, cousins — who never consented to that disclosure. A decision made by one family member has privacy consequences for others.
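Familial reach can be put in numbers using the standard expectation from Mendelian inheritance: each meiosis (one parent-to-child transmission) halves the expected share of DNA inherited through a given common ancestor. A minimal sketch follows; the function name and the relationship table are illustrative, and the values are averages rather than guarantees (only parent and child share exactly half).

```python
from fractions import Fraction


def expected_shared_fraction(meioses: int, common_ancestors: int = 2) -> Fraction:
    """Expected fraction of autosomal DNA two relatives share by descent.

    meioses: parent-to-child steps on the path linking the two people
             through a common ancestor.
    common_ancestors: 2 for full relationships (a shared ancestral couple),
                      1 for direct lines and half relationships.
    """
    if meioses < 1 or common_ancestors not in (1, 2):
        raise ValueError("need meioses >= 1 and common_ancestors of 1 or 2")
    return common_ancestors * Fraction(1, 2) ** meioses


# (meioses, common_ancestors) for some familiar relationships.
RELATIONSHIPS = {
    "parent/child": (1, 1),
    "full siblings": (2, 2),
    "grandparent/grandchild": (2, 1),
    "first cousins": (4, 2),
    "second cousins": (6, 2),
    "third cousins": (8, 2),
}

if __name__ == "__main__":
    for name, (meioses, ancestors) in RELATIONSHIPS.items():
        share = expected_shared_fraction(meioses, ancestors)
        print(f"{name:<24} {share}  (~{float(share):.2%})")
```

A third cousin shares on average under 1% of your DNA, yet that is typically enough for a genealogy service to flag the relationship, which is why one person's upload has consequences far out along the family tree.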

Predictive scope. Genomic data encodes future probabilities: risk of Alzheimer's disease, cancer susceptibility, drug responses, and traits ranging from height to, controversially, cognitive abilities. This predictive scope creates incentives for third parties to access genomic data for purposes — insurance underwriting, employment decisions, law enforcement — that go far beyond healthcare.

Together these properties mean that standard medical privacy frameworks are necessary but insufficient for genomics.

---

Informed consent — the foundational principle and its fractures

Informed consent is the principle that research participants and patients should understand what they are agreeing to before providing data or undergoing procedures. In genomics research, consent has always been complicated by the fact that researchers often don't know in advance what they will find or what future analyses will be done.

The traditional model: A participant consents to a specific study with defined purposes. Their sample is used for that study. If researchers want to use it for something else, they return and ask again.

This model broke down as genomic databases grew. Biobanks — repositories of biological samples linked to health records — accumulate samples over decades and are used for purposes that couldn't be anticipated at collection. Returning to millions of participants for re-consent is operationally impossible.

Broad consent became the practical alternative: participants consent to an expansive range of future research uses, typically including any health-related research approved by an ethics board. This is the model used by the UK Biobank (500,000 participants), the NIH All of Us Research Program, and most large biobanks.

Broad consent is a compromise. It gives researchers flexibility but gives participants much less clarity about what their data will actually be used for. Critics argue it is not meaningfully informed — that agreeing to "any future health research" is so vague as to be nearly meaningless.
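The difference between the two consent models can be made concrete with a small sketch. It assumes a hypothetical `Consent` record and `use_permitted` check (neither comes from any real biobank system) and encodes only the rules described above: specific consent permits the purposes named at collection, while broad consent permits any health-related use that an ethics board approves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consent:
    """Hypothetical consent record; field names are illustrative only."""
    model: str                         # "specific" or "broad"
    purposes: frozenset = frozenset()  # named purposes (specific consent only)


@dataclass(frozen=True)
class ProposedUse:
    """Hypothetical description of a new study requesting the sample."""
    purpose: str
    health_related: bool
    ethics_board_approved: bool


def use_permitted(consent: Consent, use: ProposedUse) -> bool:
    """Return True if the proposed use falls within the consent given."""
    if consent.model == "specific":
        # Traditional model: only the purposes named at collection.
        return use.purpose in consent.purposes
    if consent.model == "broad":
        # Broad model: the purpose itself is never consulted; the decision
        # rests on classification and ethics board approval.
        return use.health_related and use.ethics_board_approved
    raise ValueError(f"unknown consent model: {consent.model!r}")


if __name__ == "__main__":
    new_study = ProposedUse("alzheimers_risk", health_related=True,
                            ethics_board_approved=True)
    specific = Consent("specific", frozenset({"heart_disease"}))
    broad = Consent("broad")
    print(use_permitted(specific, new_study))  # False: researchers must re-consent
    print(use_permitted(broad, new_study))     # True: no further contact needed
```

Notice what the broad branch never looks at: the purpose. Whatever the participant had in mind when signing plays no part in the decision, which is the critics' objection in compact form.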

Secondary findings and incidental findings:

When you sequence someone's genome for one purpose (say, to study heart disease), you inevitably learn about variants relevant to other conditions — BRCA1/2 variants, hereditary cancer syndromes, variants predictive of Alzheimer's disease. These secondary findings (anticipated when designing the study) and incidental findings (unexpected discoveries) raise hard questions:

Are researchers obligated to return these findings to participants?
Does a participant have the right not to know?
What if the participant is a child — should findings be returned to parents when the child cannot consent?

The American College of Medical Genetics and Genomics (ACMG) maintains a list of genes for which secondary findings should be returned to patients. Version 3.2 of the list (2023) names 81 genes covering conditions where knowing the result enables preventive action. But the list and the obligation are contested. Some participants explicitly don't want to know if they carry a BRCA2 variant; others desperately want to. Consent forms increasingly ask participants to specify their preferences.

---

GINA — what it protects and what it doesn't

The Genetic Information Nondiscrimination Act (GINA) was signed into law in the United States in 2008 after 13 years of congressional debate. It prohibits two things:

Health insurers from using genetic information to deny coverage, adjust premiums, or impose pre-existing condition exclusions
Employers from using genetic information in hiring, firing, compensation, or any other employment decision

GINA was a meaningful step. Before it passed, people declined genetic testing because they feared it would be used against them — a rational fear, and one that was suppressing the clinical uptake of genomic medicine.

What GINA does not cover:

Life insurance: Life insurers in the United States can legally use genetic information to deny coverage or charge higher premiums. This is not a hypothetical — several documented cases exist of people being denied life insurance after testing positive for BRCA1/2 mutations or Huntington's disease mutations.
Disability insurance: Similarly unprotected.
Long-term care insurance: Similarly unprotected.
Military: The military can use genetic information in some contexts.
Small employers: Employers with fewer than 15 employees are not covered.
Manifested disease: GINA protects against discrimination based on genetic predisposition, not manifested disease. If your BRCA1 variant has already led to a breast cancer diagnosis, discrimination based on that diagnosis is governed by the ADA (Americans with Disabilities Act), not GINA.
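The coverage rules above can be summarized as a toy decision function. This is a sketch that encodes only the rules listed in this section; the `Scenario` type, its field names, and the context labels are hypothetical, and real determinations depend on statutory detail the sketch ignores.

```python
from dataclasses import dataclass
from typing import Optional

# Contexts this section lists as outside GINA. Labels are illustrative.
UNPROTECTED_CONTEXTS = {
    "life_insurance",
    "disability_insurance",
    "long_term_care_insurance",
    "military",
}


@dataclass(frozen=True)
class Scenario:
    """Hypothetical description of a possible use of genetic information."""
    context: str                         # e.g. "health_insurance", "employment"
    employer_size: Optional[int] = None  # number of employees, if employment
    disease_manifested: bool = False     # True once a diagnosis exists


def gina_applies(s: Scenario) -> bool:
    """Rough check of whether GINA's protections reach this scenario."""
    if s.disease_manifested:
        # Manifested disease falls outside GINA (the ADA may apply instead).
        return False
    if s.context in UNPROTECTED_CONTEXTS:
        return False
    if s.context == "health_insurance":
        return True
    if s.context == "employment":
        if s.employer_size is None:
            raise ValueError("employer_size is required for employment scenarios")
        # Employers with fewer than 15 employees are not covered.
        return s.employer_size >= 15
    raise ValueError(f"context not described in this section: {s.context!r}")


if __name__ == "__main__":
    print(gina_applies(Scenario("health_insurance")))              # True
    print(gina_applies(Scenario("life_insurance")))                # False
    print(gina_applies(Scenario("employment", employer_size=10)))  # False
```

Laid out this way, the number of branches that return False makes the size of the gaps easy to see.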

The gaps in GINA have real consequences. People at risk of carrying early-onset Alzheimer's variants, Huntington's disease mutations, or BRCA1/2 mutations face a choice: learn their genetic status and risk losing access to life, disability, and long-term care insurance, or stay untested to preserve their eligibility. The coercion is embedded in the legal structure.

Outside the United States, protections vary widely. Canada passed the Genetic Non-Discrimination Act in 2017, which is broader than GINA. In the UK, a voluntary code agreed between the government and the insurance industry bars insurers from using most predictive genetic test results (with an exception for Huntington's disease in life insurance above a certain coverage threshold). Many countries have no specific genetic nondiscrimination law at all.

---

Direct-to-consumer genetic testing — the 23andMe problem

Direct-to-consumer (DTC) genetic testing companies — primarily 23andMe and AncestryDNA — have collectively genotyped over 40 million people. They offer ancestry analysis, health risk reports, carrier status, and pharmacogenomic panels, all purchasable without a physician involved.

DTC testing has democratized access to genomic information in ways that are genuinely valuable: millions of people have learned they are carriers for recessive conditions, discovered relatives through DNA matching, and accessed pharmacogenomic information relevant to their medication decisions.

But DTC testing has also created a set of ethical and governance problems that remain unresolved.

Data ownership and third-party sharing:

When you spit in a 23andMe tube, what happens to your data? The terms of service are lengthy, updated periodically, and not read by most users. Historically, both 23andMe and AncestryDNA have sold aggregate data to pharmaceutical companies for drug discovery research. 23andMe's partnership with GlaxoSmithKline (2018, $300 million) allowed GSK access to the genetic data of customers who had opted into research.

Participants opt in — but the opt-in is buried in consent flows, and most users don't read what they're consenting to. The business model of DTC genomics is, in part, selling aggregate genomic data to the pharmaceutical industry. The product is partially you.

23andMe bankruptcy (2025):

In March 2025, 23andMe filed for Chapter 11 bankruptcy after years of financial losses and a catastrophic 2023 data breach in which hackers accessed the personal and genetic data of approximately 6.9 million users. The bankruptcy raised an acute question: what happens to the genomic data of 15 million customers when the company that holds it goes bankrupt and its assets are sold?

The California Attorney General issued guidance urging 23andMe customers to delete their data. The bankruptcy court proceedings involved contested questions about whether genomic data could be sold to new owners as an asset — and what privacy obligations would transfer with it.

This case is the clearest demonstration to date that uploading your genome to a commercial platform creates privacy risks that persist long after the platform that collected it ceases to exist.

Law enforcement access:

Investigative genetic genealogy — using DNA databases to identify criminal suspects through relative matching — has been used to solve high-profile cold cases, most famously the identification of the Golden State Killer in 2018 using GEDmatch (a genealogy database that accepted raw DNA uploads from multiple DTC companies).

The technique works even when the suspect has never submitted their own DNA: matching to third or fourth cousins in a public database, combined with genealogical research, can narrow an identity to one or a few candidates. Because of the familial nature of genomic data, your decision to upload your genome to GEDmatch implicitly exposes your relatives — who never consented — to law enforcement identification.

GEDmatch now allows users to opt in to law enforcement searches rather than opting out. FamilyTreeDNA cooperated with the FBI directly. 23andMe and AncestryDNA have stated policies against voluntary law enforcement cooperation without a valid legal process (subpoena or court order) — but have complied with such orders when received.

The policy and legal landscape here is almost entirely unresolved. No federal law specifically governs law enforcement use of consumer genomic databases.

---

The Havasupai case — the foundational research ethics violation

In the early 1990s, researchers from Arizona State University collected blood samples from members of the Havasupai Tribe — a Native American community of roughly 650 people living at the bottom of the Grand Canyon — with the stated purpose of studying the high rates of type 2 diabetes in their community.

The community consented to diabetes research. Over the following decade, the samples were used for research on schizophrenia, migration patterns, and evolutionary population genetics — purposes never consented to and in direct tension with tribal beliefs about their origins and identity. Havasupai cultural beliefs hold that their people originated in the Grand Canyon; research suggesting their ancestors migrated from Asia was experienced as a violation of their foundational identity, not just their privacy.

When tribal members discovered the scope of the research in 2003, they sued ASU. The case settled in 2010: ASU paid $700,000 in damages, returned the blood samples, and provided additional community benefits.

What the Havasupai case established:

Consent is specific to purpose: broad or ambiguous consent to "research" does not authorize every subsequent use
For indigenous communities, genomic research intersects with cultural identity, sovereignty, and self-determination in ways that standard bioethics frameworks don't address
The power differential between academic researchers and marginalized communities creates structural conditions for consent violations
Once data leaves a community's control, the community has essentially no ability to limit its subsequent use

The Havasupai case remains the most cited example in discussions of research ethics with indigenous populations and continues to shape IRB (Institutional Review Board) practice for studies involving indigenous communities.

---

Indigenous data sovereignty

The Havasupai case was not an isolated incident. Across the world, indigenous communities have experienced genomic research that served outside interests with little benefit to or control by the communities themselves.

OCAP® principles (developed by the First Nations Information Governance Centre in Canada) articulate an alternative framework: Ownership, Control, Access, and Possession of data by indigenous communities. These principles hold that a community's data belongs to that community — not to the researchers who collected it, not to the institution that funded the research.

The Global Indigenous Data Alliance and the CARE Principles (Collective Benefit, Authority to Control, Responsibility, Ethics) provide an international framework for indigenous data governance that is increasingly referenced in research policy.

Practical implications:

Research on indigenous populations should require community-level consent, not just individual consent — because the community as a whole has interests in data about its members
Data should be stored in ways that give the community control over future access
Benefit sharing — ensuring research produces tangible benefits for the community, not just for researchers and the pharmaceutical industry — should be a condition of research participation
Indigenous researchers and community members should be involved in study design, conduct, and interpretation

The NIH All of Us Research Program and the Global Biobank Meta-analysis Initiative have incorporated indigenous data sovereignty principles to varying degrees — but implementation is uneven and contested.

---

Who owns your genome — legally

The legal answer to "who owns your genome" is surprisingly murky.

Your physical DNA: it may seem obvious that you own your own biological material, but the law is unsettled, and courts have been reluctant to treat tissue that has left the body as the patient's property. The landmark case Moore v. Regents of the University of California (1990) held that a patient whose cells were used to create a commercially valuable cell line (without his knowledge or consent) did not have a property claim to the profits derived from those cells, only a claim for lack of informed consent.

You may not own your genomic data. Once your genome is sequenced and the data exists in digital form, the legal status is ambiguous. Database providers typically assert ownership of the compiled database (as a creative work) but not of the underlying data. However, the terms of service of most DTC companies grant them broad licenses to use your data in ways that effectively give them control over it.

The GDPR (EU) framework is the strongest existing legal protection: it classifies genetic data as a special category of sensitive personal data, requires explicit consent for processing, grants individuals rights of access and deletion, and applies to any company that processes data of EU residents regardless of where the company is based. The 23andMe bankruptcy tested this framework: EU data protection authorities asserted that genomic data of EU residents could not be freely transferred to a bankruptcy purchaser without adequate privacy protections.

Proposed US frameworks — including a federal privacy law and specific genetic data protections — have been debated in Congress without passing. Several states (California, Texas, Illinois) have stronger genetic privacy laws than federal law, but the landscape is fragmented.

---

The governance gap — where regulation hasn't kept up

Genomic technology has moved faster than regulatory and legal frameworks in almost every dimension.

The FDA and genomic tests: The FDA regulates in vitro diagnostics (IVDs), the clinical tests performed in laboratories. But it has historically exempted laboratory-developed tests (LDTs) from premarket review, creating a significant gap: a genomic test developed and performed within a single laboratory can be offered clinically without FDA review of its accuracy or clinical validity. The FDA issued a final rule in 2024 to phase out the LDT exemption, but a federal district court vacated that rule in March 2025, and the question of who oversees LDTs remains contested.

Polygenic scores in clinical practice: No federal framework exists for evaluating when a polygenic score is clinically valid or ready for clinical use. Individual health systems are making these decisions independently. The result is substantial variation: a patient at one hospital may receive a polygenic score for cardiovascular risk that shapes their treatment; a patient at another hospital has no access to the same information.

AI and genomics: Machine learning models are increasingly used to interpret genomic data — predicting variant pathogenicity, identifying regulatory elements, generating polygenic scores. These models encode biases from their training data (Module 4's reference genome problem resurfaces here) and can fail silently in ways clinicians don't detect. No specific regulatory framework governs AI-based genomic interpretation.

Global coordination: Genomic data and the companies that hold it are global; privacy and research ethics frameworks are national. A company incorporated in Bermuda holding genomic data collected in the United States from participants of Nigerian ancestry, used for research in collaboration with a Chinese pharmaceutical company, is not clearly subject to any single jurisdiction's rules. International frameworks — the Global Alliance for Genomics and Health (GA4GH), the OECD Guidelines on Bioethics — provide soft guidance but lack enforcement mechanisms.

---

Check yourself

1. A participant in a biobank study consented under a broad consent model to "any future health-related research." Ten years later, their de-identified genomic data is used in a study of genetic predictors of criminality. The participant objects when they learn about the study. Does their broad consent cover this use? What ethical principles does this case implicate, beyond the legal question?

2. A person tests positive for a BRCA1 pathogenic variant through 23andMe. They tell their sister, who refuses to get tested. The sister is subsequently diagnosed with advanced breast cancer. The first person asks: did I have an ethical obligation to tell my sister? Did 23andMe have an obligation to notify her? What framework do you use to think about this?

3. A Native American tribe agrees to participate in a pharmacogenomics study with a university. The consent form covers use of samples for pharmacogenomic research. During analysis, researchers identify a variant highly associated with alcohol use disorder in the tribal population. They want to publish. The tribe objects. Evaluate this situation using both the OCAP principles and standard bioethics frameworks. Where do they conflict?

4. A genomics startup proposes to sequence 10,000 children ages 5–12 for a longitudinal study. Parents provide consent. The study will store whole-genome sequences indefinitely and use them for any approved future research. Identify three specific ethical issues this study design raises that are distinct from adult research, and propose a design modification for each.

---

Key facts to remember

Genomic data is permanent, identifying, familial, and predictive, making it categorically more sensitive than other medical data
Broad consent is the dominant research model: flexible but poorly informed; contested in ethics literature
ACMG maintains a list of genes (81 as of version 3.2, 2023) for which secondary findings should be returned; right not to know is recognized in consent practice
GINA (2008): prohibits health insurer and employer genetic discrimination; does NOT cover life, disability, or long-term care insurance
23andMe bankruptcy (2025): genomic data of 15 million customers in contested legal limbo; followed a 2023 breach affecting about 6.9M users; California AG urged customers to delete their data
Investigative genetic genealogy: Golden State Killer identified via GEDmatch; familial reach means your relatives' privacy depends on your decision to upload
Havasupai case (settled 2010): consent specific to purpose; $700K settlement; blood samples returned; foundational research ethics case for indigenous populations
OCAP principles and CARE principles: indigenous data sovereignty frameworks; community-level consent; benefit sharing required
Moore v. Regents (1990): no property claim to profits from your cells, but claim for lack of informed consent
GDPR: strongest existing legal framework for genomic data; special category; deletion rights; applies to any company processing EU residents' data, wherever it is based
LDT exemption: genomic tests offered as lab-developed tests have historically been exempt from FDA premarket review; a 2024 FDA rule to phase this out was vacated by a federal court in 2025

---

Primary sources & references

McGuire, A. L. & Beskow, L. M. (2010). "Informed consent in genomics and genetic research." Annual Review of Genomics and Human Genetics, 11, 361–381.
Sweeney, L. et al. (2013). "Identifying participants in the Personal Genome Project by name." Harvard Data Privacy Lab Working Paper.
Erlich, Y. et al. (2018). "Identity inference of genomic data using long-range familial searches." Science, 362, 690–694.
Garrison, N. A. et al. (2019). "Genomic data sovereignty and the Havasupai case." Nature Reviews Genetics, 20, 256–263.
Carroll, S. R. et al. (2020). "The CARE Principles for Indigenous Data Governance." Data Science Journal, 19, 43.
Green, R. C. et al. (2013). "ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing." Genetics in Medicine, 15, 565–574.
Rothstein, M. A. (2008). "GINA, the ADA, and genetic discrimination in employment." Journal of Law, Medicine & Ethics, 36, 837–840.