by Jason Koebler
March 3, 2014

from Motherboard Website


The next time your privacy is invaded might not involve text messages, web browsing histories, or hidden cameras... It could involve the very stuff that makes you you.

It's time to start worrying about keeping your genetic information private.

As genome sequencing becomes cheaper and as governments, researchers, doctors, and consumers find more reasons to sequence and store entire genomes, people are increasingly worrying about who will have access to them, and what they'll do with them.

Already, researchers can often determine whom an anonymous genome belongs to using publicly available genome databases and some basic Googling.

"We can infer the identity of individuals from their DNA by looking at the Y chromosomes, and in some cases, we can identify the surname of the person based on Internet searches," said Yaniv Erlich, a genetic researcher at MIT who is working on genome encryption.

In some circles, Yaniv Erlich is considered a "genome hacker" because most of his experiments involve reverse-engineering DNA to glean something about the individual.

Already, we can get a rough idea of what someone looks like from their genome, and, in a paper published last year in Science (Identifying Personal Genomes by Surname Inference), Erlich showed it's possible to pinpoint who a person is based solely on their Y chromosome and publicly available data.

"We report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases," the authors write.

"We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it entirely relies on free, publicly accessible Internet resources," he wrote.

Right now, some basic anonymization or "de-identification" of genomes is done at labs that regularly perform genetic research, according to Christopher Black of the New York Genome Center.

In scientific literature and databases, for instance, patient birth dates might be "shifted" and social security numbers and medical record numbers are deleted and exchanged for randomly-generated ID numbers. But genomes themselves aren't "encrypted."
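
A rough Python sketch of that kind of de-identification might look like the following. The field names (`ssn`, `mrn`, `birth_date`, `genome`), the 180-day shift window, and the `deidentify` function are all assumptions for illustration, not the New York Genome Center's actual procedure.

```python
import datetime
import random
import uuid


def deidentify(record, max_shift_days=180, rng=None):
    """Return a copy of `record` with direct identifiers removed.

    Hypothetical scheme: the SSN and medical record number are dropped and
    replaced by a random study ID, and the birth date is shifted by a random
    number of days. The genome itself is passed through unchanged.
    """
    rng = rng or random.Random()
    shift = datetime.timedelta(days=rng.randint(-max_shift_days, max_shift_days))
    return {
        "study_id": uuid.uuid4().hex,               # random ID, unrelated to SSN or MRN
        "birth_date": record["birth_date"] + shift,  # "shifted" date
        "genome": record["genome"],                  # NOT encrypted or altered
    }


patient = {
    "ssn": "000-00-0000",   # made-up placeholder values
    "mrn": "MRN-12345",
    "birth_date": datetime.date(1980, 5, 17),
    "genome": "ACGTTGCA",
}
released = deidentify(patient)
```

The sketch also shows the gap the article points to: the sequence leaves the lab in the clear, so anything that can be inferred from the DNA itself is still exposed.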

I asked Erlich why we should care - you can't change your genome, after all.

In that sense, genome privacy is inherently different from government snooping on things you can control, such as emails, your day-to-day life, your actions and your thoughts.

"If you live in a democratic society, then maybe you can make a distinction about things you don't have control over versus things you do, but you'd not have to go far back in history to see it in a different way," Erlich said.

"Let's say I'm an Ashkenazi Jew, well, 70 years ago, I would probably want my genetic information to be highly protected."

He's right.

And it won't take another megalomaniacal dictator hell-bent on ethnic cleansing to see how and why knowing someone's genome can and will be abused. Genomes can provide insight into a person's medical history or future.

A technique to reverse-engineer genomes to help police with sketches has already been patented, but if you take it one step further, genomes can be used to instantly identify someone if you're able to cross-check them against a national database.

"A lot of it depends on what happens to those genomes," White said.

"Are they kept in a national database? If I'm an insurance company, I can decide whether you're a bad risk or not based on your genome. There are police and forensic applications, too."

We aren't sequencing every baby's DNA yet, but it's not some futuristic pipe dream.

Scientists in the United Kingdom have already lobbied the government to begin sequencing babies' DNA at birth, a plan that the country's health secretary has already backed.

Meanwhile, the National Institutes of Health is spending $25 million to explore the promise - and ethical challenges - of genetic testing at birth in the United States.

Which brings us back to encryption. How do you take something that's 6 billion letters long and make it unreadable to an outsider, but not so scrambled that it's useless to researchers?

Erlich's idea is to use homomorphic encryption, a technique that allows data to be manipulated and analyzed in its encrypted form.

The encrypted output is then sent back to the data's owner, who decrypts it to read the result.

"Imagine you have a brick of gold and you want to make a necklace out of it. I'm a jeweler, but you don't trust me to not take the gold and run away," Erlich said.

"What you do instead is place the brick in a box, and you put all the equipment I need to make the necklace inside the box, and then you lock the box. From there, I make the necklace, give the box back to you, you pay me, then you open it and you get your necklace."

Genomic homomorphic encryption would work the same way. A genome would be encrypted, and all the tools needed to analyze it would be altered to work on this encrypted genome.

A scientist would analyze it for you and send back the still-encrypted results, which you'd then decrypt.

"As a researcher, when I look at the results myself, I see gibberish, cypher text. I have no idea what your risk for heart disease is," he said. "But then you have the key, you decrypt it, and you know."

The problem, at least for the moment, is that it takes a ridiculous amount of computing power to encrypt an entire genome, and all of the genetic analytical tools we've developed so far would have to be revamped to work with homomorphic encryption.

It's definitely not ready for widespread use, but at the moment, it's one of the most promising things we've got.

"It's not there yet," White said. "It's promising, but my personal opinion is that homomorphic encryption for genomic data is still years away from being practical."

Even once it's practical, it'll take lobbying and legislation to require the government to do it.

But at least someone is working on the technology to make genetic encryption a possibility.

"This is still in its infancy, but we're looking to the future," Erlich said.�"Some people say �I don't care about my genetic privacy.' I don't think that's going to stay the case."