All These Mutant Virus Strains Need New Code Names

GISAID started in 2008, after researchers around the world expressed some reticence at putting sequence data from their surveillance of bird flu into public domain databases. Under-resourced scientists didn’t want to drop a new sequence but then get scooped on the analysis by some other researcher with a zillion-dollar lab. And as GISAID got more and more data, the people who ran it had to come up with a way to identify each sequence and put them all into context with one another. Now it’s the main data repository for SARS-CoV-2 genomes.

But the world of Covid nomenclature has two more great and noble houses. Nextstrain, based at the Fred Hutchinson Cancer Research Center and the University of Basel, is one. Its organization revolves around clades, big branches on the phylogenetic tree of life. (Nextstrain started out doing the same job for influenza.) Its names have a cheat code: clades are organized by the year they’re discovered and a letter of the alphabet, and then according to specific mutations of interest. The de Oliveira team’s variant had a bunch of mutations, but the N501Y was important. (The mutation changes an asparagine, abbreviated with the letter N, to a tyrosine, abbreviated with a Y, at the 501st amino acid on the virus’s spike protein, in the receptor binding domain, or RBD, that attaches to the human ACE2 receptor. ACE2 stands for angiotensin-converting enzyme 2.)
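For readers who think in code, the N501Y shorthand can be unpacked mechanically. The sketch below is only an illustration of the notation described above, not anything Nextstrain ships; the names `Substitution`, `parse_substitution`, and `describe` are made up for this example, and it handles only simple one-letter substitutions.

```python
import re
from typing import NamedTuple

# Standard one-letter amino acid codes.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "E": "glutamic acid", "Q": "glutamine", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}


class Substitution(NamedTuple):
    """Hypothetical container for one amino acid substitution."""
    original: str      # one-letter code before the change
    position: int      # 1-based position in the protein
    replacement: str   # one-letter code after the change


# One letter, one or more digits, one letter: e.g. N501Y.
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'N501Y' into its three parts."""
    match = _PATTERN.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    original, position, replacement = match.groups()
    if original not in AMINO_ACIDS or replacement not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return Substitution(original, int(position), replacement)


def describe(label: str) -> str:
    """Spell the label out in words."""
    sub = parse_substitution(label)
    return (f"{AMINO_ACIDS[sub.original]} to "
            f"{AMINO_ACIDS[sub.replacement]} at position {sub.position}")


print(describe("N501Y"))
```

Run as written, this spells out N501Y as an asparagine-to-tyrosine change at position 501, which is the same reading given in the parenthetical above.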

Easy, right? (Ahem.) But then things got even more complicated. The one the UK researchers were seeing had the same mutation, among many others. To distinguish it from de Oliveira’s, each got a new designation: “V1” was appended to the one from the UK and “V2” to the other. Another similar variant that traced back to Manaus, in Brazil, came to be “V3.”

“We’re not trying to name everything. In fact, we’re really explicitly trying not to have more than 10 or 20 names a year, and we’re interested in picking out the most important things,” Hodcroft says. “That’s, like, big changes in the tree. When we see groups that are different in their genetics and they spread, even if it takes a while, in a region or around the world, we give those a Nextstrain clade.”

That’s not what the other bigwig in the space does, though. It’s analytical software called Pangolin—“Phylogenetic Assignment of Named Global Outbreak LINeages.” So-called Pango lineages start with a letter, initially A or B, designating the first two diverging SARS-CoV-2 sequences that emerged from China in late 2019 and early 2020. Each generation gets a number, and its descendants get an additional number, preceded by a period—but only for three generations. Four or more, and the whole lineage gets assigned to a new letter. Imagine an Obed-begat-Jesse-and-Jesse-begat-David vibe, but with diagrams and genomic receipts. “Lineages are operating on a different resolution. You can have very big ones and small ones, but the idea is to capture the emerging edge of the pandemic,” says Áine O’Toole, an evolutionary biologist at the University of Edinburgh who created Pangolin and is now one of its main developers. “The idea is to have a cluster of sequences that is linked to some sort of epidemiological piece of information.”
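The naming rule described above (a letter, then up to three dot-separated numbers, then a fresh letter) is compact enough to sketch in a few lines. This is a loose illustration of the rule as the article states it, not the Pangolin software itself; the function names and the `MAX_NUMERIC_LEVELS` constant are invented here, and the new letter is simply handed in by the caller, because which letter gets used is a decision made by people, not by this code.

```python
from typing import List, Optional, Tuple

# Assumption taken from the description above: at most three numbers
# follow the letter before a new letter takes over.
MAX_NUMERIC_LEVELS = 3


def parse_lineage(name: str) -> Tuple[str, List[int]]:
    """Split a name like 'B.1.1.7' into ('B', [1, 1, 7])."""
    prefix, *parts = name.strip().split(".")
    if not (prefix.isalpha() and prefix.isupper()):
        raise ValueError(f"lineage must start with capital letters: {name!r}")
    if len(parts) > MAX_NUMERIC_LEVELS:
        raise ValueError(f"too many numeric levels: {name!r}")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"levels must be whole numbers: {name!r}")
    return prefix, [int(part) for part in parts]


def parent(name: str) -> Optional[str]:
    """Return the name one generation up, or None for a bare letter.

    Going above a bare letter would need a lookup table of what each
    letter stands for, which this sketch does not include.
    """
    prefix, numbers = parse_lineage(name)
    if not numbers:
        return None
    return ".".join([prefix] + [str(n) for n in numbers[:-1]])


def name_child(parent_name: str, index: int, next_letter: str) -> str:
    """Name a descendant, switching to a new letter when the parent
    already carries the maximum number of levels."""
    _, numbers = parse_lineage(parent_name)
    if len(numbers) < MAX_NUMERIC_LEVELS:
        return f"{parent_name}.{index}"
    return f"{next_letter}.{index}"


print(parent("B.1.1.7"))
print(name_child("B.1", 2, "Z"))
print(name_child("B.1.1.7", 1, "Z"))
```

The letter "Z" in the last two calls is a placeholder. In the second call it goes unused, because the parent still has room for another number; in the third, the parent is already three levels deep, so the child starts over under the new letter.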

(After publication, O’Toole emailed me to note that while she had created the Pangolin software, she didn’t come up with the Pango notation used in the nomenclature—that was a bigger team. It’s an important distinction that also proves my point about how hard it is to name things, including the people who name things.)

Pangolin has a tricky bit. Anyone working on a viral genome can use the software to try to figure out whether they have something new, and where it might fit with all the known lineages (with data pulled from GISAID, just as Nextstrain does). But making a final call on whether a strain is indeed new, and deserves a different spot in the heuristic—its Pango lineage—is up to actual living people on the team and suggestions from scientists in the field. “I think maybe it’s something we need to work harder on, to try to convey there’s a difference between lineage designation and lineage assignment,” O’Toole says. “When we designate lineages, that’s just based on what we know. If you’ve got a new lineage and we haven’t seen it, Pangolin won’t be able to assign it, because it can’t predict lineages that will arise in the future. So there is a lag.”
