– Okay, I think in the interest of time I’m gonna get started. Good morning everyone, my name is Vicki Ellingrod and I am the faculty lead for the education workgroup for Precision Health. Before we begin with our speaker today, I wanted to provide you some updates about some of the newer Precision Health activities that we have going on. In terms of education, we’ve recently introduced a Precision Health certificate program. It is one of the first in the nation that is focusing primarily on Precision Health. It is open not only to current graduate students across campus but also to professional students and master’s students who are graduate students. If you are interested in being part of this certificate program, we are now enrolling students for fall, and I encourage you to visit our website if you would like to learn more. Additionally, on Tuesday, June 30th, at noon, there is going to be an online information session that you can find the link for on the Precision Health webpage. So if this is something that’s of interest to you, please go ahead and register for that information session to learn more. In terms of Precision Health activities for facilitating research, one of our goals is to really build the infrastructure to enable interdisciplinary research with various tools and resources across campus. While one of these resources is going to be discussed shortly by our speaker today, I also want to make sure that you know that we have a new analytics platform to help researchers from all disciplines access and use Precision Health data. More information, as with our certificate program, is available on the Precision Health webpage. Precision Health also realizes the importance of not only scientific discovery related to the individualization of healthcare resources but also translating those discoveries into practice. Therefore our implementation workgroup has been doing just that. And some of the most recent findings of this work are also 
being highlighted on the Precision Health webpage. And lastly, one of the key roles that Precision Health has been playing on campus is the funding of research. Precision Health realizes the importance of encouraging research not only through educational events such as this one, but also with real research dollars. Therefore I’m happy to announce that nearly $6.5 million in funds has already been distributed for research, as well as for promising trainees here on campus. So with this information and these updates, I hope I’ve enticed you to learn more about Precision Health. I would also like to encourage you to become an official member of Precision Health, and more information about how to do that is available on the webpage. So if you go to precisionhealth.umich.edu, there’s a wealth of information there waiting for you to learn more about how you can become involved, potentially apply for some of the research funds, or take advantage of our educational programs. I also invite you to save the date for our symposium taking place on Wednesday, September 23rd. This year we will be having a virtual event; however, we’ll be having several prominent national speakers, and we’ll have a chance to interact with other community members within Michigan and beyond through a virtual, or e-poster, session. So please watch your email for more information about this event. I really think this is going to be a very different event than you’ve ever been a part of, not only focusing on the science of Precision Health, but also the impact on the communities that we all work with and serve. So with that, I want to introduce our speaker for today. We are excited to host this workshop on a topic that has become very important to everyone very quickly, and that is COVID-19. As more research is being done across campus on this topic and related topics, you’re here today to learn about a valuable resource available to support COVID research

using Precision Health’s Michigan Genomics Initiative, or MGI, cohort of nearly 80,000 participants. Our speaker today is Erin O’Brien Kaleba, Director of the Research Data Warehouse and DataDirect, as well as the Data Office for Clinical and Translational Research. She and her team partner with investigators and academic departments to translate research data needs into technical solutions with an experienced team of database programmers, and she particularly has expertise in accessing the MGI data. So how this is gonna work: Erin is going to speak, and I’m gonna be moderating the chat function. So if you have questions, please feel free to type them in the chat function. And then around 12:45 or so, we’ll be asking some of those questions. And with that, I want to thank Erin for agreeing to present today, and we’ll hand it over to you.
– Thanks Vicki so much. You can see Vicki has my name under her photo (Vicki laughs) so she’s gonna take the hard questions and I’ll take the softballs. So yeah, thank you so much to Precision Health, to Tina for all the coordination, and to Vicki and to Rachel Dawson for your leadership. Also thank you to my team, who does all the work that I’ll shamelessly take credit for in this presentation. But I’d like you to think about this talk today as just the start of a conversation about what we have built to date for your use and what we will be building next, based on what you tell us is the most critical data type, or solution, or integration that you need to be successful. So, you know, the alternate title to this really could be: why is the Precision Health platform ideal for studying this disease? What is it about COVID-19? 
Its rapid uptick, its pervasiveness, that makes the Precision Health platform, with all its resources and its experts, the ideal place to centralize a lot of what we’re making available? Another alternate title for this could have been: if you remember nothing else from this talk, remember that there are people and there are tools that can serve you and help you remember what was said today. So let’s start with a little bit of background. So the world seems stumped by this pandemic, the ferocity of it, the unpredictability of it. I mean, the epidemic itself was predictable. We saw this coming; we, the scientific community of the world, knew there was one of these big pandemics coming. Yet the course of the disease, who’s most impacted? That is largely unpredictable and that needs to be studied. That is ripe for predictive models, for looking at all the data and information we have and making tools to predict who’s next, what’s next? So although we’re stumped by this one, pandemics are not new. Black Death, cholera, yellow fever, even up until the 1980s, HIV and AIDS. And we got lovely images such as these plague physicians, and painters back in the day would spend their time doing these just disastrously ugly paintings of plagues and the impact on society. This particular one actually shows not just peasants sick but kings and people in the upper crust, showing that plagues are nondiscriminatory when it comes to who gets sick. So the reason this is so important to be studied, and why the Precision Health Initiative is ideal, is that we know the immune system attacks the virus; that’s what gets you the inflammation response and gets you a fever. But in some cases the immune system goes berserk. I don’t know if that’s a clinical term, but it resonates with us. It does more damage than the actual virus is doing; it causes something called cytokine storms. And what we know is that this can cause the blood vessels to leak, and the blood pressure will absolutely plummet. The immune cells will 
attack the lungs that they’re supposed to be protecting. But what needs to be studied is who is most at risk for this extreme reaction? Whose immune system will behave in a predictable way and get rid of the virus? And who is going to have an immune response that is absolutely more damaging? Are there social determinants such that we could say, this particular group needs heavier prevention because they’re more at risk? Are there genetic factors that suggest one group versus another is more prone

to this type of extreme response? Are there other things we haven’t even thought about yet, beyond clinical and social and genetic, that are really the keys for us to be studying? And you all may know what those are, and we’d love to hear what you’re finding in your labs, what you’re finding in your social research, that we should make available, because it’s not going to be one factor, it’s going to be some mix of factors. But this is why I say that the Precision Health platform is ideal to be studying this from all these different angles. So today I’m gonna talk about: what is the Precision Health Initiative and what is the Data Office? When should I seek their services and their tools? How is COVID-19 being defined globally versus for clinical versus for research? And life a few months from now, what should we expect, other than me getting a haircut and long overdue highlights? So the thing that the Precision Health Initiative is really doing is taking information about the patient, reams and reams and hundreds of millions of documents and vital signs and surveys, and trying to put those into a place where you can easily access them, where we do not expose the patient’s sensitive information to risk or loss, and where the patient can be part of that conversation about what we can and should study about them. So this is really amazing. If you look at most academic institutions, there’s something that’s like the Precision Health Initiative. I think Michigan’s is a little bit unique though. You’ll see institutions call it the Personalized Medicine Initiative, or the Precision Medicine Initiative. U of M was intentional about calling it Precision Health. And I think that definition on the left-hand side of the screen says it: it’s really studying populations with the intent to drill down to actionable decisions for the individual person. And really the pursuit of wellness. So we’re not just studying the genetic and social factors for sickness and disease, we’re studying why 
some people stay well. What is it about them, their lifestyle, their genes, that makes them stay well? So that’s one thing that sets University of Michigan apart, this focus on health and wellness as well as disease. A second thing is that we have 19 schools and colleges at U of M who are all part of the Precision Health Initiative, many of whom have contributed dollars and faculty and expertise. This is a cross-campus initiative; the provost, the Deans of Engineering, and Public Health, and Pharmacy, and Medicine, and so many others have said, this is important, we’re gonna actually put money on the table. And the Precision Health Initiative is not something that’s so brand new that we’re starting from scratch. It’s an accelerator and incubator. It’s to take all the pockets of innovation that were already out there and glue them together and say to a social work researcher, and a Ross School of Business researcher, and someone in Medicine: you guys are all making advances. Your contributions to discovery and to translation will be stronger if we work together. So that’s the Precision Health Initiative. Lots more information on that if you’re interested. Let’s talk specifically about the Data Office. The Data Office has gone through a lot of changes, or an identity crisis. I joke that it used to be just a Jeremy. It felt like back when I was hired, every department or every school had a Jeremy. And I’d say, how do you get data? How do your researchers get data to answer their questions? 
They’d be like, “We got a guy named Jeremy who pulls it for us.” So Jeremy became an Honest Broker Office. And that title was really to reflect the fact that this office would be the go-between for the patient and the researcher, and would provide that sensitive information. The honest broker kind of paired together with the Research Data Warehouse and DataDirect to become the Data Office for Clinical and Translational Research; DOCTR is the acronym for that. So it’s the people who get you the data you need, but not more than the data you need. I’ve listed out here the types of data. There are many, many more. But I find sometimes it’s helpful to hear some examples so that you can know, is this something we can help you with? Or can we help you get in touch with a different group? So at its most basic, it’s cohort discovery. So how many patients with COVID-19 transferred from outside hospitals? How many patients with lupus have had a knee replacement? So these how-manys: a lot of times,

funders will have a funding opportunity and we want to be responsive, but we don’t even know how many we see here at Michigan; you go with a gut feel of, I think there’s about five patients a week that I see with this condition. We also serve the needs for retrospective observational studies. So, for example, study the difference in outcomes based on the length of time someone was on a ventilator. And then those retrospective observational results can really inform a prospective cohort design and help with recruitment. So once I’ve identified my eligible population, I want to recruit them in the most efficient way possible. For those who are eligible, when’s their next appointment? When are they gonna come in to see their rheumatologist, so I can maybe have someone at the Domino’s Farms rheumatology clinic ready? So those are reports our tools can print out for you on a daily or a weekly basis: when eligible patients are due for their next appointment with us. Precision Health, so many possibilities here that I’m gonna talk about in the next set of slides. But this is the clinical data I’ve been describing plus a whole extra suite of data options. So how many patients with underlying atrial fibrillation have genetic and environmental data that I can analyze together with their medication list, let’s say? And finally, the growth of network-funded research or multicenter research, which is kind of how the NIH wants to fund research nowadays: not just, here Michigan, here Duke, we’re gonna fund you to do this, but we need you to work across health systems so that we get a bigger bang for the buck. So how can we make U of M data look like everyone else’s data so we can more nimbly share it? How can we map it? 
Even if everyone has the Epic medical record, the implementations are so unique that you really do have to map to common data standards in order to efficiently be able to share and link with other health systems. So those are the types of data needs. Obviously there are many more, but then each of those data needs comes in a lot of flavors. And so as Vicki mentioned, we have the technical set of skills in the Precision Health and Data Office, and we also have a regulatory set of skills that will kind of tease out each of these unique nuances. So here are the variations of those requests. Could you pull a limited data set for me? I can’t have direct identifiers. Could you pull data for my fellow? Could I share it with my visiting med students? Could you give me all the data so I can build machine learning algorithms? Can I store this dataset on the cloud? Can I link this data with genetic data, with geolocation data, with patient survey data that I have? So each individual data need, just like research itself, is a unique question; it has unique nuances. We tease out: where’s the best place to store it? Who can have access? How anonymized do the data need to be? And so these are some of the things that our office does. So it’s really about getting data from the patient as the source, and putting it through the lens of what the IRB said was most ethical in the conduct of the research with those data. And the compliance part about privacy and security: where can you store it? Can you invite others into your environment to analyze it? Can you reach back out to that patient? 
So this is the lens. So we’re not the IRB, we’re not compliance, but we use the guidance from those two pillar organizations to look at each of the researchers’ requests and get them the data that they need. So let’s talk specifically about COVID-19 data. I feel like these slides will be stale in about 10 minutes, because this virus and this pandemic are so rapidly growing that we are trying to keep up and iterate in a way that you the researcher, or you the research coordinator, know what you’re looking at as opposed to what you pulled a month ago. So we’re trying to establish consistencies, but I recognize that it’s very hard as we learn more and more. So in March, when the U.S. and Southeast Michigan really began to experience the full impact of the pandemic, there was a flurry of activity, and I’m blown away, I’m so impressed by how quickly committees were stood up. The data, what did they call ’em? I’m not gonna think about it, like command centers, were stood up to learn about this, to help those primarily at the bedside. Those are the front lines: do we have enough ventilators? Do we need to expand the number of floors that were step-down floors and now need to be full ICUs? They needed the data first. So research, as important as we think it is,

as you think it is as a researcher, it’s not the first in line for data, right? So the data for research rarely needs to be real time. Whereas for care delivery and operations, those data for COVID-19, starting in March, were high urgency; they had to be real time. We had to know when someone was going to deteriorate, if we could even predict that. They had to be absolutely standardly defined; it had to be done in a really consistent way so we could see our trends. And the integration with other types of data was there but it was minimal. So how is research different? When we got in line in March and said, “Oh, we need to start studying this, we need to get information out to the research community so that this can be studied,” we definitely recognized we needed to come behind that front line. The urgency for research usually, I’ll say usually, because COVID-19 has changed everything, but usually is medium to low urgency. The freshness can be a day old, a week old; it doesn’t have to be real time. You’re not treating a patient right in front of you, you’re studying. And so if it’s a couple days stale, that’s okay. What you’re asking is usually novel; it’s not like a quality improvement measure where you’re measuring the same thing. In research you’re asking a question that we don’t know the answer to, and so there’s a lot of newness to it. And the importance of integrating with other types of data, looking at long-term outcomes, looking at patient experience and patient perception, looking at wearables and biomarkers and all that, it’s incredibly important for research. So my favorite Martin Luther King quote talks about the fierce urgency of now. Certainly, it applies to so much going on right now, but certainly COVID-19; you could say, you know, a time for vigorous and positive action. And so I think about research being behind the front-line needs but not far behind them. This pandemic and this urgency is greater than we’ve seen with other types of research. So what have we tried to make 
available for you? I’m gonna go through a couple of data types, how you get them, and what you need to have in place to obtain these data. So I’m gonna talk about self-serve tools, things you can do on your own today, from your own desk. Hopefully with lawn mowers less loud than the ones outside my window now; my neighbors mow at all hours of the day. What a custom extract means: if the self-serve tools are not meeting your needs because your data needs are so complex, we can help you out with a custom data extract. What biospecimens, or human tissue, or viral tissue is available for your use in research, and then what genetic information derived from those biospecimens is available to you. So let’s start with structured clinical data. By discrete data elements or structured data elements, think about data collected in the course of someone’s care that’s either picked from a drop-down menu or radio buttons. So these are things like labs and medications and diagnoses. And so the DataDirect tool and the EMERSE tool are the two self-serve tools. DataDirect is really the tool for the structured data. So this is access to clinical data that’s a day old or a week old, that you can either do cohort discovery with, no IRB needed, or you can obtain row-level patient data if you have IRB approval. So what we did for COVID-19: a lot of people were asking us, how’s it being defined? Do I have to put in every single lab that is available to test for COVID-19 and then look for a positive or a negative? What if someone got tested elsewhere and transferred into U of M, how do I find them? 
And so we thought it would be helpful to build what we’re calling a starting population. So I think of this as that baking show where they put something into the oven and immediately pull it out fully baked. So this is a pre-baked definition, or a computed phenotype, for COVID-19 that’s available to you today. And the research definition is patients at Michigan Medicine with presumptive positive or positive COVID-19 as a result of one of those six lab tests that you see below. Or they have a diagnosis: the ICD-10 code U07.1 or .2 is associated with positive COVID-19 even if you don’t have evidence of the test. And so this is not really readable

but it’s to show you our logic of how we built this, what sits behind that starting population, so that when you go in and start your research on COVID-19, and you grab the starting population that updates daily, that’s fresh as of yesterday, you can in your paper or in your methods say here’s what’s included, here’s what’s not included. So how is this research definition different than what our colleagues on the front line are studying? So in MiChart, in the EMR, the electronic medical record, there’s a COVID-19 definition. And that, again, it’s confirmed only, it’s inpatient and discharged. This is our definition, sorry. It’s confirmed and it’s inpatient and discharged. The MiChart one, one of the MiChart definitions, is only current occupancy and currently admitted patients with COVID-19. Once they’re discharged they’re not part of that number, because they’re no longer needed for calculating resources or supplies. Ours is inpatient, discharged, or came here for a test and were sent home because their oxygen saturation level was okay even though they still had a fever. And then, in the medical record, there’s something called the active infection record, which appears on a patient’s chart and tells you what type of droplet precautions you the care provider need to consider. That in MiChart goes into the definition of COVID. That alone does not go into our definition; we need either a test or a diagnosis code. So just to know, if you see a number in the electronic record and it’s different than ours, that might explain it. So in DataDirect, once you log in and you create a new query, you can either start with a denominator of all patients who have ever had a medical record number at Michigan, which is over 4 million, or you could start with the population of anyone who tested positive for COVID-19. And so for level one users, it’s datadirect.precisionhealth.umich.edu; for level two users, it’s datadirectmed@umich.edu. The beauty of Precision Health is that you don’t have to be employed 
by Michigan Medicine to be able to access health research data for the important studies you do. And so we hope eventually there won’t be two versions of DataDirect; it will be just one version, and your credentials and your levels of approval will all be authenticated on the back end, and the user experience will just be one tool. Right now it’s two different tools but with the same data. You can still get that starting population of COVID-19 patients in either tool. So let me know if you have any trouble with that. So here’s some of the structured data available. Patient demographics, dates of admission, dates of primary care visits, diagnosis, whether it was on your problem list or whether it was the billed diagnosis, the reason you’re being seen here today. All surgeries and procedures are in there, medications that were ordered, medications that were administered inpatient, labs, and then I’ll talk a little bit more about the Central Biorepository. That’s a core part of the Precision Health Initiative, in which patients consent to have their data, their biospecimens, their genetic information accessed for future research. And that’s in Vicky Blanc’s Central Biorepository. Those are structured data; now let’s talk about unstructured data. I think at the same time that I’m giving this town hall, David Hanauer is doing a training for EMERSE. EMERSE is a tool that has more than 100 million clinical documents. So think about pathology reports, discharge summaries, physician notes. David says that 80% of important information about the patient is actually in free text, not in a structured field that’s easy to query. So he’s developed this tool that will actually take you right to language in a patient’s chart that has a string of text that you’re interested in, so you don’t have to comb through the whole chart on the front end. Remdesivir is the antiviral drug that’s getting a lot of attention, with some clinical trials being done on it. I was quickly able to say, okay, 176 unique patients since 
January 1st, 2020 have had the term remdesivir mentioned in their notes. For some of them it may say, like, patient declined, or patient started on it and had an adverse reaction,

something like that. It will tell you how many patients, and it’ll bring you right to their chart. You do need IRB approval for this because it is not de-identified; it’s actual views into the patient’s notes. So now we’ve done clinical data: structured, unstructured, free text. Now let’s talk about what tissue and biospecimens are available to you. So for specimens related to COVID-19, we have quickly growing repositories of plasma, serum, and nasal swab tissue. This is a little bit hard to read but again, we can help walk you through this. There are several requirements for obtaining clinical residuals, or blood left over from clinical diagnosing and clinical laboratory testing. So, the UMOR stood up a committee very quickly to help prioritize COVID-19 research. And it wasn’t an approval body the way the IRB is, but it was a necessary step to prioritize, to say to the IRB, to say to Precision Health, to say to the Data Office: this is absolutely a high priority research study. Fast track it, get them the resources they need. Or: this is really important, but it’s a medium or low in terms of COVID-19. And this group would turn around prioritization within 24 to 48 hours. I don’t think they slept. They still need to receive any research ideas related to COVID-19. Their volume is a lot lower now than it was in March and April. But they just did a fantastic job at saying, what are you studying? Will you be depleting our resource of plasma? 
If so, it has to be, you know, of utmost scientific and translational value. So we can help walk you through that. But the DataDirect tool can tell you at least how many are available. So if I use my starting population of COVID-19, and then I want to also include a diagnosis of, let’s say, atrial fibrillation, and then I wanna say, “Okay, of these patients who met my criteria, how many of them have a specimen available?” Because I’m going to want to run some separate analyses on those, let’s say. And then as I mentioned, some of those specimens have been further genotyped or sequenced. And so the genetic data about patients with COVID-19 is available, not on all patients. But I’ll tell you a couple of things that we’re making available, and if you can, give me feedback on how you would like to see this information, how you would like to consume it, what else we should think about. So as part of the Michigan Genomics Initiative, Vicki mentioned there’s more than 80,000 patients at Michigan who have consented to have their DNA analyzed through a genome-wide association study chip that tells us about the variants in their genotypes. And then patients with a nasal swab with COVID have had the virus itself, the RNA, sequenced. That’s an increasing number that’s going to be publicly available, actually, the RNA detail itself. But to take that RNA about the virus and couple it with other types of data we have about those same patients, their geolocation data, their social determinants, their clinical course of care, that’s again an extreme advantage that the University of Michigan has over other groups who are doing this. So if you’re looking for COVID-19 positive patients who are also in MGI, you could then say, all right, I’m going to work with the research facilitators or the Data Office to obtain their genotype, their GWAS data. And then the viral RNA, again, is just COVID positive from a nasal swab. And that’s going to be available soon. So if the 
self-serve tools are inadequate to answer your research question, because it’s highly complex and the variables all have a relation to each other that’s highly complex, we have a set of SQL analysts, they’re database analysts, who can pull custom data for you, according to your specifications, from multiple systems. They can link it to other types of data and they can deliver it to your secure server, wherever you’re going to do your analysis. And so we’ve been working together,

we have had probably 10 to 20 requests for custom data extracts related to COVID-19. And part of our, you know, influx of requests is that when the basic science, you know, enterprise at Michigan had to close, and when human subjects clinical trials had to close, researchers and their teams turned to things they could do remotely, and data is one of those powerful things. Informatics, modeling, associations, those are all things you can do while wearing sweatpants at home. And so our business has been unbelievably busy and we are so grateful for that. Because you can still request data pulls, you can still play with DataDirect, you can still have consults with the research scientific facilitators, all of that from home. And so it’s helped the researchers bridge this time between, you know, closing of the enterprise and reopening of traditional research. And it’s also generated a lot of excitement in these researchers about what data are available that they had no idea about. And so we do a lot of consultations where people say, is this available? How many do you have with this? Can I link it with data I already have? 
So when you get to an actual data pull from one of our data analysts, it’s a $60 an hour recharge rate. A lot of the data requests are a one-time delivery; others are a scheduled or an automated delivery. So every Monday morning, a refresh of the data will automatically come to your secure analytic location. So let’s look ahead for the last couple minutes. And I want to tell you about new data since maybe you last used the tools or the services. And I want to hear from you what else we should be going after, what else we should be studying. So in the last couple months, we’ve been working a lot on patient surveys, or patient-reported outcomes, and we’ll talk more about that. We have information on clinical trial enrollment, so we’re working to say: I found these patients are eligible for my trial, how many of them are already involved in a CT and probably could not be part of my trial? Cancer staging, treatment, and outcomes: we’re working with the cancer tumor registry. That’s a really, really valuable source of data for anyone who was diagnosed or treated at Michigan with cancer. And it goes deep into their staging, their treatment, their outcomes, their biomarkers, things that in the regular medical record might be in a path report or might be in free text somewhere; this is actually in structured fields. These are registrars that our Cancer Center hires to really go in-depth and data-enter these cancer findings. The Michigan Death Index is up to date as of a couple months ago, and it has date of death and cause of death for anyone from Michigan Medicine who died in the state of Michigan. Additionally, we have access to, but not in the self-serve tools, the National Death Index. And so think of that as much broader; that’s a national scope. However, it’s only date of death; we don’t have cause of death from there. So researchers are telling me that they need some sort of hybrid information to get a more complete picture for outcomes research where you’re studying death. 
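As an illustration of the kind of hybrid death information described above, here is a minimal sketch in Python. It is not how the Data Office combines these sources; the record layout, the field names, and the `combine_death_sources` helper are all hypothetical, and it only assumes what was said in the talk: the state index carries date and cause of death, while the national index carries date of death only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Tuple


@dataclass
class DeathRecord:
    """Hypothetical combined record; field names are illustrative only."""
    patient_id: str
    date_of_death: date
    cause_of_death: Optional[str] = None  # only the state source carries cause
    source: str = "unknown"


def combine_death_sources(
    state_index: Dict[str, Tuple[date, str]],
    national_index: Dict[str, date],
) -> Dict[str, DeathRecord]:
    """Merge two death sources keyed by a (hypothetical) patient ID.

    National records (date only) are loaded first, then state records
    (date plus cause) overwrite them, so the richer record wins when a
    patient appears in both.
    """
    combined: Dict[str, DeathRecord] = {}
    for pid, death_date in national_index.items():
        combined[pid] = DeathRecord(pid, death_date, None, "national")
    for pid, (death_date, cause) in state_index.items():
        combined[pid] = DeathRecord(pid, death_date, cause, "state")
    return combined


# Made-up example data, for illustration only.
example_state = {"P001": (date(2020, 4, 1), "pneumonia")}
example_national = {"P001": date(2020, 4, 1), "P002": date(2020, 5, 2)}
example_combined = combine_death_sources(example_state, example_national)
```

In this sketch a patient who died out of state still gets a date of death from the national source, but their cause of death stays empty, which is the gap in the complete picture that researchers were describing.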
Natural language processing: we're working a lot with the learning health system on this, taking free text or strings of data, like presenting symptoms or chief complaint, and converting those into structured data that would be searchable like any other structured field.

And then we're taking U of M data and spending a great amount of time mapping it to some of these national data standards. So there's a PCORI network, the Pediatric Trial Network, and OMOP, a standard that's used in Precision Health activities. There's a common data model where it will break the data down into, let's say, 14 data tables and say all of your lab files need to have LOINC codes and results, you know, and all of your demographics have to be coded so that African American is one, Asian is two. And so when you map your data, it's the same data, but it's in a new format that makes you ready to collaborate with other institutions.

So we often get asked by our colleagues in engineering, "Is this big data?" And I never know how to answer. I say it's a lot of data; I don't know if it's big data. But I do know it's growing bigger by the day: it's millions of encounters from patients who receive their care at Michigan, it's millions of genetic variants, it's hundreds of surveys.

Right now what we know is it's not a very diverse population. And so Precision Health is putting in great effort to expand our reach and expand the opportunity for patients from different socioeconomic backgrounds, from different racial or ethnic backgrounds, to participate in MGI and to help us shape what we study, what we collect, and what's important to them. Because Michigan Medicine alone is not especially diverse: it's 95% English speaking, it's 85% white or Caucasian. It's also a very high education level, so we really do...

So it's big data, it's a lot of data, and it is getting bigger. It is taking the medical phenotype, the genetic genotype. Increasingly, family history is being asked about and coded in a way you can study. Behavioral and lifestyle: we have the Apple Watch study called MIPACT, where we're learning a lot about patients' blood pressure during their regular day, and surveys they respond to via their Apple Watch. What's their pain score when they're about to take their opioid? Are there any patterns we can learn from them?
Environmental factors that may contribute to someone's outcomes. Social factors that may be contributing to who does well and who doesn't. So to glue all these together is our goal for the Precision Health Initiative, and we're getting close.

Although we're getting a lot of data, we don't need to get data just for data's sake. We need to be data rich and information rich. There's this expression, DRIP, data rich, information poor: you sit on a pile of information but you can't make any sense of it, you can't obtain it in a way that could lead to a study result or a conclusion. And so we want to be data rich and information rich; that's our goal.

So the two surveys that I want to kind of close with are really exciting. Bhramar Mukherjee is leading an effort to do an epidemiology-based questionnaire called EPIQ. This is one long, detailed and phenomenal questionnaire, similar to what's being done in the U.K. with the U.K. Biobank. It goes over environmental factors, your amount of sleep, your smoking, anxiety, your family, your family's health. And it's really the type of information about a patient that is more predictive of outcomes than the data we collect in a 15-minute doctor's appointment. This is what leads to someone having healthy outcomes or not, and we've never collected it in this systematic a way. We're in a pilot phase; we have 60 surveys collected to date.

We also have a COVID-19 questionnaire that we run through Qualtrics, and we sent it to more than 50,000 patients with and without COVID-19. These are patients in our Michigan Genomics Initiative who have agreed to be recontacted. It asks about perceptions of COVID-19: How healthy is your family? What age group is your family? For those who were infected, what were the symptoms? Were you hospitalized? Were you sent home? How long did that fever last?
This will be discoverable in DataDirect. Right now we have, as of this morning, 8,032 surveys completed to date. Bhramar Mukherjee, one of our faculty, said that survey response rates have never been higher, because people are at home and they have the opportunity to fill out surveys. So one of the small benefits of this pandemic and the stay-home orders is that people have time to answer all of these questions, which is great.

Geolocation data: we've taken the street address of patients and mapped it to a latitude and longitude, and that latitude and longitude coincides with a census tract ID. From that census tract ID there are national data available about characteristics of a patient who lives on that block. It's not that individual patient's income and education level; it's data about that specific area. So it's much more granular than zip code data, it's much more actionable, it's much more related to health. And this is the first time we've had this available.

I'm just gonna go through these characteristics. A lot of them have been compiled by national groups into indices, so neighborhood affluence, neighborhood immigrant index. With some of them you can take advantage of what's already been pulled together; others just come to you as they are, like, for the census tract ID, what is the percent of those in a household with an education less than eighth grade and income less than this? And it gives you some context when you're studying outcomes of a given condition.

So I just want to finish with this. With all these data, you know, we do our best to de-identify, and we do our best to protect the patient who has so generously allowed us to use their information to study. But we can never say something's 100% de-identified. If I have a single data set with lab values, I could do a pretty good job de-identifying. But I just showed you all that information: information from their Apple Watch, information from their surveys, information about their genetics. When you combine all of those de-identified data sets together, patient identity becomes increasingly discoverable.

And so one of the main messages we really, really, really stress in Precision Health is: all these resources are available and we're so excited about them, but there's an additional onus on you, the user, and on us, the provider of the information, to make sure we're protecting these. Even if we can give you a de-identified data set, we make sure that you keep a strong password on it and keep it in a HIPAA-configured enclave, even if it doesn't have HIPAA identifiers. So that's the one thing underpinning all of the data I just got so excited telling you about: we really, really need to partner in keeping those data protected; otherwise, this will all go away.

So now I would love to hear from you all about what else we should be working on. What did I miss? What else is key to COVID, or to moving our Precision Health Initiative forward, that we haven't thought about or that I maybe didn't mention?
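The warning in the talk, that identity becomes more discoverable as de-identified data sets are combined, can be illustrated with a small sketch. It counts the size of the smallest group of records that share the same values on a chosen set of attributes (the idea behind k-anonymity). This is an illustration only, not the method Precision Health uses; the `smallest_group_size` function, the field names and every row are made up.

```python
from collections import Counter


def smallest_group_size(records, fields):
    """Size of the smallest group of records sharing the same values for `fields`."""
    groups = Counter(tuple(record[f] for f in fields) for record in records)
    return min(groups.values())


# Synthetic rows; every value and field name is invented for illustration.
records = [
    {"age_band": "40-49", "tract": "A", "survey_smoker": "no"},
    {"age_band": "40-49", "tract": "A", "survey_smoker": "yes"},
    {"age_band": "40-49", "tract": "B", "survey_smoker": "no"},
    {"age_band": "40-49", "tract": "B", "survey_smoker": "no"},
]

# One attribute alone leaves all four people indistinguishable.
print(smallest_group_size(records, ["age_band"]))                            # 4
# Adding attributes from other data sets shrinks the smallest group.
print(smallest_group_size(records, ["age_band", "tract"]))                   # 2
print(smallest_group_size(records, ["age_band", "tract", "survey_smoker"]))  # 1
```

A smallest group of size 1 means at least one person is unique on that combination of attributes, even though no single column is a direct identifier.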
– Thank you Erin, that was wonderful. I don't know how to virtually clap. (both laugh)

– We'll pretend the lawnmower in the background is applause, Vicki.

– Well, you are going to be able to hear my dog applaud in the living room.

– Oh good, good, okay.

– But anyway, we did have a couple of questions come in, which is great. So one of the questions has to do with the COVID genetic data availability. They're interested in getting it nanopore sequenced and would appreciate it if you could get the FAST5 files.

– Excellent. I would love to put you in touch with our colleagues in the Central Biorepository, but it might also be our colleagues in public health biostats who work with the raw data derived from the DNA. So Vicki, maybe I'll work with you just to get that individual's contact, or the individual can send me an email at the address on the screen, and I'll get you in touch with the right people for that. That's a great question.

– So there were also two other questions that I think kind of go along with each other. One is: is it possible to add patient age at encounter to the Precision Health version of DataDirect? Age is...

– Oh good.

– Age is associated with COVID-19 susceptibility, so it's important to include as a covariate. It is available in the medical school version of DataDirect but not the Precision Health version. And then the other question is regarding the patients that are in the COVID database: the patients included in the starting population are inpatient or discharged. Does that mean the population includes a higher proportion of severe COVID patients than the general population of people infected, since many people don't need to be in the hospital?
– Great question, great question. So first, patient age at encounter: that's absolutely doable in the Precision Health version of DataDirect. Because that DataDirect sits on top of a de-identified data set, within a given patient's record we took all of the dates and shifted them by the exact same amount, so that we can still do calculations between, say, started on this medication and got this reaction at this time, but they're not real dates, they've been shifted. And so age at encounter could fall in that same... So I'll make sure we can add that; thank you so much for that suggestion.

And then for patients who tested positive, I would say they're more severe if they made it to Michigan, maybe, if there was a community hospital that diagnosed them but maybe didn't have the full ICU and ventilators and other support staff. So in that way they are. But really, we don't even look at hospitalization as a criterion; we look at positive tests and diagnosis. You can study who was hospitalized and who was sent home. What I was trying to do with that slide is say that the starting population includes all of those sent home after they were diagnosed, which was a large percentage of the COVID positive. In March and April, when our numbers were highest, getting people out of the hospital was one of the biggest goals. If they had a high fever they could be sent home; if they had low oxygen saturation, they had to be admitted. And so the ED and others had to walk this fine line of who to hospitalize and who not. Whether you got sent home or you got admitted, those are all in the population. So we don't actually look at current location; we just look at: do we have evidence of a positive test, or did your record have a COVID diagnosis? So I probably made it sound like we skewed toward inpatient, but really, it's anyone with COVID-19 that we saw here at Michigan Medicine.

– Great. Another question: what is the best path forward to obtain more granular information about biospecimens? When I search in DataDirect, the specimen type I'm interested in does not appear for the COVID-19 cohort, i.e., nasal swabs, and what derivative forms of the sample are available?

– Great question, thank you. I hope next week we'll be exposing nasal swabs. Right now, that screenshot I showed of COVID-19, that's the blood and serum. The nasal swabs will hopefully be available next week. I think they're gonna be coming through the feed; we get a data feed from the Central Biorepository lab system, and it's a nightly feed. In the meantime, if you want to email me and just say, how many patients of this cohort have a nasal swab?
I could tell you that just through a manual pull, but it will be available for discovery. And if anyone wants to help us pilot test this, we're trying to think of the best way to say: these patients have specimens, here are the various types; these patients have genetic data, this is the virus RNA, this is the patient's DNA. So we're trying to figure out the best way to display all of that.

– So we also have a question about some of the NLP data. Could you repeat a little bit about how to access the NLP data through the self-service system? And specifically, what kind of information is included in those NLP data?

– Yeah, thank you for that. Right now we have about three use cases that we've tried out with NLP, and we're not sure how to display those in DataDirect, but we have them in coordination with the Node. And someone on the research data warehouse team, Heung Ju, also has done work with NLP. So again, it's one of those data types we have available and we can deliver to you through a custom... We don't know how to display it yet, self-serve. But if you have suggestions, you know, whoever asked the question, I would love to work with you.

– And that right now is the end of the Q&As or chats that came in. So if anyone else has another question, I can unmute you, provided I can figure out how to do that. I've not had to do that yet. I think you have to raise a hand somehow, and I'm not exactly sure how to do that. (laughs) I might need some help. I'm not seeing anything though. But if you tell me your name and that you have a question, I certainly can unmute you. I'm having a lot of appreciation for my daughter's college, which had a huge chat with thousands of parents, and what the president of the university had to undergo for that. So I'm sure I can figure it out with 40 people.

– Oh my gosh, I love it.

– I'm not seeing any more questions.

– Okay.

– I do have an email to share with you later, Erin. And once again I want to thank you. I think this was fantastic and very timely, and I certainly share your appreciation of how fast things moved and how Michigan was able to mobilize so quickly to do this in a structured fashion. It shows that we truly are the leader in this, and your team in particular being able to share this data in such a wide way is truly remarkable. I also want to thank Tina Crozier, who helped put on this event and dealt with a lot of the logistics. I think this is one of the first webinars that Precision Health has done. So thank you, Tina, for your organizational skills, and I want to thank everyone who came and attended this. The recording is going to be available on the Precision Health webpage, which again is precisionhealth.umich.edu. We will also email all the attendees to let them know when that is up. But again, I think this was really remarkable. So thank you so much.

– Thanks, Vicki.

– And we are concluded.

– All right.

– Thank you.

– Thank you.

– Bye-bye.