The Chan Zuckerberg Initiative Foundation, a 501(c)(3) nonprofit private foundation ("CZIF," "we," "us," or "our"), provides the Chan Zuckerberg GEN EPI product ("Services" or "CZ GEN EPI") in close collaboration with the Chan Zuckerberg Biohub ("CZB") and the Chan Zuckerberg Initiative, LLC ("CZI LLC"). This Privacy Policy describes the types of information we collect or that is uploaded by CZ GEN EPI Users (collectively "Users" or "you", ex: registered public health officials at state and/or county-level Departments of Public Health ("DPH"), other public health researchers), and how we use, share, and protect that information.
CZ GEN EPI is a tool that uses pathogen genomic sequence data to help you infer how pathogens are moving through a population and how cases and outbreaks are related. In order to become a User of CZ GEN EPI, you must be acting in your organizational capacity, which means a couple of things: (1) your use of CZ GEN EPI may be subject to your organization’s policies and (2) upon sign-up, you’ll be placed into a group with other users from your organization.
Here’s how CZ GEN EPI processes and manages Upload Data: Users submit Raw Sequence Data (as described below) as well as information about those sequences, such as the date the sample was collected ("Sample Metadata" as further defined below -- Raw Sequence Data and Sample Metadata together make "Upload Data"). Any human genetic data contained within the Raw Sequence Data is filtered out and deleted following upload, leaving genomic data only about the pathogen. This pathogen genomic data is then analyzed in order to identify the normally-occurring genetic mutations that make up each pathogen sample’s unique genetic "barcode." This barcode can then be used to identify strains, variants, and relationships between samples. By default, these analytical outputs will be visible to the User that uploaded the Sample and other members of the User’s organization ("Group", ex: a Department of Public Health) using CZ GEN EPI.
Users can then choose to share analytical outputs outside their Group. We hope that this sharing of pathogen data will help to create a clearer picture of how pathogens are circulating in your community and thereby help advance public health goals.
To help you better understand our Privacy Policy, we’ve created the below Summary, which includes bullets regarding Key Things to Know, as well as a Table summarizing key aspects of our data practices. For more information about the rules governing your use of CZ GEN EPI, please also see our Terms of Use ("Terms"). Please remember that you are using CZ GEN EPI in your organizational capacity, which means that your organization’s policies will apply to your use.
Type of Data | What is it? | What’s it used for? | How is it shared? | Your Choices |
Data you upload to or create using CZ GEN EPI | ||||
Raw Sequence Data | Genetic sequence files (ex: FASTQ) uploaded by Users containing both host and pathogenic genomic data. | Upon upload, Raw Sequence Data is processed through our data pipeline and all human genetic information is filtered out and deleted. We use the remaining data, with Sample Metadata, to create the Pathogen Consensus Genome and to support your creation of further analytical results. | Raw Sequence Data is processed only to filter out host data. It is not available to anyone other than you. Other than as specifically requested by you, such as to debug an issue, staff working on CZ GEN EPI never access this data. | Users can request deletion of Raw Sequence Data, Sample Metadata, Pathogen Consensus Genomes, analytical outputs, or their CZ GEN EPI account data by contacting us at hello@czgenepi.org and we will fulfill the request within 60 days. Please be aware, however, that we cannot delete any Pathogen Consensus Genomes or analytical outputs which have been shared outside of CZ GEN EPI. |
Sample Metadata | Data about Samples annotated by Users (ex: sample collection date or location). | See above. | Sample Metadata is visible to other users in your Group, as well as third party entities that your Group is visible to. These entities can see your samples, but not your private, internal identifiers. This data is also accessible by technical partners (CZ Biohub and CZI, LLC) and Service Providers (ex: AWS) that help operate and secure CZ GEN EPI. For example, we need to be able to access your data in order to back up and maintain the database. This Privacy Policy applies to all parties that access data to support CZ GEN EPI and they will not use the data for any purpose beyond operating and securing CZ GEN EPI. We will never sell your data or share it with anyone that does. | |
Pathogen Consensus Genome | Data about the likely pathogen strains contained within the Raw Sequence Data. | See above. Users may also upload this data to CZ GEN EPI directly if they have assembled a pathogen consensus genome in a different program, but would like to analyze that genome in CZ GEN EPI. | Pathogen Consensus Genomes are visible to other users in your Group, as well as third party entities that your Group is visible to. Samples marked "private" will never be shared beyond your Group unless you choose to mark them "public" later on. | |
Analytical results | Analyses created by Users based on Pathogen Consensus Genomes (ex: phylogenetic trees). | Users use CZ GEN EPI to drive analytical results that they can then choose to share more broadly. | Analytical results you create are visible to other users in your Group, as well as third party entities that your Group is visible to. | |
Data CZ GEN EPI collects | ||||
User Data | Data about researchers with CZ GEN EPI accounts such as name, email, institution, basic information about how they are using CZ GEN EPI, and information provided for user support (ex: resolving support requests). | We use this data only to operate, secure, and improve the CZ GEN EPI services. | Basic CZ GEN EPI account information such as name and institution may be visible to other CZ GEN EPI Users. This data is also shared with technical partners (CZ Biohub and CZI, LLC) and Service Providers (ex: AWS) that help operate and secure CZ GEN EPI. This Privacy Policy applies to all parties that access data to support CZ GEN EPI and they will not use the data for any purpose beyond operating and securing CZ GEN EPI. We will never sell your data or share it with anyone that does. | Users can request deletion of their CZ GEN EPI account data by contacting us at hello@czgenepi.org and we will fulfill the request within 60 days. |
Device and Analytics Data | Device Data (ex: browser type and operating system) and Analytics Information (ex: links within CZ GEN EPI you click on and how often you log into CZ GEN EPI) include basic information about how Users and Visitors are interacting with CZ GEN EPI. | See above. | See above. | |
"Upload Data" is data that Users upload to CZ GEN EPI (other than the information which is provided during registration to create a User account). Upload Data consists of pathogen genomic data (including "Raw Sequence Data", which includes both host and pathogenic genome data and "Pathogen Consensus Genomes," which is only pathogenic genome data) and corresponding metadata ("Sample Metadata", such as time and location of sample collection). In the event that human genetic sequence information is uploaded as part of the Upload, it is removed as part of processing the Upload.
As described in our Terms, Users are required to obtain and maintain all necessary consents, permissions, and authorizations required by applicable laws prior to uploading, sharing, and exporting Upload Data with the Services.
Upload Data includes Raw Sequence Data, Pathogen Consensus Genomes, and Sample Metadata.
If you are able to find data in CZ GEN EPI or any Sample Metadata that you believe is identifying, please let us know at privacy@czgenepi.org and we will address it.
Upload Data is used for the following purposes:
We do not own Upload Data and will never sell it. As mentioned above, your Upload Data will be visible within your Group.
Raw Sequence Data and Sample Metadata are shared back to the Users that uploaded the data, as well as other Users within the same organization (your "Group"). Please note that while the Raw Sequence Data is temporarily visible to other members of your Group, this data is not retained on the CZ GEN EPI platform.
We may also share your Pathogen Consensus Genomes (whether uploaded by you or generated by us) and/or analytical outputs with third parties in accordance with the provisions of your organization’s policies and/or as required by law. For example, certain Users in California currently allow the California Department of Public Health ("CDPH") to access data from their Group. Where such access is allowed by Groups, the third party can access this data through their own CZ GEN EPI accounts, and may have similar viewing permissions as members of the uploading Group. However, they will not have access to your private, internal identifiers.
You control the sharing of Raw Sequence Data and Sample Metadata which has been uploaded by any member of your organization Group. It will not be visible to other Users outside of your Group unless you choose to share it more broadly. We don’t own, rent, or sell your data.
Pathogen Consensus Genomes, whether uploaded by you or generated by CZ GEN EPI, will be shared by us with public repositories (as set out below) unless you choose to mark this information as "private." In the event that the Pathogen Consensus Genome is created by us, it will automatically be marked as private if the corresponding Raw Sequence Data is marked private.
Data uploaded to CZ GEN EPI by Users should always be anonymous. The pathogen genome does not contain personal data, as it cannot be personally linked with an identifiable individual.
In the rare event that human genetic data is not successfully deleted in the initial upload process, CZ GEN EPI may process this data only insofar as necessary in order to delete it. This processing is in our legitimate interest, and in the legitimate interests of CZB and CZI LLC, in order for us to ensure that no personal data is contained within the genomic data stored on CZ GEN EPI.
If you have uploaded Raw Sequence Data, we first strip any human reads, and then generate a Pathogen Consensus Genome by mapping the remaining sequencing reads to a pathogen-specific reference genome. These Consensus Genomes are the foundational unit of analysis for genomic epidemiology. If you have submitted Pathogen Consensus Genomes as Raw Sequence Data, we simply align them to the appropriate pathogen reference genome.
CZ GEN EPI also gives you the ability to create new analytical outputs from pathogen genomes, such as phylogenetic trees that allow you to better map the relationship between strains.
Users have full control over their data and the ability to mark samples as "private". Private samples will never be shared outside of your Group unless you choose to mark them as "public" later on.
Analytical results, including phylogenetic trees, generated from your Upload Data and Pathogen Consensus Genomes are the property of your Group only, and can only be seen by you and members of your Group. You and your Group control who to share them with and when.
Additionally, as outlined above, this data may be visible in some form to third parties in accordance with your organization’s policies, or in accordance with applicable law.
CZ GEN EPI also collects information about Users in order to offer the Service. Other than basic information required to create an account (e.g. email address, name, Group affiliation), the User determines what information they want to upload onto CZ GEN EPI.
User Data will be used to operate, secure, and improve the Services. This means the following purposes:
We (along with CZB and CZI LLC) have a legitimate interest in using personal data within User Data in the ways described in this Privacy Policy to provide, protect, and improve CZ GEN EPI. This allows us to improve the service that we provide to Users which, in turn, supports research regarding the study of infectious disease with the potential to benefit global public health.
CZB, CZIF, and CZI LLC collaborate closely in order to build, design, and operate CZ GEN EPI so that it can be as useful as possible to researchers and the public health community. CZB and CZIF provide scientific and data analysis leadership and CZI LLC focuses on maintaining CZ GEN EPI’s infrastructure, security, and compliance. The three parties are all data controllers for CZ GEN EPI and will all only use data as described in this Privacy Policy.
We also use service providers, such as database providers like Amazon Web Services, to support the operation of CZ GEN EPI. These service providers are data processors and their use is limited to the purposes disclosed in this Privacy Policy.
Users have the option to share their analytical outputs with certain third party tools. You control whether to use these integrations or not.
In certain circumstances, we also share your Upload Data and analytical results with other governmental, public health entities in accordance with your organization’s policies and with applicable law. For example, certain Users in California currently allow the California Department of Public Health ("CDPH") to access Upload Data and analytical results from their Group.
In the unlikely event that we can no longer keep operating CZ GEN EPI or believe that its purpose is better served by having another entity operating it, we may transfer CZ GEN EPI and all data existing therein (Upload Data, analytical outputs, and User Data) so that Users can continue to be served. We will always let you know before this happens, and you will have the option to delete your account and any data you’ve uploaded. Should this occur, the entity to which we transfer your data will be obliged to use it in a manner that is consistent with this Privacy Policy and our Terms.
We may disclose Upload Data, analytical outputs, and/or User Data if we believe in good faith that such disclosure is necessary (a) to comply with our legal obligations or to respond to subpoenas or warrants served on us; (b) to protect or defend our rights or property or those of Users; and/or (c) to investigate or assist in preventing any violation or potential violation of this Privacy Policy, or our Terms.
We use industry standard security measures to ensure the confidentiality, integrity and availability of data uploaded into CZ GEN EPI. This includes practices like encrypting connections to CZ GEN EPI using TLS (encrypting data while in transit), hosting CZ GEN EPI on leading cloud providers with robust physical security, and ensuring that access to any personal data within CZ GEN EPI by CZIF, CZB, and CZI LLC staff is limited to those staff who need access to operate the Service.
Security takes ongoing work and we will continue to monitor and adjust our security measures as CZ GEN EPI develops. Please notify us immediately at security@czgenepi.org if you suspect your account has been compromised or are aware of any other security issues relating to CZ GEN EPI.
We retain your personal data only as long as is reasonably necessary:
Please note that we do not control, and so cannot delete, Pathogen Consensus Genomes and analytical outputs that have been shared outside of CZ GEN EPI.
Users have the following choices:
CZ GEN EPI is a US-based service. If you want to use CZ GEN EPI, you must first agree to our Terms, which set out the contract between CZ GEN EPI and our Users. We operate in the United States, and use technical infrastructure in the United States to deliver the Services to you.
If you have any questions, comments, or concerns with this Privacy Policy, you may contact us at privacy@czgenepi.org.
We may update this Privacy Policy from time to time and will provide you with notice of material updates before they become effective.