Identifying high-risk factors for SARS-CoV-2 infection and for negative outcomes of COVID-19 is crucial. Besides age and comorbidities, the genetic background of the host might account for individual responses to SARS-CoV-2 infection. Because frequent polymorphisms exist in the general population, a list of candidate polymorphisms is needed to drive targeted screens. We carried out text mining of the scientific literature to draw up a list of genes associated with the term "SARS-CoV*". In these genes, we looked for frequent mutations that are likely to affect protein function. Ten genes, mostly involved in innate immunity, and thirteen common variants were identified; for some of these, involvement in COVID-19 is supported by publicly available epidemiological data. We looked for available data on the population distribution of these variants and showed that the prevalence of five of them, Arg52Cys (rs5030737), Gly54Asp (rs1800450) and Gly57Glu (rs1800451) in MBL2, Ala59Thr (rs25680) in CD27, and Val197Met (rs12329760) in TMPRSS2, correlates with the number of COVID-19 cases and/or deaths observed in different countries. The association of the TMPRSS2 variant provides epidemiological evidence for the usefulness of transmembrane protease serine 2 inhibitors in the treatment of COVID-19. The identified genetic variants provide a basis for designing a cost-effective assay for population screening of genetic risk factors in the COVID-19 pandemic.
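The country-level comparison described above can be illustrated with a minimal sketch. The abstract does not state which correlation statistic was used, so a Pearson coefficient is assumed here purely for illustration. The function name `pearson_r` and the variable names are hypothetical, and the numbers are placeholders, not real allele frequencies or case counts.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Placeholder values for four hypothetical countries (NOT real data):
# variant allele frequency and COVID-19 cases per million inhabitants.
allele_freq = [0.10, 0.20, 0.30, 0.40]
cases_per_million = [120.0, 180.0, 260.0, 310.0]

r = pearson_r(allele_freq, cases_per_million)
print(f"Pearson r = {r:.3f}")
```

A rank-based statistic such as Spearman's coefficient may be a more robust choice when only a few countries are available or when the relationship is monotone but not linear. Either way, a correlation across countries does not by itself establish an effect at the level of individuals.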