FINDbase Userguide

1. General introduction

2. User registration

2.1. Administrators

2.2. National coordinators

2.3. Advisors and curators

2.4. Genetic disease consortia/locus-specific database members

2.5. Obtaining an account

3. Website structure

4. Home page

5. Querying the database

5.1. Data querying by population

5.2. Data querying by Disorder

6. Data entry and modification

7. Miscellaneous features

7.1. Quality features

7.2. Summary listings

7.3. Data source

7.4. FINDbase contributors

8. Selected reading


 

1. General introduction

FINDbase stands for Frequency of INherited Disorders database. It is a free online tool for collecting population-specific mutation data and displaying mutation frequencies. FINDbase was developed to facilitate the creation and maintenance of a fully web-based, population-specific database that is platform-independent and uses only open source software (PHP and MySQL). The design of the database follows the recommendations of the Human Genome Variation Society (HGVS; www.hgvs.org) and focuses on the collection and display of pathogenic DNA sequence variations and their frequencies in populations around the globe. Links to Online Mendelian Inheritance in Man (OMIM) give the user access to more in-depth information on every inherited disorder whose mutation spectrum is recorded in the database, thereby contributing towards uniformity of database content.

FINDbase is a collection of many National and Ethnic Mutation DataBases (NEMDBs) operating under the same platform, i.e., software. The NEMDBs that form FINDbase can be created in two ways:

1. Population-related: data on genetic disorders found in a particular population are entered by a group of researchers active in the field of human genetics in their country of origin.

2. Disorder-related: population-specific data can also be contributed to FINDbase by members of genetic disease consortia (Cystic Fibrosis, etc.), locus-specific databases (the HbVar database for hemoglobin variants and thalassemia mutations, PAHdb, etc.) and human genetics journals. In this case, each entry for a population not previously documented in FINDbase automatically results in the creation of a new NEMDB.

The design of FINDbase allows for easy bi-directional flow of mutation frequency information, i.e., from each NEMDB to FINDbase and back (data warehousing; Fig. 1). As a result, data within FINDbase can be used not only for the queries that the source databases can handle, but also for queries that require integrated knowledge that no individual source holds.

Fig. 1. Data warehousing

Each NEMDB contributing data to FINDbase is controlled by a dedicated team of researchers, consisting of the National coordinator and a number of advisors/curators (Fig. 2). FINDbase as a whole will be under the control of a multinational steering committee, consisting of several National coordinators and worldwide experts in the field of human (population) genetics.

Development of FINDbase began in late 2004, and the database was first officially released in January 2006. FINDbase derives from the key functionalities of the previously described ETHNOS software, which provides the basis for flat-file National Mutation database construction. Since February 2006, development has focused on improving the ease of use of the system. The article describing the database is scheduled for publication in late 2006.

Fig. 2. NEMDB management team

2. User registration

In FINDbase, there are four levels of registered users: (1) Administrators, (2) National coordinators, (3) Advisors and curators and (4) Genetic disease consortia members. Visitors do not need to register in order to browse FINDbase.

2.1. Administrators

FINDbase administrators have full access to all of the database’s functionalities and contents. Only the FINDbase technical staff have administrative rights.

2.2. National coordinators

National coordinators are mainly responsible for managing the overall construction and maintenance of a NEMDB that contributes to FINDbase. Their role is also to promote the use of their NEMDB as an aid to diagnostic laboratories locally, and to promote investigations into those genetic disorders whose mutation spectrum is not yet known in their countries.

To obtain a National coordinator account for a population not previously included in FINDbase, a request must be filled in (see 2.5. Obtaining an account), which is automatically sent to the database administrators. Granting a National coordinator account is subject to a decision by the FINDbase steering committee. The eligibility of an individual interested in serving as National coordinator of a particular National Mutation database is based on the extent of his/her previous involvement in genetic services in the country whose NEMDB he/she wishes to coordinate and on his/her general scientific record. Following a favorable decision by the FINDbase steering committee, the administrator creates the National coordinator account. The National coordinator’s rights include:

(a) Data entry and modification, (b) registration and account activation for advisors/curators for the NEMDB he/she coordinates, (c) data re-allocation to another advisor/curator (to be done).

2.3. Advisors and curators

Advisors and curators are responsible for data entry in the NEMDB for which they are registered. Each advisor/curator can only modify or delete data entered by him/herself and under no circumstances can alter data entered by another advisor/curator. In the data entry fields (see paragraph 6), the name of the population for which the advisor/curator is registered is set automatically (by default). If an advisor/curator decides to end his/her involvement with FINDbase, the National coordinator will transfer the data he/she contributed to another advisor/curator, who will then be responsible for their modification/removal.
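The ownership rule described above can be sketched in a few lines of Python. This is only an illustration, not the actual FINDbase code (which is written in PHP): the class names, the role labels and the functions can_modify and reallocate are hypothetical, and the sketch assumes that National coordinators are treated like curators when modifying records, which the guide does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class User:
    name: str
    role: str   # hypothetical labels: "administrator", "coordinator", "curator"
    nemdb: str  # population/NEMDB the account is registered for


@dataclass
class Entry:
    nemdb: str
    entered_by: str  # name of the curator who created the record


def can_modify(user: User, entry: Entry) -> bool:
    """Return True if `user` may modify or delete `entry`."""
    if user.role == "administrator":
        return True  # administrators have full access
    # Everyone else may only touch records they entered themselves.
    return user.name == entry.entered_by


def reallocate(coordinator: User, entries: List[Entry],
               leaving: str, successor: str) -> int:
    """Transfer the records of a departing curator; return how many moved."""
    if coordinator.role != "coordinator":
        raise PermissionError("only a National coordinator may re-allocate data")
    moved = 0
    for entry in entries:
        # A coordinator only acts on records of the NEMDB he/she coordinates.
        if entry.nemdb == coordinator.nemdb and entry.entered_by == leaving:
            entry.entered_by = successor
            moved += 1
    return moved
```

After a re-allocation, can_modify returns True for the successor and False for the departing curator, which mirrors the hand-over of responsibility described in the text.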

To obtain an advisor/curator account for a NEMDB already included in FINDbase, a request must be filled in (see paragraph 2.5. Obtaining an account), which is then automatically sent to the National coordinator of that particular NEMDB. The National coordinator will then grant the account, depending on the applicant’s eligibility (see above). It is not possible to obtain an advisor/curator account for a NEMDB not previously included in FINDbase, or for a NEMDB without a National coordinator.

2.4. Genetic disease consortia/locus-specific database members

Genetic disease consortia members have the same rights as the advisors and curators of NEMDBs. The only difference is that in the data entry fields, the genetic disease name is automatically set, according to the corresponding genetic disease consortium/locus-specific database, instead of the population name.

2.5. Obtaining an account

To request an advisor/curator account, select the population from the population/NEMDB menu and then fill in the questionnaire on the “Register” page (Fig. 3). To request a National coordinator account, specify the population’s name in the “New population” field and then fill in the same questionnaire. CAUTION: Please fill in as many fields as possible, so that your contact details are as complete as possible.

All personal data, including e-mail addresses, are for internal use only, namely for communication between curators and the National coordinator. Only the name of a curator and the organisation he/she works for are visible to the end user. The FINDbase staff will not make personal data available to non-FINDbase members.

Fig. 3. Registering for an account in FINDbase, either as advisor/curator for a population already documented or as National coordinator for a new NEMDB/population (e.g. Italian).

3. Website structure

The FINDbase website consists of a menu, located on the left side of the screen and listing the available options, and the main screen. The options currently supported are:

1. Home: Welcome page with summary listings for the populations recorded within FINDbase.

2. User guide: Provides detailed information pertaining to database query, curation and coordination.

3. Search: Provides search options for the data.

4. Login: Data entry portal for curators.

5. Links: Contains links to relevant sites.

6. News: FINDbase-related news and events of interest.

7. Contact us: Detailed contact information for the administrators of the FINDbase project.

8. Curators: List of all curators contributing data to FINDbase.

9. Register: Registration page for researchers interested in getting involved as advisors/curators or National coordinators for FINDbase.

The current FINDbase release is also displayed, with the date of the last update set automatically beneath it. The number of database visitors, an indication of the database’s impact on society, is shown together with summary listings based on disorders, genes and mutations (see also paragraph 7.2).

4. Home page

The FINDbase home page provides several means of accessing each recorded NEMDB. The main tool is a world map with dots marking the populations whose genetic makeup is documented in FINDbase. Placing the cursor over a dot (each dot corresponds to the capital of a country) shows how many disorders are documented in that population (Fig. 4a). Clicking on a dot displays a succinct summary of all genetic disorders, the genes involved and the documented mutations (Fig. 4b). Alternatively, select a population’s name from the menu located next to the world map, or go directly to a disorder.

5. Querying the database

Data stored in FINDbase can be queried in two different ways: (a) by population or (b) by disorder. This is illustrated in Figure 5.

5.1. Data querying by population

Select a population group from the left-side menu. This automatically yields the summary shown in Fig. 4b. Selecting a disorder from the menu below provides a more detailed list (Fig. 6), which includes the OMIM link (giving the user access to textual information on that particular disorder), the mutations (written in their official nomenclature) and the corresponding mutation frequencies. The table is followed by the name and contact details of the curator who entered the data set and by the source article used.

Fig. 4. (A) FINDbase welcome page, (B) Sample query outcome, when selecting to display data for the Greek-Cypriot population from the world map option.

Fig. 5. Querying options provided by FINDbase.

Fig. 6. Outcome of the query “Display all mutations leading to beta-thalassemia in the Croatian population” (in a Table format).

The query can be refined if the mutation frequency range is specified in the respective boxes. Also, the query outcome can be provided as a graph, rather than a table (Fig. 7).
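The population query and its optional frequency range can be pictured with the following Python sketch. It is not the FINDbase implementation: the Record type, the function query_by_population and its parameters are hypothetical, the records are invented placeholders rather than real data, and frequencies are assumed to be stored as fractions between 0 and 1.

```python
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    population: str
    disorder: str
    mutation: str
    frequency: float  # assumed to be a fraction between 0 and 1


# Invented placeholder records, for illustration only.
RECORDS = [
    Record("Population X", "Disorder 1", "m1", 0.40),
    Record("Population X", "Disorder 1", "m2", 0.05),
    Record("Population X", "Disorder 2", "m3", 0.10),
    Record("Population Y", "Disorder 1", "m1", 0.20),
]


def query_by_population(records: List[Record], population: str,
                        disorder: Optional[str] = None,
                        min_freq: Optional[float] = None,
                        max_freq: Optional[float] = None) -> List[Record]:
    """Return the records of one population, optionally narrowed down
    to a single disorder and to an inclusive frequency range."""
    hits = []
    for r in records:
        if r.population != population:
            continue
        if disorder is not None and r.disorder != disorder:
            continue
        if min_freq is not None and r.frequency < min_freq:
            continue
        if max_freq is not None and r.frequency > max_freq:
            continue
        hits.append(r)
    return hits
```

Calling query_by_population(RECORDS, "Population X") corresponds to selecting a population from the menu; adding disorder, min_freq and max_freq corresponds to choosing a disorder and filling in the frequency range boxes.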

5.2. Data querying by Disorder

Querying data in FINDbase is also possible by Disorder. The user needs to select a genetic disorder name from the right-side menu and the query returns the different population groups for which data are available in FINDbase (Fig. 8a). By selecting a Mutation from the list right below, the query can be further refined to that particular mutation related to those populations where it is identified (Fig. 8b).
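A query by disorder turns the previous view around: the disorder is fixed and the populations are listed. A minimal Python sketch follows, in which the function populations_for is hypothetical and the rows are invented placeholders, not FINDbase data.

```python
from typing import List, Optional, Tuple

# Invented placeholder rows: (population, disorder, mutation).
ROWS = [
    ("Population X", "Disorder 1", "m1"),
    ("Population X", "Disorder 1", "m2"),
    ("Population Y", "Disorder 1", "m1"),
    ("Population Z", "Disorder 2", "m3"),
]


def populations_for(rows: List[Tuple[str, str, str]], disorder: str,
                    mutation: Optional[str] = None) -> List[str]:
    """List the populations with data for `disorder`; if `mutation` is
    given, keep only the populations in which that mutation is recorded."""
    found = {
        population
        for population, row_disorder, row_mutation in rows
        if row_disorder == disorder
        and (mutation is None or row_mutation == mutation)
    }
    return sorted(found)
```

Passing only the disorder corresponds to the first step of the query; passing a mutation as well corresponds to the refinement described above.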

Fig. 7. Graphical display of the query “Display all mutations leading to β-thalassemia in the Croatian population”.

6. Data entry and modification

In FINDbase, data entry is only possible for curators. The data entry page is reached through the “Login” option. To access the data entry fields, the user first needs to log in at the top-right corner of the main screen. Once logged in, the registered user enters the Publication data editor (Fig. 9), which guides him/her through the data entry procedure.

Fig. 8. (A) Outcome of the query “Display all mutations leading to Glucose-6-phosphate dehydrogenase deficiency (G6PD) in all populations”. (B) The query can be further refined for a certain mutation (e.g. G6PD Acrokorinthos).

In brief, the user first needs to specify the type of data source, e.g. a publication, an abstract presented at an international conference, or unpublished data contributed by a diagnostic laboratory or research center. For a publication, entering the PubMed ID automatically adds the respective article abstract to the source documents (see also paragraph 7.3).

Fig. 9. Publication data editor, the data entry tool of FINDbase

Fig. 10. Data entry for disorders already documented in FINDbase.

If the source publication corresponds to a genetic disorder already documented in FINDbase for another population group (Fig. 10), the user only needs to type it in. CAUTION: In order to avoid redundant data entries, only one publication (the most representative one, according to the National coordinator) can be entered for each genetic disorder per population.

In both cases, empty fields appear in a Table format, where the registered user enters the data, e.g., OMIM ID, the mutation name in the correct (official) nomenclature and the number of chromosomes or a frequency value relative to the number of chromosomes. Remember that for NEMDB curators/advisors, the population name is set automatically and they only need to define the disorder name.
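When chromosome counts are entered rather than ready-made frequencies, the relation between the two can be sketched as below. This is an assumption about how such a value could be derived, not a description of FINDbase internals: the function name mutation_frequencies is hypothetical, and the frequency is taken to be the number of chromosomes carrying a mutation divided by the total number of chromosomes studied.

```python
from typing import Dict


def mutation_frequencies(counts: Dict[str, int],
                         total_chromosomes: int) -> Dict[str, float]:
    """Convert per-mutation chromosome counts into frequencies, expressed
    as fractions of the total number of chromosomes studied."""
    if total_chromosomes <= 0:
        raise ValueError("total number of chromosomes must be positive")
    for name, n in counts.items():
        if n < 0 or n > total_chromosomes:
            raise ValueError(f"invalid chromosome count for {name}: {n}")
    return {name: n / total_chromosomes for name, n in counts.items()}
```

For instance, a mutation found on 25 of 100 chromosomes gets a frequency of 0.25. The range check rejects counts that exceed the total, a simple guard against typing errors during data entry.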

Fig. 11. (A) Data entry for disorders not previously documented in FINDbase. In this case, the genetic disorder included in this publication is automatically added in the disorders list, based on the PubMed ID given by the user. (B) Data addition and/or modification of existing records, e.g. for Cystic Fibrosis mutation frequency data reported for the Algerian population.

For genetic disease consortia members, the name of the disorder is set automatically and the population field needs to be defined. When finished, clicking on the “Add” button makes a new line appear, while the “Remove” button erases that particular line from the records. Clicking on the “Related articles” link performs a PubMed search to identify papers related to that particular entry. CAUTION: DO NOT forget to click on “Save” when data entry for a particular publication is completed.

7. Miscellaneous features

7.1. Quality features

The FINDbase design follows established database guidelines to ensure quality. The date of the last update is set automatically and clearly displayed in the left-side menu, together with the number of visitors, as an indication of the database’s impact on society. The copyright and disclaimer notices are clearly visible at the bottom of each page, as is the source of funding.

7.2. Summary listings

FINDbase lets the user access its contents through summary listings. There are three options:

1. Summary sorted by Genetic Disorder. In this case, the genetic disorders are listed together with the gene(s), which when mutated lead to that particular disorder.

2. Summary sorted by Genes. In this case, the genes are listed together with the genetic disorders and the mutations documented in the database.

3. Summary sorted by Mutation. The information available in this listing is identical to the one sorted by genes. However, the table format is such that each row includes only one mutation, which makes it easier for the user to view.
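The three listings above are different groupings of the same records, which the following Python sketch tries to make concrete. The function names are hypothetical and the rows are invented placeholders, not FINDbase content.

```python
from collections import defaultdict

# Invented placeholder rows: (disorder, gene, mutation).
ROWS = [
    ("Disorder 1", "GENE1", "m1"),
    ("Disorder 1", "GENE1", "m2"),
    ("Disorder 2", "GENE2", "m3"),
]


def summary_by_disorder(rows):
    """Map each disorder to the gene(s) that cause it when mutated."""
    genes = defaultdict(set)
    for disorder, gene, _mutation in rows:
        genes[disorder].add(gene)
    return {d: sorted(g) for d, g in sorted(genes.items())}


def summary_by_gene(rows):
    """Map each gene to its disorders and its documented mutations."""
    disorders, mutations = defaultdict(set), defaultdict(set)
    for disorder, gene, mutation in rows:
        disorders[gene].add(disorder)
        mutations[gene].add(mutation)
    return {
        gene: {"disorders": sorted(disorders[gene]),
               "mutations": sorted(mutations[gene])}
        for gene in sorted(disorders)
    }


def summary_by_mutation(rows):
    """Same information as the gene summary, but one row per mutation."""
    return sorted({(mutation, gene, disorder)
                   for disorder, gene, mutation in rows})
```

The gene and mutation summaries hold the same information; only the shape differs, with the mutation summary flattened to one mutation per row.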

7.3. Data source

To document the origin of FINDbase data, a listing of the documents/publications from which the data originate is provided. This list is compiled automatically during data entry (see also paragraph 6).

7.4. FINDbase contributors

A detailed list of contributors is shown by selecting the option “Curators” in the left-side menu. Each contributor’s name is accompanied by his/her institute, the city and country of origin and the total number of records contributed to FINDbase. Users wishing to contact those individuals can click on their name to access their e-mail address.

8. Selected reading

8.1. Scriver CR, Nowacki PM, Lehväslaiho H. (1999). Guidelines and recommendations for content, structure and deployment of mutation databases. Hum Mutat 13:344-350.

8.2. Sipila K, Aula P. (2002). Database for the mutations of the Finnish disease heritage. Hum Mutat 19:16-22.

8.3. Hardison RC, Chui DH, Giardine B, Riemer C, Patrinos GP, Anagnou N, Miller W, Wajcman H. (2002). HbVar: A relational database of human hemoglobin variants and thalassemia mutations at the globin gene server. Hum Mutat 19:225-233.

8.4. Teebi AS, Teebi SA, Porter CJ, Cuticchia AJ. (2002). Arab genetic disease database (AGDDB): a population-specific clinical and mutation database. Hum Mutat 19:615-621.

8.5. Horaitis O, Cotton RG. (2004). The challenge of documenting mutation across the genome: the human genome variation society approach. Hum Mutat 23:447-452.

8.6. Patrinos GP, Giardine B, Riemer C, Miller W, Chui DH, Anagnou NP, Wajcman H, Hardison RC. (2004). Improvements in the HbVar database of human hemoglobin variants and thalassemia mutations for population and sequence variation studies. Nucleic Acids Res 32:D537-D541.

8.7. Patrinos GP, van Baal S, Petersen MB, Papadakis MN. (2005). The Hellenic National Mutation database: A prototype database for inherited disorders in the Hellenic population. Hum Mutat 25:327-333.

8.8. Patrinos GP, Brookes AJ. (2005). DNA, diseases and databases: Disastrously deficient. Trends Genet 21:333-338.

8.9. Tadmouri GO, Al Ali MT, Al-Haj Ali S, Al Khaja N. (2006). CTGA: the database for genetic disorders in Arab populations. Nucleic Acids Res 34:D602-D606.