Querying
     rs IDs
     Database marker IDs
     Chromosomal Positions

LD Search Options
     Genotypes database
     Populations
     Min MAF
     r2 Range
     Max Distance (kb)
     Max # Mendel errors
     HW p-value cutoff
    
Job Results
     Job Summary
     Download Results
     Query Results

Query Results
     

Querying

To use rAggr, enter a list of genomic queries into the search box. You may enter rs numbers, marker IDs from the selected genotypes database, or chromosomal positions.

 

A maximum of 500 queries may be entered into the search box at a given time.
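
Because the search box accepts at most 500 queries at a time, a longer list has to be split and submitted as several jobs. The following is a minimal Python sketch of that splitting. The function name chunk_queries and the constant MAX_QUERIES are illustrative and are not part of rAggr.

```python
from typing import Iterator, List

MAX_QUERIES = 500  # per-submission limit stated above


def chunk_queries(queries: List[str], size: int = MAX_QUERIES) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` queries."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(queries), size):
        yield queries[start:start + size]


if __name__ == "__main__":
    # Made-up identifiers, used only to show the batch sizes.
    queries = [f"rs{i}" for i in range(1, 1201)]
    for n, batch in enumerate(chunk_queries(queries), start=1):
        print(f"batch {n}: {len(batch)} queries")
```

Each batch can then be pasted into the search box as a separate job.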

 

rs IDs

 

An rs ID is the identifier that the dbSNP database assigns to a single SNP or indel. Some markers in the available genotypes databases use rs numbers either as primary IDs or as aliases. If an rs number is an alias and matches multiple markers, all matching markers will be queried.

 

Database marker IDs

 

Each genotypes database has its own set of unique marker IDs. Examples from the 1000 Genomes Phase 3 database include rs145072688:10352:T:TA and 1:10579:C:A. These IDs can be combinations of rs ID, chromosome, position and alleles, or just chromosome, position and alleles. Markers can always be queried by their database marker ID. If you don't know the database marker ID, querying by the chromosomal position (e.g., 1:10579) can help identify it.
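
The two example IDs above can be split into their parts with a few lines of Python. This is only a sketch that covers the two colon-separated, four-field shapes shown here; the names MarkerID and parse_marker_id are hypothetical, and other databases may use ID formats that this does not handle. The two alleles are kept in the order they appear, without assuming which one is the reference.

```python
from typing import NamedTuple, Optional


class MarkerID(NamedTuple):
    rs_id: Optional[str]       # set when the first field is an rs number
    chromosome: Optional[str]  # set when the first field is a chromosome
    position: int
    allele1: str
    allele2: str


def parse_marker_id(marker_id: str) -> MarkerID:
    """Split a four-field, colon-separated marker ID into its parts."""
    parts = marker_id.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields: {marker_id!r}")
    first, position, allele1, allele2 = parts
    if first.lower().startswith("rs"):
        return MarkerID(first, None, int(position), allele1, allele2)
    return MarkerID(None, first, int(position), allele1, allele2)


if __name__ == "__main__":
    for example in ("rs145072688:10352:T:TA", "1:10579:C:A"):
        print(parse_marker_id(example))
```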

 

Chromosomal Positions

 

In addition to the above query types, you may enter a chromosomal position, which consists of a chromosome ID, followed by a colon (:), then a nucleotide position. All markers that map to the entered position will be queried. For indels, the position corresponds to the position of the first base pair of the reference allele (assuming a VCF-formatted position). Positions are based on either hg18 or hg19, depending on the genotypes database selected.

 

Examples:

 

chr7:98,930,691

7:98,930,691

7:98930691
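
All three example forms refer to the same position. The sketch below shows one way to normalize such input in Python before building a query list. The function name parse_position and the regular expression are assumptions made for illustration; rAggr's own parsing rules may be stricter or looser.

```python
import re
from typing import Tuple

# Optional "chr" prefix, a chromosome ID, a colon, then digits with optional commas.
_POSITION_RE = re.compile(r"^(?:chr)?([0-9A-Za-z]+):([0-9][0-9,]*)$", re.IGNORECASE)


def parse_position(query: str) -> Tuple[str, int]:
    """Return (chromosome, position) for input such as 'chr7:98,930,691'."""
    match = _POSITION_RE.match(query.strip())
    if match is None:
        raise ValueError(f"not a chromosomal position: {query!r}")
    chromosome, position = match.groups()
    return chromosome, int(position.replace(",", ""))


if __name__ == "__main__":
    for example in ("chr7:98,930,691", "7:98,930,691", "7:98930691"):
        print(example, "->", parse_position(example))
```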

 

 

LD Search Options

Genotypes database

 

rAggr calculates pairwise linkage disequilibrium values using the genotypes of a reference database. The following databases are available:

     1) 1000 Genomes, Phase 3, Oct 2014. See http://www.1000genomes.org/. Genotypes were downloaded from the IMPUTE2 website. hg19 positions are used.
     2) 1000 Genomes, Phase 1, Mar 2012. See http://www.1000genomes.org/. Genotypes were downloaded from the IMPUTE2 website. hg19 positions are used.
     3) HapMap Phase II, release 24 (HapMap2_r24, Oct 2008). See http://hapmap.ncbi.nlm.nih.gov/. hg18 positions are used.

 

Populations

 

You may select one or more HapMap or 1000 Genomes populations to use as a reference genotype dataset. The available populations and number of individuals in each population vary by genotypes database. For more information, visit the website of the database of interest. Every query you enter will have a separate result set for each population you select. You must select at least one population or combination of populations (e.g., CHB+JPT) to run rAggr. The descriptions of the available populations are below:

 

African ancestry

ACB African Caribbean in Barbados
ASW African Ancestry in Southwest US
ESN Esan in Nigeria
GWD Gambian in Western Division, The Gambia
LWK Luhya in Webuye, Kenya
MSL Mende in Sierra Leone
YRI Yoruba in Ibadan, Nigeria

Americas ancestry

CLM Colombian in Medellin, Colombia
MXL Mexican Ancestry in Los Angeles, California
PEL Peruvian in Lima, Peru
PUR Puerto Rican in Puerto Rico

East Asian ancestry

CHB Han Chinese in Beijing, China
CDX Chinese Dai in Xishuangbanna, China
CHS Southern Han Chinese, China
JPT Japanese in Tokyo, Japan
KHV Kinh in Ho Chi Minh City, Vietnam

European ancestry

CEU Utah residents with Northern and Western European ancestry
FIN Finnish in Finland
GBR British in England and Scotland
IBS Iberian populations in Spain
TSI Toscani in Italia

South Asian ancestry

BEB Bengali in Bangladesh
GIH Gujarati Indian in Houston, TX
ITU Indian Telugu in the UK
PJL Punjabi in Lahore, Pakistan
STU Sri Lankan Tamil in the UK

 

Min MAF

 

The minimum MAF field specifies the minimum minor allele frequency (MAF) that every input marker and result marker must have in order to be displayed. If an input SNP entered as an 'rs' number has an MAF below the threshold for a population, the results page will display 'Not Passed' for that marker/population. Input markers that map to a queried region but do not pass this filter will not be displayed at all.
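
For readers unfamiliar with the quantity, the sketch below computes a minor allele frequency from genotype counts for a biallelic marker and compares it with a threshold. The function names are hypothetical, and treating the threshold as inclusive (MAF >= minimum) is an assumption; this page does not say how rAggr handles a marker whose MAF equals the cutoff.

```python
def minor_allele_frequency(n_ref_hom: int, n_het: int, n_alt_hom: int) -> float:
    """MAF of a biallelic marker from counts of the three genotype classes."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    alt_frequency = (2 * n_alt_hom + n_het) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_frequency, 1.0 - alt_frequency)


def passes_min_maf(maf: float, min_maf: float) -> bool:
    """Assumed inclusive comparison against the Min MAF setting."""
    return maf >= min_maf


if __name__ == "__main__":
    maf = minor_allele_frequency(n_ref_hom=81, n_het=18, n_alt_hom=1)
    print(maf, passes_min_maf(maf, 0.05))
```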

 

r2 Range

 

The r2 range fields are used to specify minimum and maximum r2 values for every inputted marker/result marker pair. Marker pairs with r2 values outside of this range will not be displayed in the results.
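
As an illustration of the statistic being filtered, the sketch below computes r2 and D' for two biallelic markers from phased haplotypes coded 0/1, then applies a range check. This is a textbook calculation under the assumption that phased haplotypes are available; it is not a description of how rAggr computes LD internally, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def ld_from_haplotypes(hap_a: Sequence[int], hap_b: Sequence[int]) -> Tuple[float, float]:
    """Return (r2, D') for two markers given one 0/1 allele per haplotype."""
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype lists must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at marker A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("LD is undefined when a marker is monomorphic")
    r2 = d * d / denom
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max


def in_r2_range(r2: float, r2_min: float, r2_max: float) -> bool:
    """True when r2 lies inside the requested range (bounds assumed inclusive)."""
    return r2_min <= r2 <= r2_max
```

For example, ld_from_haplotypes([0, 0, 1, 1], [0, 0, 1, 1]) describes two markers in complete LD, so a range of 0.8 to 1.0 would keep that pair.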

 

Max Distance (kb)

 

The max distance field specifies the maximum distance over which LD is calculated between input and result markers. Enter values in kilobases. Please note that large distances will increase the LD calculation time.

 

Max # Mendel errors

 

This field allows you to specify the maximum number of Mendel errors any input or result marker can have in order to be displayed in the results. This only applies to databases and populations containing parent-parent-child relationships (e.g., HapMap2 CEU and HapMap2 YRI). If an input marker entered as an 'rs' number has more than the specified number of Mendel errors for a population, the results page will display 'Not Passed' for that marker/population. Input markers that map to a queried chromosomal position but do not pass this filter will not be displayed at all.
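
A Mendel error is a child genotype that cannot be explained by the genotypes of the two parents. The sketch below counts such errors for one autosomal, biallelic marker across a set of trios, with genotypes coded as the number of copies (0, 1 or 2) of one allele. The coding, the function names, and the lack of missing-genotype handling are simplifying assumptions for illustration, not rAggr's implementation.

```python
from typing import Iterable, Tuple

# Alleles (0 or 1) that a parent with a given genotype can transmit.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendel_error(father: int, mother: int, child: int) -> bool:
    """True when the child's genotype is impossible given both parents."""
    possible = {f + m for f in GAMETES[father] for m in GAMETES[mother]}
    return child not in possible


def count_mendel_errors(trios: Iterable[Tuple[int, int, int]]) -> int:
    """Count inconsistent (father, mother, child) trios for one marker."""
    return sum(1 for f, m, c in trios if is_mendel_error(f, m, c))


if __name__ == "__main__":
    trios = [(0, 0, 0), (0, 0, 1), (1, 1, 2), (2, 2, 1)]  # made-up genotypes
    print(count_mendel_errors(trios))
```

A marker would then be kept only if this count does not exceed the value entered in the Max # Mendel errors field.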

 

HW p-value cutoff

 

This field allows you to specify the lowest Hardy-Weinberg equilibrium (HW) p-value any input or result SNP can have in order to be displayed in the results. If an input marker entered as an 'rs' number has a HW p-value less than the cutoff for a population, the results page will display 'Not Passed' for that marker/population. Hardy-Weinberg equilibrium is computed using an exact test.

 

 

Job Results
Job Summary

This section of the Results page shows all of the user-selected job options (including the job ID) in use by the software.

Download Results

All of the results for a job can be downloaded in this section. Two files are available: "All Results" and "Union of Results". Both files are compressed zip archives, and each has the job ID as its prefix. The "All Results" zip file contains a .csv (comma-separated values) file for every query that was entered by the user and passed the user-defined thresholds. The "Union of Results" zip file contains one .csv file, which is the concatenation of all the files in the "All Results" zip file. Every .csv file in the two archives contains the following columns:

SNP1 Name Name of the queried marker. Note that the ID will correspond with the ID in the genotypes database, and may not match your query if it matched as an alias.
SNP1 Chr Mapping chromosome of the queried marker.
SNP1 Pos Mapping position of the queried marker.
SNP1 Ref Reference allele of the queried marker. This is only available for certain databases.
SNP1 Alt Alternate allele of the queried marker. This is only available for certain databases.
SNP1 MAF Minor allele frequency of the queried marker.
SNP1 Minor Allele Minor allele of the queried marker. Either "Ref" for reference allele or "Alt" for alternate allele.
SNP2 Name Name of the proxy marker.
SNP2 Chr Mapping chromosome of the proxy marker.
SNP2 Pos Mapping position of the proxy marker.
SNP2 Ref Reference allele of the proxy marker. This is only available for certain databases.
SNP2 Alt Alternate allele of the proxy marker. This is only available for certain databases.
SNP2 MAF Minor allele frequency of the proxy marker.
SNP2 Minor Allele Minor allele of the proxy marker. Either "Ref" for reference allele or "Alt" for alternate allele.
Population Reference population code.
r^2 Pairwise r-squared value of the query and proxy markers.
D' Pairwise D' value of the query and proxy markers.
Distance Distance in bases between the query and proxy markers.
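
Once downloaded and unzipped, a results file can be filtered with Python's standard csv module. The sketch below keeps rows whose r^2 value meets a chosen threshold. It assumes the header row uses the column names exactly as listed above (for example "SNP2 Name" and "r^2"); check the first line of your file, since the actual header spelling is not confirmed here. The function name strong_proxies and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Dict, Iterable, List


def strong_proxies(lines: Iterable[str], min_r2: float = 0.8) -> List[Dict[str, str]]:
    """Return result rows whose r^2 is at least `min_r2`."""
    reader = csv.DictReader(lines)
    return [row for row in reader if float(row["r^2"]) >= min_r2]


if __name__ == "__main__":
    # Made-up rows with a subset of the columns, for demonstration only.
    sample = io.StringIO(
        "SNP1 Name,SNP2 Name,Population,r^2,D',Distance\n"
        "rsA,rsB,CEU,0.95,1.0,1200\n"
        "rsA,rsC,CEU,0.40,0.8,5300\n"
    )
    for row in strong_proxies(sample):
        print(row["SNP2 Name"], row["r^2"])
```

To read a real file, pass an open file object instead of the in-memory sample, for example open(path, newline="").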


Query Results

This section displays a table of all the queries for every population selected. Every row contains the following columns:
 
Query Name Name of the query entered.
Population Population for this query results set.
Status Status of the results. When the results are ready, a link to the HTML results is displayed (see the Query Results section below).
"Not Found" indicates the query could not be found for any of the query types.
"Not Passed" indicates the query did not pass the user-defined job thresholds.
Notes Any additional notes corresponding with the query. This may indicate why the query did not pass or if multiple markers matched the query.
CSV File A link to download the results for this query as a CSV file. See "Download Results" section above for format details.
UCSC Bed File A link to download a BED file for loading into the UCSC Genome Browser. Contains all result SNPs for this query.

The query name will contain a link to dbSNP for this marker when applicable.

Query Results
Clicking the "View Results" link for a query redirects the browser to the "Query Results" page, which shows the results for one query as a series of tables in HTML format. A maximum of 1000 results will be displayed in HTML. For the full results, download the corresponding CSV file. Every column can be sorted by clicking on the column header. The results table contains the following columns:

SNP1 Name Name of the queried marker. Note that the ID will correspond with the ID in the genotypes database, and may not match your query if it matched as an alias. A link to dbSNP for this marker is provided when applicable.
SNP1 Chr Mapping chromosome of the queried marker.
SNP1 Pos Mapping position of the queried marker.
SNP1 Ref Reference allele of the queried marker. This is only available for certain databases.
SNP1 Alt Alternate allele of the queried marker. This is only available for certain databases.
SNP1 MAF Minor allele frequency of the queried marker.
SNP1 Minor Allele Minor allele of the queried marker. Either "Ref" for reference allele or "Alt" for alternate allele.
SNP2 Name Name of the proxy marker. A link to dbSNP for this marker is provided when applicable.
SNP2 Chr Mapping chromosome of the proxy marker.
SNP2 Pos Mapping position of the proxy marker.
SNP2 Ref Reference allele of the proxy marker. This is only available for certain databases.
SNP2 Alt Alternate allele of the proxy marker. This is only available for certain databases.
SNP2 MAF Minor allele frequency of the proxy marker.
SNP2 Minor Allele Minor allele of the proxy marker. Either "Ref" for reference allele or "Alt" for alternate allele.
Population Reference population code.
r^2 Pairwise r-squared value of the query and proxy markers.
D' Pairwise D' value of the query and proxy markers.