For example, an <img> element like this:
<img src="/cgi-bin/gbrowse_karyotype?src=Homo_sapiens">
Will generate this image:
gbrowse_karyotype can also be used to generate image-mapped
HTML either on its own or as the contents of an <iframe>
element.
For example (the URL parameters are described below):
<iframe height=300 width=150 frameborder=0 scrolling=no
src="/cgi-bin/gbrowse_karyotype?src=Homo_sapiens;e=1;c=1+2+3;h=250">
</iframe>
yields the image below. The chromosome bands now have mouseover labels:
The script can also be used to superimpose external features
onto the display. If we add the following argument to the
above iframe example:
a=1+gene+gene1+50000000..5100000+bgcolor=red;a=1+gene+gene1+61000000..6300000+bgcolor=blue
The script recognizes the following CGI arguments.
Argument pairs must be separated by semicolons or by
ampersands.
Most options have shorter aliases that can be used to reduce URL
lengths. Default values for each of the options in bold face can also
be specified in the configuration file for the data source, if one is used.
Argument | Alias | Description |
---|---|---|
type | t | built-in tracks to include in image |
rows | r | number of rows of chromosomes to draw |
chromosome | c | which chromosome(s) to draw |
style | s | style information for added features |
add | a | added feature(s) to superimpose on the image |
embed | e | generate full HTML for image and imagemap for use in an embedded frame |
transparent | tr | make the image background color transparent |
cwidth | w | chromosome width |
cheight | h | maximum chromosome height |
bandlabels | b | label chromosome bands |
band | ba | the type of feature to be used as chromosome bands or centromeres |
list | l | get certain types of configuration information |
source | src | database name |
useform | f | print a data entry form if there are no other CGI parameters |
featuretext | ft | enter third-party annotations as Bio::Graphics::FeatureFile-style text |
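The argument syntax above can be sketched in a few lines of Python. This is only an illustration: the function name `karyotype_url` and its default base path are assumptions, not part of the script. It joins argument pairs with semicolons and list values with "+", as in the iframe example earlier, and does no URL escaping.

```python
def karyotype_url(args, base="/cgi-bin/gbrowse_karyotype"):
    """Join (name, value) pairs into a query string separated by semicolons.

    `args` is a list of pairs so that an argument such as `a` can be repeated.
    A list or tuple value is joined with '+', as in c=1+2+3.
    Values are not URL-escaped here.
    """
    parts = []
    for name, value in args:
        if isinstance(value, (list, tuple)):
            value = "+".join(str(v) for v in value)
        parts.append(f"{name}={value}")
    return base + "?" + ";".join(parts)


# Rebuild the URL used in the iframe example, using the short aliases.
url = karyotype_url([("src", "Homo_sapiens"), ("e", 1), ("c", [1, 2, 3]), ("h", 250)])
print(url)
```

A list of pairs is used instead of a dictionary so that the `a` (add) argument can appear more than once in the same URL.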
<img src="/cgi-bin/gbrowse_karyotype?src=rat;type=gene+allele">
If the track name has a space in it, put quotes around the name:
type="microbe tRNA"+NG+WABA+CG+ESTB
Available track names can be listed by calling the gbrowse_karyotype script with the option "list=types":
/cgi-bin/gbrowse_karyotype/sourcename?list=types or /cgi-bin/gbrowse_karyotype?src=sourcename;list=types
c=1:250000000+2:200000000+3:180000000
style="Blast Hit"+glyph=generic+bgcolor=red
add=chr2+Type+Name+start..end,start2..end2+bgcolor=green+glyph=dot+desc='my+favorite+gene'
add=I+"curated+gene"+act-2+23897..28799+"C.+elegans+actin+gene"
"key=value" pairs will be added to the feature's attributes.
Certain reserved keywords will override any default style for that
feature type.
Reserved keyword | Description |
---|---|
glyph | the type of glyph used to represent the feature (eg: 'dot', 'diamond', 'triangle') |
fgcolor | glyph foreground (outline) color |
bgcolor | glyph background (fill) color |
orient | cardinal direction that glyph points in; N, S, E, W |
url (alias 'link') | URL destination if the feature is clicked (see Note below) |
desc (alias 'description') | feature description. This will be used as a mouseover label. |
Note: URLs that contain special characters such as '&', ';', '~', etc must be URL escaped in order to be parsed properly. Allowed special characters and their ASCII escape codes are listed below. URLs containing special characters that are not on this list are not allowed.
character | escape code | character | escape code |
---|---|---|---|
# | %23 | ~ | %7E |
/ | %2F | ? | %3F |
: | %3A | @ | %40 |
= | %3D | & | %26 |
; | %3B | | |
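As a rough sketch of how the escape table might be applied, the Python below escapes a URL and assembles one `a=` argument. The helper names `escape_url` and `add_argument` are hypothetical; only the escape codes and the argument layout (reference, type, name, ranges, then key=value pairs joined by "+") come from this document.

```python
# Escape codes copied from the table above.
ESCAPES = {
    "#": "%23", "~": "%7E", "/": "%2F", "?": "%3F", ":": "%3A",
    "@": "%40", "=": "%3D", "&": "%26", ";": "%3B",
}


def escape_url(url):
    """Replace each listed special character with its escape code."""
    return "".join(ESCAPES.get(ch, ch) for ch in url)


def add_argument(ref, ftype, name, ranges, **attributes):
    """Build one 'a=' argument; url/link values are escaped first."""
    ranges_text = ",".join(f"{start}..{end}" for start, end in ranges)
    fields = [ref, ftype, name, ranges_text]
    for key, value in attributes.items():
        if key in ("url", "link"):
            value = escape_url(value)
        fields.append(f"{key}={value}")
    return "a=" + "+".join(fields)


arg = add_argument("2", "gene", "gene1", [(1000000, 1010000)],
                   bgcolor="red", url="http://www.google.com")
print(arg)
```

The sketch does not check for special characters outside the table, which the note above says are not allowed.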
Mnemonic <tab> Full description of feature <tab> [default]
The third column contains the word "default" if the track will be shown by default when no type argument is provided.
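A client could read this tab-delimited listing along the following lines. This is a sketch only: the function name `parse_track_list` is an assumption, and the sample text is made up for illustration rather than taken from a real data source.

```python
def parse_track_list(text):
    """Parse 'mnemonic <tab> description <tab> [default]' lines.

    Returns a list of (mnemonic, description, is_default) tuples.
    """
    tracks = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        mnemonic = fields[0].strip()
        description = fields[1].strip() if len(fields) > 1 else ""
        is_default = len(fields) > 2 and fields[2].strip() == "default"
        tracks.append((mnemonic, description, is_default))
    return tracks


# Hypothetical output of a list=types request.
sample = "gene\tProtein-coding genes\tdefault\nallele\tKnown alleles\n"
tracks = parse_track_list(sample)
print(tracks)
```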
<iframe src="/cgi-bin/gbrowse_karyotype?c=1+2+3;h=350;src=mouse;e=1;a=2+gene+gene1+1000000..1010000+bgcolor=red+url=http://www.google.com" width="150" height="350"></iframe>
Mouseover or click the triangle
## Default display options for gene ##
[gene]
glyph = triangle
bgcolor = red
fgcolor = red
height = 7
bump = 1

[mutation]
glyph = lightning
bgcolor = yellow
fgcolor = black
height = 15

# chromosome 1
reference=1
gene gene1 20000001..20006000 "Carbonic Anhydrase gene"
gene gene2 20066001..20069000 "ADH gene"
gene gene3 50000001..50006000 bgcolor=blue;fgcolor=blue;glyph=dot;description="Unknown gene"

# chromosome 5
reference=5
gene gene4 20000001..20006000 bgcolor=white
gene gene5 30006001..30009000
gene gene6 80000001..80006000
mutation allele2 30006001..30009000 "cosmic radiation damage"

The components of the annotation/config text (configuration sections, reference lines and data lines) are discussed in detail below:
The appearance of third-party annotations can be controlled by including one or more configuration stanzas in the annotation file.
Here is an example configuration section. It can appear at the top of the file, at the bottom, or interspersed among data sections:
[gene]
glyph = triangle
orient = E
bgcolor = red
fgcolor = red
height = 7
bump = 1
The configuration section is divided into a set of sections,
each one labeled with a [section title]. The [general] section
specifies global options for the entire image. Other sections
apply to particular feature types.
Inside each section is a series of name=value pairs, where the name is the name of an option to set. You can put whitespace around the = sign to make it more readable, or even use a colon (:) instead if you prefer. The following option names are recognized:
Option | Value | Example |
---|---|---|
bgcolor | Background color of each element | blue |
bump | Prevent features from colliding (0=no, 1=yes; default 1) | 1 |
fgcolor | Foreground color of each element | yellow |
glyph | Style of each graphical element (see list below) | triangle |
width | Width of each graphical element (pixels) | 10 |
linewidth | Width of lines (pixels) | 1 |
orient | Direction in which triangle and lightning glyphs point (N, S, E, W; default E) | E |
point | Place an unscaled glyph in the center of the range. (0=no, 1=yes; default 1) | 1 |
The bump option is the most important option for controlling the look of the image. If set to false (the number 0), then the features are allowed to overlap. If set to true (the number 1), then the features will move horizontally to avoid colliding. If not specified, bump is turned on if the number of any given type of sequence feature is greater than 50.
Remote annotation files do not support callbacks (code references) as values in the attribute:value pairs.
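To make the stanza syntax concrete, here is a small Python sketch that reads [section] headers and name=value (or name: value) pairs into a dictionary. It is not the parser the script uses; the name `parse_stanzas` is an assumption, and the sketch expects its input to contain configuration stanzas only, with no reference or data lines.

```python
import re

SECTION = re.compile(r"^\[(.+)\]$")
# Option name, then '=' or ':' with optional surrounding whitespace, then the value.
OPTION = re.compile(r"^([^=:]+?)\s*[=:]\s*(.*)$")


def parse_stanzas(text):
    """Return {section title: {option name: value}} for configuration stanzas."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        header = SECTION.match(line)
        if header:
            current = sections.setdefault(header.group(1), {})
            continue
        option = OPTION.match(line)
        if option and current is not None:
            current[option.group(1)] = option.group(2)
    return sections


sample = """
[gene]
glyph = triangle
orient: E
bump=1
"""
config = parse_stanzas(sample)
print(config)
```

All values are kept as strings; converting options such as bump or height to numbers is left to the caller.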
white | coral | darkslateblue | green | lightpink | mediumslateblue | paleturquoise | sienna |
black | cornflowerblue | darkslategray | greenyellow | lightsalmon | mediumspringgreen | palevioletred | silver |
aliceblue | cornsilk | darkturquoise | honeydew | lightseagreen | mediumturquoise | papayawhip | skyblue |
antiquewhite | crimson | darkviolet | hotpink | lightskyblue | mediumvioletred | peachpuff | slateblue |
aqua | cyan | deeppink | indianred | lightslategray | midnightblue | peru | slategray |
aquamarine | darkblue | deepskyblue | indigo | lightsteelblue | mintcream | pink | snow |
azure | darkcyan | dimgray | ivory | lightyellow | mistyrose | plum | springgreen |
beige | darkgoldenrod | dodgerblue | khaki | lime | moccasin | powderblue | steelblue |
bisque | darkgray | firebrick | lavender | limegreen | navajowhite | purple | tan |
blanchedalmond | darkgreen | floralwhite | lavenderblush | linen | navy | red | teal |
blue | darkkhaki | forestgreen | lawngreen | magenta | oldlace | rosybrown | thistle |
blueviolet | darkmagenta | fuchsia | lemonchiffon | maroon | olive | royalblue | tomato |
brown | darkolivegreen | gainsboro | lightblue | mediumaquamarine | olivedrab | saddlebrown | turquoise |
burlywood | darkorange | ghostwhite | lightcoral | mediumblue | orange | salmon | violet |
cadetblue | darkorchid | gold | lightcyan | mediumorchid | orangered | sandybrown | wheat |
chartreuse | darkred | goldenrod | lightgoldenrodyellow | mediumpurple | orchid | seagreen | whitesmoke |
chocolate | darksalmon | gray | lightgreen | mediumseagreen | palegoldenrod | seashell | yellow |
coral | darkseagreen | green | lightgrey | mediumslateblue | palegreen | sienna | yellowgreen |
Name | Description |
---|---|
triangle | A triangle; use the 'orient' option (N,S,E or W) to control where it points. |
dot | A small circle. |
diamond | A diamond shape. |
lightning | For those features who do not take themselves too seriously. |
reference = 1
You may have several reference lines in the file. Each reference landmark will apply to all data lines located beneath it until the next reference line occurs.
Column 1, the feature type
The first column is the feature type. Any description is valid,
but a short word, like "knockout" is better than a long one, like
"Transposon-mediated knockout". Later on you can provide a long
descriptive name in the formatting key if you desire. If the
feature type contains white space, you must surround it by double
or single quotes.
Column 2, the feature name
This is the name of the feature. The name will be used as a
mouseover label unless there is a description (see below). If
the name contains white space, you must surround it by double
or single quotes. Use empty quotes ("") if there is no name to display.
Column 3, the feature position
The third column contains one or more ranges occupied by the
feature. A range has a start and a stop, and is expressed either
as "start..stop" or "start-stop". Use whichever form you prefer.
You can express a feature that occupies a discontinuous set of
ranges, such as an mRNA aligned to the genome, by providing a
list of ranges separated by commas. Example:
1..10,49..80,110..200
Column 4, Description [optional]
This is the description column. If a description exists, it
will be used as a mouseover label for the feature, taking
precedence over the feature name from column 2.
Descriptions that contain whitespace should be wrapped in
quotes. The description column can also be used to specify
attribute=value information to override track display
options for an individual feature. For example, if the
features in a track are normally displayed as triangles, adding
glyph=diamond to column 4 will draw a diamond in place of a
triangle for just this one feature. Multiple
attribute=value pairs should be delimited with semicolons.
attribute=value pairs and text descriptions can be mixed by
prepending "description=" to the text. For
example:
gene gene1 10001..16000 bgcolor=white;glyph=dot;description="a very special gene"
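The four columns described above can be read with a short Python sketch like the one below. It is an illustration under simplifying assumptions, not the script's own parser: the function names are hypothetical, column 4 is treated as attribute=value pairs whenever it contains an "=", and values containing a semicolon are not handled.

```python
import shlex


def parse_ranges(text):
    """Parse '1..10,49..80' or '1-10,49-80' into a list of (start, stop)."""
    ranges = []
    for part in text.split(","):
        separator = ".." if ".." in part else "-"
        start, stop = part.split(separator)
        ranges.append((int(start), int(stop)))
    return ranges


def parse_data_line(line):
    """Split a data line into type, name, ranges, attributes and description."""
    fields = shlex.split(line)  # splits on whitespace and honours quotes
    ftype, name, position = fields[:3]
    column4 = fields[3] if len(fields) > 3 else ""
    attributes = {}
    description = ""
    if "=" in column4:
        for pair in column4.split(";"):
            key, _, value = pair.partition("=")
            attributes[key] = value
        for key in ("description", "desc"):
            if key in attributes:
                description = attributes.pop(key)
    else:
        description = column4
    return {"type": ftype, "name": name, "ranges": parse_ranges(position),
            "attributes": attributes, "description": description}


record = parse_data_line(
    'gene gene1 10001..16000 bgcolor=white;glyph=dot;description="a very special gene"')
print(record)
```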
You can place a comment in the annotation file by preceding it with a pound sign (#). Everything following the pound sign is ignored:
# this is a comment
[GENERAL]
description = no source
max segment = 999_999_999_999_999

# Web site configuration info
tmpimages = /gbrowse/tmp
stylesheet = /gbrowse/gbrowse.css
buttons = /gbrowse/images/buttons

[karyotype]
cheight = 500 # max chromosome height
cwidth = 25 # chromosome width
cytoband_track = ideogram
bands = chromosome_band cytoband centromere
useform = 1 # print a data entry form if there are no CGI parameters

##################################################################
# For a customized input form, the following options would be
# uncommented and filled with HTML code
##################################################################
# This option would contain HTML code for a customized page header
#header =
#
# This option would contain HTML code for a customized page footer
#footer =
#
# This option would contain HTML code for a customized input form
#input_form =
###################################################################

# Default glyph settings
[TRACK DEFAULTS]
glyph = triangle
height = 8
bgcolor = black
fgcolor = black
description = 0
label = 0
orient = E
point = 1

[ideogram]
glyph = ideogram
fgcolor = black
bgcolor = gneg:white gpos25:silver gpos50:gray gpos:gray gpos75:darkgray gpos100:black gvar:var stalk:#666666
arcradius = 7
height = 20
bump = 0
Example:
# This is a basic configuration to draw chromosome ideograms for the
# human genome. It is a standalone file but could also be part
# of a normal gbrowse configuration file.
[GENERAL]
description = human
db_adaptor = Bio::DB::GFF
# this configuration just uses a simple flatfile
db_args = -adaptor memory -gff 'usr/local/apache/htdocs/gbrowse/databases/ideograms/human_cytobands.gff'

# Where temporary images are stored
tmpimages = /gbrowse/tmp

# gbrowse_karyotype-specific configuration
[karyotype]
max_height = 400
cwidth = 15
chromosome = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
rows = auto
cytoband_track = CYT:karyotype
band_labels = 1

# this controls how the chromosomes are drawn
[CYT:karyotype]
feature = cytoband chromosome_band centromere
glyph = ideogram
fgcolor = black
bgcolor = gneg:white gpos25:silver gpos50:gray gpos:gray gpos75:darkgray gpos100:black gvar:var stalk:#666666
arcradius = 7
height = 21

# Other built-in feature configuration, if any, would be added here
The CGI script can be used as an interactive browser using the useform=1 option.
In the configuration file, a customized data entry form can be specified using the following options:
header = ...Page header boilerplate (Visible title, etc.)
input_form = ...HTML to replace everything between the header and footer
footer = ...Page footer
If the useform option is specified in the configuration file but the above fields are not, reasonable, albeit plain, defaults will be supplied by the CGI script.
chromosome = chr1 chr2 chr3
This entry is required to specify which chromosomes to display and in which order. The space-delimited list should contain reference sequences that exist in your database or GFF flat file.
cytoband = CYT:karyotype
This specifies which built-in feature corresponds to the chromosome band data. If there are no band data, blank chromosomes will be drawn. However, a configuration stanza is still necessary to specify default chromosome attributes.
feature = cytoband chromosome_band centromere
List individual feature types (typically chromosome_band and centromere). 'cytoband' is also recognized but it is not a valid SO term. List actual features, not aggregators.
glyph = ideogram
The ideogram glyph would be most commonly used in combination with cytoband data. An alternative for scored, non-cytoband features (for example local gene densities, recombination rates, etc) is the heat_map_ideogram glyph, which can be used to draw chromosomes with color-coded bands representing scores, frequencies etc. The chromosome-like look can be enhanced by including the centromere amongst the scored "bands".
The starting point for drawing chromosome ideograms is the cytoband data. These data can be obtained from the UCSC genome browser, from NCBI and from EnsEMBL and other sources such as the primary literature. Once obtained, the cytobands are converted to GFF to be loaded into a Bio::DB::GFF database or saved as a flatfile.
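As one possible way to do the conversion for UCSC data, the Python sketch below turns rows in the UCSC cytoBand table layout (chrom, 0-based chromStart, chromEnd, name, gieStain) into GFF3 lines. The function name, the 'ucsc' source column and the sample rows are assumptions made for illustration, and the rows are not real coordinates. The two adjacent 'acen' bands of a chromosome are merged into a single centromere feature.

```python
def cytoband_to_gff3(rows):
    """Convert (chrom, start0, end, name, stain) rows to GFF3 text.

    start0 is 0-based as in the UCSC table; GFF3 coordinates are 1-based.
    """
    lines = []
    lengths = {}     # chromosome -> largest end coordinate seen
    cent_start = {}  # chromosome -> start of its first 'acen' band
    for chrom, start0, end, name, stain in rows:
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        start = start0 + 1
        lengths[chrom] = max(lengths.get(chrom, 0), end)
        if stain == "acen":
            if chrom not in cent_start:
                cent_start[chrom] = start  # wait for the second acen band
                continue
            method = "centromere"
            start = cent_start[chrom]
            attrs = f"Parent={chrom};Name={chrom}_cent;Alias={chrom}_cent"
        else:
            method = "chromosome_band"
            attrs = f"Parent={chrom};Name={name};Alias={chrom}{name};Stain={stain}"
        lines.append("\t".join(
            [chrom, "ucsc", method, str(start), str(end), ".", ".", ".", attrs]))
    header = ["##gff-version 3"]
    header += [f"##sequence-region {c} 1 {n}" for c, n in lengths.items()]
    return "\n".join(header + lines) + "\n"


# Illustrative rows only, not real band coordinates.
rows = [
    ("chr21", 0, 2900000, "p13", "gvar"),
    ("chr21", 10000000, 11000000, "p11.1", "acen"),
    ("chr21", 11000000, 13200000, "q11.1", "acen"),
    ("chr21", 13200000, 15300000, "q11.2", "gneg"),
]
gff = cytoband_to_gff3(rows)
print(gff)
```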
Below is an example of GFF3-format cytoband data. Note that
the Parent= tag is used to facilitate aggregation of bands into
their parent chromosomes. Preprocessed GFF files for mouse, human
and rat can be found in the html (or htdocs) directory at
gbrowse/databases/ideogram.
[karyotype]
# This option tells the CGI script which features to interpret as
# "cytoband" features to be drawn on the chromosome ideograms.
# The three feature types below are assumed by default but
# other types can be used as well.
bands = chromosome_band centromere cytoband

# This is where the names and order of the chromosomes are specified
chromosome = 1 2 3 4 5 6

# This option specifies that band labels should be added
band_labels = 1

# This option specifies the configuration section that applies to the chromosome ideograms
cytoband_track = ideogram

# Largest chromosome height
cheight = 500

# This section provides default style options
# for the chromosomes. Note the specially formatted bgcolor (see below)
[ideogram]
glyph = ideogram
fgcolor = black
bgcolor = gneg:white gpos25:silver gpos50:gray gpos:gray gpos75:darkgray gpos100:black gvar:var stalk:#666666
arcradius = 7
height = 20
bump = 0
##gff-version 3
##sequence-region 1 1 245522847
1 ensembl chromosome_band 1 2300000 . . . Parent=1;Name=p36.33;Alias=1p36.33;Stain=gneg
1 ensembl chromosome_band 2300001 5300000 . . . Parent=1;Name=p36.32;Alias=1p36.32;Stain=gpos25
1 ensembl chromosome_band 15600001 20200000 . . . Parent=1;Name=p36.13;Alias=1p36.13;Stain=gneg
1 ensembl centromere 121000001 127900000 . . . Parent=1;Name=1_cent;Alias=1_cent

Note: If a GFF flatfile is being used as the database for chromosome banding data, be sure to have a 'sequence-region' directive for each chromosome.
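The requirement in the note can be checked before loading a flatfile. The Python below is a minimal sketch; the function name `missing_sequence_regions` is hypothetical and the chromosome 2 line in the sample is invented to trigger the check.

```python
def missing_sequence_regions(gff_text):
    """Return reference names used in feature lines that have no
    '##sequence-region' directive."""
    declared = set()
    used = set()
    for line in gff_text.splitlines():
        if line.startswith("##sequence-region"):
            fields = line.split()
            if len(fields) >= 2:
                declared.add(fields[1])
        elif line.strip() and not line.startswith("#"):
            # The first whitespace-delimited field is the reference sequence.
            used.add(line.split(None, 1)[0])
    return sorted(used - declared)


sample = (
    "##gff-version 3\n"
    "##sequence-region 1 1 245522847\n"
    "1\tensembl\tchromosome_band\t1\t2300000\t.\t.\t.\tParent=1;Name=p36.33\n"
    # Hypothetical line for a chromosome with no directive:
    "2\tensembl\tchromosome_band\t1\t4300000\t.\t.\t.\tParent=2;Name=p25.3\n"
)
missing = missing_sequence_regions(sample)
print(missing)
```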
Ensembl has cytoband data for human, mouse and rat. Below is an example of a script that retrieves cytoband features from the public mysql database and converts them to GFF3.
#!/usr/bin/perl -w
use strict;
use DBI;

my $database = shift;
my $host = 'ensembldb.ensembl.org';
my $query = 'SELECT name,seq_region_start,seq_region_end,band,stain FROM seq_region,karyotype WHERE seq_region.seq_region_id = karyotype.seq_region_id;';

my $dbh = DBI->connect( "dbi:mysql:$database:$host", 'anonymous' )
    or die DBI->errstr;

my $sth = $dbh->prepare($query) or die $dbh->errstr;
$sth->execute or die $sth->errstr;

my ($cent_start,$prev_chr,$chr_end,$segments,$gff);
my $chr_start = 1;
while (my @band = $sth->fetchrow_array ) {
    my ($chr,$start,$end,$band,$stain) = @band;
    my $class = 'Chromosome';
    my $method;

    $chr =~ s/chr//;
    # The two 'acen' bands of a chromosome are merged into one centromere
    if ($stain eq 'acen' && !$cent_start) {
        $cent_start = $start;
        next;
    }
    elsif ($cent_start) {
        $method = 'centromere';
        $band = "$chr\_cent";
        $start = $cent_start;
        $stain = '';
        $cent_start = 0;
    }
    else {
        $method = 'chromosome_band'; # SO term
    }

    my $alias = $method =~ /centromere/i ? $band : $chr.$band;
    $gff .= join("\t", $chr, 'ensembl', lc $method, $start, $end,
                 qw/. . ./,qq{Parent=$chr;Name=$band;Alias=$alias});
    $gff .= $stain ? ";Stain=$stain\n" : "\n";

    # Compare names exactly; a pattern match would treat '11' as matching '1'
    if ($prev_chr && $prev_chr ne $chr) {
        $segments .= "\#\#sequence-region $prev_chr $chr_start $chr_end\n";
        $chr_start = 1;
    }

    $prev_chr = $chr;
    $chr_end = $end;
}

$segments .= "\#\#sequence-region $prev_chr $chr_start $chr_end\n";

print "##gff-version 3\n";
print $segments,$gff;

__END__
# Currently ideograms for human, rat and mouse are available
# To see the current database list, try the command:

mysql -uanonymous -hensembldb.ensembl.org -e 'show databases' \
| grep core | grep 'sapiens\|rattus\|mus' | grep -v 'expression'
db_adaptor = Bio::DB::GFF
db_args = -dsn dbi:mysql:database=my_database;host=localhost;user=user;password=passwd
db_adaptor = Bio::DB::GFF
db_args = -adaptor memory -gff '/var/www/html/gbrowse/databases/ideograms/mouse_cytobands.gff'
Ensure that the GFF data in the flatfile is formatted as the
##gff-version 3
1 calculated gene_density 1 1000000 35 . . Parent=Dense:1
1 calculated gene_density 1000001 2000000 45 . . Parent=Dense:1
1 calculated gene_density 2000001 3000000 18 . . Parent=Dense:1
1 calculated gene_density 3000001 4000000 21 . . Parent=Dense:1
1 calculated gene_density 4000001 5000000 1 . . Parent=Dense:1
1 calculated gene_density 5000001 6000000 1 . . Parent=Dense:1
1 calculated gene_density 6000001 7000000 21 . . Parent=Dense:1
1 calculated gene_density 7000001 8000000 12 . . Parent=Dense:1
1 calculated gene_density 8000001 9000000 18 . . Parent=Dense:1
1 calculated gene_density 9000001 10000000 20 . . Parent=Dense:1
1 calculated gene_density 10000001 11000000 18 . . Parent=Dense:1
1 calculated gene_density 11000001 12000000 28 . . Parent=Dense:1
1 calculated gene_density 12000001 13000000 25 . . Parent=Dense:1
1 calculated gene_density 13000001 14000000 10 . . Parent=Dense:1
1 calculated gene_density 14000001 15000000 3 . . Parent=Dense:1
1 calculated gene_density 15000001 16000000 27 . . Parent=Dense:1
1 calculated gene_density 16000001 17000000 45 . . Parent=Dense:1
1 calculated gene_density 17000001 18000000 15 . . Parent=Dense:1
1 calculated gene_density 18000001 19000000 6 . . Parent=Dense:1
1 calculated gene_density 19000001 20000000 23 . . Parent=Dense:1
1 calculated gene_density 20000001 21000000 20 . . Parent=Dense:1
1 calculated gene_density 21000001 22000000 14 . . Parent=Dense:1
1 calculated gene_density 22000001 23000000 17 . . Parent=Dense:1
1 calculated gene_density 23000001 24000000 27 . . Parent=Dense:1
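Scored data in the layout above could be produced from a list of gene start positions roughly as follows. This Python sketch is an assumption about how one might compute the scores (a plain count of gene starts per fixed-size window); the function name `density_gff` and the sample positions are illustrative and not part of the distribution.

```python
from collections import Counter


def density_gff(chrom, gene_starts, chrom_length, window=1_000_000):
    """Count gene starts per window and emit one gene_density line per window."""
    # Window index 0 covers positions 1..window, index 1 the next window, etc.
    counts = Counter((pos - 1) // window for pos in gene_starts)
    lines = ["##gff-version 3"]
    n_windows = (chrom_length + window - 1) // window
    for index in range(n_windows):
        start = index * window + 1
        end = min((index + 1) * window, chrom_length)
        fields = [chrom, "calculated", "gene_density", start, end,
                  counts.get(index, 0), ".", ".", f"Parent=Dense:{chrom}"]
        lines.append("\t".join(str(field) for field in fields))
    return "\n".join(lines) + "\n"


# Made-up gene start positions on a 3 Mb stretch of chromosome 1.
text = density_gff("1", [5, 999_999, 1_000_000, 1_000_001, 2_500_000], 3_000_000)
print(text)
```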
description = Human gene density
db_adaptor = Bio::DB::GFF
db_args = -dsn dbi:mysql:database=human;host=localhost;user=nobody

# Web site configuration info
tmpimages = /gbrowse/tmp
stylesheet = /gbrowse/gbrowse.css
buttons = /gbrowse/images/buttons

# how long to save cached images
image cachetime = 2_592_000

[karyotype]
cheight = 500
cwidth = 20
chromosome = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
cytoband_track = CYT

[CYT]
feature = gene_density centromere
glyph = heat_map_ideogram # note the different glyph
start_color = white
end_color = red
min_score = 0
max_score = 40
fgcolor = black
##sequence-region chromosome_name start end
<form method=POST name="f1" action="/cgi-bin/gbrowse_karyotype_devel" target="iframe1">
<input type="hidden" name="embed" value=1>
<input type="hidden" name="band_labels" value=1>
<input type=submit name="submit" value="Click to display chromosome 21">

<!-- The textarea is not visible -->
<div style="width:0px;height:0px;visibility:hidden">
<textarea name="featuretext" type=hidden>
## Configuration
[gene]
glyph = triangle
bgcolor = red
fgcolor = red
height = 7
bump = 1
point = 1

[mutation]
glyph = lightning
bgcolor = yellow
fgcolor = black
height = 15

# chromosome 21
reference=21
gene gene1 20000001..20006000 "Some sort of gene"
gene gene2 30066001..30069000 bgcolor=white;desc="ADH gene"
gene gene3 50000001..50006000 bgcolor=blue;fgcolor=blue;glyph=dot;description="Unknown gene"
gene gene6 40000001..40006000 glyph=dot;bgcolor=white;fgcolor=black;height=15
mutation allele2 30006001..30009000 link=http://www.nasa.gov;description="cosmic radiation damage"

# chromosome 21 cytoband data
##gff-version 3
##sequence-region 21 1 46944323
21 ensembl chromosome_band 1 2900000 . . . Parent=21;Name=p13;Alias=21p13;Stain=gvar
21 ensembl chromosome_band 2900001 6300000 . . . Parent=21;Name=p12;Alias=21p12;Stain=stalk
21 ensembl chromosome_band 6300001 10000000 . . . Parent=21;Name=p11.2;Alias=21p11.2;Stain=gvar
21 ensembl chromosome_band 13200001 15300000 . . . Parent=21;Name=q11.2;Alias=21q11.2;Stain=gneg
21 ensembl chromosome_band 15300001 22900000 . . . Parent=21;Name=q21.1;Alias=21q21.1;Stain=gpos100
21 ensembl chromosome_band 22900001 25800000 . . . Parent=21;Name=q21.2;Alias=21q21.2;Stain=gneg
21 ensembl chromosome_band 25800001 30500000 . . . Parent=21;Name=q21.3;Alias=21q21.3;Stain=gpos75
21 ensembl chromosome_band 30500001 34700000 . . . Parent=21;Name=q22.11;Alias=21q22.11;Stain=gneg
21 ensembl chromosome_band 34700001 36700000 . . . Parent=21;Name=q22.12;Alias=21q22.12;Stain=gpos50
21 ensembl chromosome_band 36700001 38600000 . . . Parent=21;Name=q22.13;Alias=21q22.13;Stain=gneg
21 ensembl chromosome_band 38600001 41400000 . . . Parent=21;Name=q22.2;Alias=21q22.2;Stain=gpos50
21 ensembl chromosome_band 41400001 46944323 . . . Parent=21;Name=q22.3;Alias=21q22.3;Stain=gneg
21 ensembl centromere 10000001 13200000 . . . Parent=21;Name=21_cent;Alias=21_cent
</textarea>
</div>
</form>

<!-- The iframe where the image and map will be embedded -->
<iframe name="iframe1" id="iframe1" height=600 frameborder=0></iframe>
Please report bugs to the author.
Sheldon McKay mckays@cshl.edu
Copyright (c) 2006 Cold Spring Harbor Laboratory
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
For additional help, see The GMOD Project pages.