OmniMapFree - Design - data files

Updated: 14 July 2011.

Data files

All data files used by OmniMapFree are tab-delimited text files that can be examined in a text editor, e.g. NotePad or WordPad. The data files contain three types of lines:

  • Lines beginning with a "#" symbol. The first one or more #-lines contain information used to draw each feature. Other #-lines are either comments or metadata describing the data fields.
  • Blank lines. These make the file easier for the user to read and are ignored by the software.
  • All other lines contain actual data in tab-delimited text format. Each line describes the position, size and other information for one gene or feature. In all data files this consists of fields for chromosome id, start position, end position, strand and gene id. Different types of data file also contain other information. It is recommended (but not essential) that the data lines are sorted by chromosome id and then by start position, so that it is easier to find a gene or feature by position.
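The three line types above can be told apart with a few lines of code. The following Python sketch is only an illustration and is not part of OmniMapFree: the names Feature, parse_data_lines and sort_features are hypothetical, and it assumes the five common fields appear in the order listed above (chromosome id, start, end, strand, gene id), with any type-specific fields following them.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    chromosome: str
    start: int
    end: int
    strand: str
    gene_id: str
    extra: tuple  # any further type-specific fields, kept as strings


def parse_data_lines(lines):
    """Split lines into (#-lines, features) according to the three line types."""
    hash_lines, features = [], []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # blank lines are ignored
        if line.startswith("#"):
            hash_lines.append(line)  # drawing information, comments or metadata
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            raise ValueError(f"expected at least 5 tab-delimited fields: {line!r}")
        # Assumed column order: chromosome id, start, end, strand, gene id.
        chrom, start, end, strand, gene_id = fields[:5]
        features.append(
            Feature(chrom, int(start), int(end), strand, gene_id, tuple(fields[5:]))
        )
    return hash_lines, features


def sort_features(features):
    """Sort by chromosome id, then start position (the recommended order)."""
    return sorted(features, key=lambda f: (f.chromosome, f.start))
```

Calling parse_data_lines on an open file handle returns the #-lines and the parsed data lines separately; sort_features can then be used to put the data lines in the recommended order before writing them back out.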

OmniMapFree uses five types of data file, which are recognized by their file extensions:

Data file  Type of data displayed
posn       Any feature with just a position and size on a chromosome. For example, features could be genes, primer sequences, genetic markers, etc. Even features as small as 1 base pair, e.g. SNPs, can be displayed.
blast      Any sequence region identified in a BLAST search. The data includes e-values, and OmniMapFree can selectively display regions with e-values less than a user-defined cut-off.
expr       Allows different items in the same data file to be displayed in different colours depending on the last value on each line of data. Useful for microarray expression data, e.g. all up-regulated genes could be in red and all down-regulated genes in blue.
freq       Used to display frequency data, for example the recombination frequency between genetic markers, or %GC. The data is displayed as a user-defined colour gradient. The colour of each region is determined by the last value on each line of data.
graph      Used to draw a graph or histogram along a chromosome. For example, this will display a graph of SNP density along a chromosome if the data includes the number of SNPs per 50,000 nt as the last value on each line.
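As an illustration of how a file type could be recognized from its extension, and how the last value on a data line could be read for the expr, freq and graph types, here is a minimal Python sketch. It is not OmniMapFree's own code: the names DATA_FILE_TYPES, data_file_type and last_value are hypothetical, and treating extensions as case-insensitive and the last value as numeric are assumptions.

```python
from pathlib import Path

# The five data file types, keyed by file extension.
DATA_FILE_TYPES = {
    "posn": "features with a position and size",
    "blast": "sequence regions from a blast search, with e-values",
    "expr": "items coloured by the last value on each line",
    "freq": "frequency data shown as a colour gradient",
    "graph": "graph or histogram along a chromosome",
}


def data_file_type(filename):
    """Return the data file type for a filename, or None if not recognized."""
    # Assumption: the extension is compared case-insensitively.
    ext = Path(filename).suffix.lstrip(".").lower()
    return ext if ext in DATA_FILE_TYPES else None


def last_value(data_line):
    """Return the last tab-delimited field of a data line as a number."""
    # Assumption: the last field is numeric, as in the expr, freq and graph examples.
    return float(data_line.rstrip("\r\n").split("\t")[-1])
```

For example, data_file_type("snps.graph") returns "graph", and last_value applied to a line from that file would return the SNP count used to draw the graph at that position.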