Revision 423M

org.bridgedb
Class DataSource

java.lang.Object
  extended by org.bridgedb.DataSource

public final class DataSource
extends java.lang.Object

Contains information about a certain DataSource, such as its full name, system code, main URL and id-specific URLs.

The DataSource class uses the extensible enum pattern. You can't instantiate DataSources directly. Instead, you have to use one of the constants from the org.bridgedb.bio module, such as BioDataSource.ENSEMBL, or the "getBySystemCode" or "getByFullName" methods. These methods return a predefined DataSource object if it exists. If no predefined DataSource exists for a requested SystemCode, a new one is created automatically. This can be used when the user requests new, unknown data sources. If you call getBySystemCode twice with the same argument, you are guaranteed to get the same object back. However, there is no way to combine a new SystemCode with a new FullName unless you use the "register" method.

This way any number of pre-defined DataSources can be used, but plugins can define new ones and you can handle unknown data sources in the same way as predefined ones.
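
The extensible enum pattern described above can be sketched in a few lines. This is a minimal, hypothetical stand-in (the class name SketchSource and its internals are assumptions made for illustration, not the BridgeDb implementation); unlike the real register method, which returns a DataSource.Builder, this simplified version returns the instance directly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for DataSource; not the BridgeDb implementation. */
final class SketchSource {
    // One registry per lookup key, so each key maps to exactly one instance.
    private static final Map<String, SketchSource> BY_CODE = new HashMap<>();
    private static final Map<String, SketchSource> BY_NAME = new HashMap<>();

    private final String systemCode; // null if only the full name is known
    private final String fullName;   // null if only the system code is known

    // Private constructor: callers must go through the static factories.
    private SketchSource(String systemCode, String fullName) {
        this.systemCode = systemCode;
        this.fullName = fullName;
    }

    /** Returns the instance for this code, creating it on first request. */
    static synchronized SketchSource getBySystemCode(String code) {
        return BY_CODE.computeIfAbsent(code, c -> new SketchSource(c, null));
    }

    /** Returns the instance for this name, creating it on first request. */
    static synchronized SketchSource getByFullName(String name) {
        return BY_NAME.computeIfAbsent(name, n -> new SketchSource(null, n));
    }

    /** Registers a code and a name together, so both lookups share one instance. */
    static synchronized SketchSource register(String code, String name) {
        SketchSource s = new SketchSource(code, name);
        BY_CODE.put(code, s);
        BY_NAME.put(name, s);
        return s;
    }

    String getSystemCode() { return systemCode; }
    String getFullName() { return fullName; }
}
```

Because every instance comes from a registry, two lookups with the same key return the same object, so callers can compare instances with == much as they would enum constants.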

Definitions for common DataSources can be found in BioDataSource.


Nested Class Summary
static class DataSource.Builder
          Uses builder pattern to set optional attributes for a DataSource.
 
Method Summary
static DataSource getByFullName(java.lang.String fullName)
          returns pre-existing DataSource object by full name, if it exists, or creates a new one.
static DataSource getBySystemCode(java.lang.String systemCode)
           
static java.util.Set<DataSource> getDataSources()
          get all registered datasources as a set.
 Xref getExample()
           
static java.util.Set<DataSource> getFilteredSet(java.lang.Boolean primary, java.lang.Boolean metabolite, java.lang.Object o)
          returns a filtered subset of available datasources.
 java.lang.String getFullName()
          returns full name of DataSource, e.g. "Ensembl".
static java.util.List<java.lang.String> getFullNames()
          Get a list of all non-null full names.
 java.lang.String getMainUrl()
          Return the main Url for this datasource, that can be used to refer to the datasource in general.
 java.lang.Object getOrganism()
           
 java.lang.String getSystemCode()
          returns GenMAPP SystemCode, e.g. "En".
 java.lang.String getType()
           
 java.lang.String getUrl(java.lang.String id)
          Turn id into url pointing to info page on the web, e.g. "http://www.ensembl.org/get?id=ENSG..."
 java.lang.String getURN(java.lang.String id)
          Creates a global identifier.
 boolean isMetabolite()
           
 boolean isPrimary()
           
static DataSource.Builder register(java.lang.String sysCode, java.lang.String fullName)
          Register a new DataSource with (optional) detailed information.
 java.lang.String toString()
          The string representation of a DataSource is equal to its full name.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Method Detail

getUrl

public java.lang.String getUrl(java.lang.String id)
Turn id into url pointing to info page on the web, e.g. "http://www.ensembl.org/get?id=ENSG..."

Parameters:
id - identifier to use in url
Returns:
Url
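
One plausible way to turn an id into a URL is to substitute it into a per-source template. The sketch below assumes a "$id" placeholder and uses a hypothetical helper class (UrlTemplateSketch) and an example.org address; none of these names come from this page.

```java
/** Hypothetical helper; illustrates id-to-URL substitution only. */
final class UrlTemplateSketch {
    // Placeholder token inside the template. An assumption of this sketch.
    static final String ID_TOKEN = "$id";

    /** Returns the template with the token replaced by id, or null if no template is known. */
    static String fill(String template, String id) {
        if (template == null) {
            return null;
        }
        // String.replace does literal (non-regex) replacement, so "$" needs no escaping.
        return template.replace(ID_TOKEN, id);
    }
}
```

For example, UrlTemplateSketch.fill("http://www.example.org/get?id=$id", "ENSG1") yields "http://www.example.org/get?id=ENSG1". The sketch does not URL-encode the id; a real caller may need to.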

getFullName

public java.lang.String getFullName()
returns full name of DataSource, e.g. "Ensembl". May return null if only the system code is known. Also used as identifier in GPML.

Returns:
full name of DataSource

getSystemCode

public java.lang.String getSystemCode()
returns GenMAPP SystemCode, e.g. "En". May return null if only the full name is known. Also used as identifier in:
  1. Gdb databases,
  2. Gex databases,
  3. imported data,
  4. the Mapp format.
We should try not to use the system code anywhere outside these 4 uses.

Returns:
systemcode, a short unique code.

getMainUrl

public java.lang.String getMainUrl()
Return the main Url for this datasource, that can be used to refer to the datasource in general. (e.g. http://www.ensembl.org/) May return null in case the main url is unknown.

Returns:
main url

getType

public java.lang.String getType()
Returns:
type of entity that this DataSource describes, for example "metabolite", "gene", "protein" or "probe"

getURN

public java.lang.String getURN(java.lang.String id)
Creates a global identifier. It uses the MIRIAM data type list to create a MIRIAM URI like "urn:miriam:uniprot:P12345", or if this DataSource is not included in the MIRIAM data types list, a bridgedb URI.

Parameters:
id - Id to generate URN from.
Returns:
the URN.
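
The lookup-then-fallback logic described for getURN might look roughly like the sketch below. The one-entry table and the "urn:example:" fallback prefix are placeholders invented for illustration: this page does not show the actual MIRIAM data type list or the exact form of the bridgedb URI.

```java
import java.util.Map;

/** Hypothetical sketch of MIRIAM-or-fallback URN construction. */
final class UrnSketch {
    // Hypothetical excerpt of a full-name to MIRIAM-namespace table.
    static final Map<String, String> MIRIAM = Map.of("Uniprot", "uniprot");

    // Placeholder prefix for sources missing from the table (an assumption).
    static final String FALLBACK_PREFIX = "urn:example:";

    /** Builds a URN for id; fullName must be non-null. */
    static String toUrn(String fullName, String id) {
        String namespace = MIRIAM.get(fullName);
        if (namespace != null) {
            return "urn:miriam:" + namespace + ":" + id;
        }
        // Source is not in the table: fall back to a locally defined scheme.
        return FALLBACK_PREFIX + fullName + ":" + id;
    }
}
```

With this table, UrnSketch.toUrn("Uniprot", "P12345") gives "urn:miriam:uniprot:P12345", matching the example in the method description.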

register

public static DataSource.Builder register(java.lang.String sysCode,
                                          java.lang.String fullName)
Register a new DataSource with (optional) detailed information. This can be used by other modules to define new DataSources.

Parameters:
sysCode - short unique code of 1 to 4 letters, originally used by GenMAPP
fullName - full name used in GPML. Must be 20 characters or fewer
Returns:
Builder that can be used for adding detailed information.
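
A module that defines new DataSources may want to check its arguments against the documented constraints before calling register. The helper below is a hypothetical pre-check written only from the parameter descriptions above; whether register itself enforces these limits is not stated here.

```java
/** Hypothetical pre-check for the documented register() argument constraints. */
final class RegisterArgsSketch {
    /** True if sysCode consists of 1 to 4 ASCII letters. */
    static boolean isValidSysCode(String sysCode) {
        return sysCode != null && sysCode.matches("[A-Za-z]{1,4}");
    }

    /** True if fullName is non-empty and at most 20 characters long. */
    static boolean isValidFullName(String fullName) {
        return fullName != null && !fullName.isEmpty() && fullName.length() <= 20;
    }
}
```

The sketch reads "letters" as ASCII letters, which is an interpretation of the parameter description rather than a documented rule.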

getBySystemCode

public static DataSource getBySystemCode(java.lang.String systemCode)
Parameters:
systemCode - short unique code to query for
Returns:
pre-existing DataSource object by system code, if it exists, or creates a new one.

getByFullName

public static DataSource getByFullName(java.lang.String fullName)
returns pre-existing DataSource object by full name, if it exists, or creates a new one.

Parameters:
fullName - full name to query for
Returns:
DataSource

getDataSources

public static java.util.Set<DataSource> getDataSources()
get all registered datasources as a set.

Returns:
set of all registered DataSources

getFilteredSet

public static java.util.Set<DataSource> getFilteredSet(java.lang.Boolean primary,
                                                       java.lang.Boolean metabolite,
                                                       java.lang.Object o)
returns a filtered subset of available datasources.

Parameters:
primary - Filter for specified primary-ness. If null, don't filter on primary-ness.
metabolite - Filter for specified metabolite-ness. If null, don't filter on metabolite-ness.
o - Filter for specified organism. If null, don't filter on organism.
Returns:
filtered set.
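
The three parameters act as tri-state filters: a null argument switches that criterion off. The sketch below shows the idea on a hypothetical Entry type (all names here are assumptions). It treats an entry without an organism as not matching a non-null organism filter; the real method may handle that case differently.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch of null-means-no-filter semantics. */
final class FilterSketch {
    /** Minimal stand-in holding the three filterable attributes. */
    static final class Entry {
        final String name;
        final boolean primary;
        final boolean metabolite;
        final Object organism; // null if multiple / not applicable

        Entry(String name, boolean primary, boolean metabolite, Object organism) {
            this.name = name;
            this.primary = primary;
            this.metabolite = metabolite;
            this.organism = organism;
        }
    }

    /** Keeps the entries that satisfy every non-null criterion. */
    static Set<Entry> filter(Iterable<Entry> all, Boolean primary,
                             Boolean metabolite, Object organism) {
        Set<Entry> result = new LinkedHashSet<>();
        for (Entry e : all) {
            boolean keep =
                (primary == null || primary.booleanValue() == e.primary)
                && (metabolite == null || metabolite.booleanValue() == e.metabolite)
                && (organism == null || organism.equals(e.organism));
            if (keep) {
                result.add(e);
            }
        }
        return result;
    }
}
```

Passing null for all three arguments returns every entry, while filter(all, true, null, null) keeps only the primary ones.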

getFullNames

public static java.util.List<java.lang.String> getFullNames()
Get a list of all non-null full names.

Warning: the ordering of this list is undefined. Two subsequent calls may give different results.

Returns:
List of full names
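
Since the ordering is undefined, a caller that displays the names (for instance in a drop-down) may want to sort a copy first. A minimal sketch, using a hypothetical helper class:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that gives an unordered name list a stable order. */
final class StableNamesSketch {
    /** Returns a new list sorted case-insensitively; the input is left untouched. */
    static List<String> sortedCopy(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }
}
```

Sorting a copy rather than the returned list avoids relying on whether that list is modifiable.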

toString

public java.lang.String toString()
The string representation of a DataSource is equal to its full name. (e.g. "Ensembl")

Overrides:
toString in class java.lang.Object
Returns:
String representation

getExample

public Xref getExample()
Returns:
example Xref, mostly for testing purposes

isPrimary

public boolean isPrimary()
Returns:
true if this is a primary DataSource, false otherwise. Primary DataSources are preferred when annotating models. A DataSource is primary if it is not of type probe, so e.g. Affymetrix or Agilent probes are not primary. All gene, protein and metabolite identifiers are primary.

isMetabolite

public boolean isMetabolite()
Returns:
true if this DataSource describes metabolites, false otherwise.

getOrganism

public java.lang.Object getOrganism()
Returns:
Organism that this DataSource describes, or null if multiple / not applicable.

Generated July 29 2010