Tripal
Chado Feature

Functions

 chado_autocomplete_feature ($string='')
 
 chado_reverse_compliment_sequence ($sequence)
 
 chado_get_feature_sequences ($feature, $options)
 
 chado_get_bulk_feature_sequences ($options)
 
 chado_get_fasta_defline ($feature, $notes='', $featureloc=NULL, $type='', $length=0)
 
 chado_get_location_string ($featureloc)
 

Detailed Description

Provides API functions for managing feature records in Chado, especially for retrieving relationships and for retrieving sequences derived from relationships and feature alignments.

Function Documentation

◆ chado_autocomplete_feature()

chado_autocomplete_feature (   $string = '')

Used for autocomplete in forms for identifying features.

Parameters
$string: The string to search for.
Returns
A JSON array of features that begin with the provided string.

◆ chado_get_bulk_feature_sequences()

chado_get_bulk_feature_sequences (   $options)

Retrieves sequences in bulk for the features that match the given options.

Parameters
$options: An associative array of options for selecting a feature. Valid keys include:
  • org_commonname: The common name of the organism for which sequences should be retrieved
  • genus: The genus of the organism for which sequences should be retrieved
  • species: The species of the organism for which sequences should be retrieved
  • analysis_name: The name of an analysis to which sequences belong. Only those that are associated with the analysis will be retrieved.
  • type: The type of feature (a sequence ontology term).
  • feature_name: the name of the feature. Can be an array of feature names.
  • feature_uname: the uniquename of the feature. Can be an array of feature unique names.
  • upstream: An integer specifying the number of upstream bases to include in the output.
  • downstream: An integer specifying the number of downstream bases to include in the output.
  • derive_from_parent: Set to '1' if the sequence should be obtained from the parent to which this feature is aligned.
  • aggregate: Set to '1' if the sequence should only contain sub features, excluding the sequence between sub features. For example, set this option to obtain just the coding sequence of an mRNA.
  • sub_feature_types: Only include sub features (or child features) of the types provided in the array
  • relationship_type: If a relationship name is provided (e.g. sequence_of) then any sequences that are in relationships of this type with matched sequences are also included
  • relationship_part: If a relationship type is provided in the preceding option, then relationship_part must be either 'object' or 'subject' to indicate which side of the relationship the matched features belong to.
  • width: Indicate the number of bases to use per line. A new line will be added after the specified number of bases on each line.
  • is_html: Set to '1' if the sequence is meant to be displayed on a web page. This will cause a <br> tag to separate lines of the FASTA sequence.
Returns
An array of sequences. Each sequence is an array with the following keys:
  • types: an array of feature types that were used to derive the sequence (e.g. from an aggregated sequence)
  • upstream: the number of upstream bases in the sequence
  • downstream: the number of downstream bases in the sequence
  • defline: the definition line used to create a FASTA sequence
  • residues: the residues
  • featureloc_id: the featureloc_id if from an alignment
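
As a rough illustration of how the options above fit together, the following sketch builds an $options array and passes it to chado_get_bulk_feature_sequences(). The organism, feature type, and width values are made up for the example, and combining derive_from_parent with aggregate to get coding sequence is an assumption based on the option descriptions rather than a tested recipe. The call is guarded because the function is only available inside a bootstrapped Drupal site with Tripal enabled.

```php
<?php
// Illustrative values only (assumptions); use records that exist in your
// own Chado database.
$options = [
  'genus' => 'Citrus',
  'species' => 'sinensis',
  'type' => 'mRNA',
  // Assumed combination: take sequence from the aligned parent, then keep
  // only the CDS sub features.
  'derive_from_parent' => 1,
  'aggregate' => 1,
  'sub_feature_types' => ['CDS'],
  // 60 bases per line, plain text line breaks.
  'width' => 60,
  'is_html' => 0,
];

// Outside of Tripal the function does not exist, so fall back to an empty
// result to keep this snippet runnable on its own.
$sequences = function_exists('chado_get_bulk_feature_sequences')
  ? chado_get_bulk_feature_sequences($options)
  : [];

print count($sequences) . " sequence(s) retrieved\n";
```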

◆ chado_get_fasta_defline()

chado_get_fasta_defline (   $feature,
  $notes = '',
  $featureloc = NULL,
  $type = '',
  $length = 0 
)

Returns a definition line that can be used in a FASTA file.

Parameters
$feature: A single feature object containing all the fields from the chado.feature table. Best case is to provide an object generated by the chado_generate_var() function.
$notes: Optional: additional notes to be added to the definition line.
$featureloc: Optional: a single featureloc object generated using chado_generate_var() that contains a record from the chado.featureloc table. Provide this if the sequence was obtained by using the alignment rather than from the feature.residues column.
$type: Optional: the type of sequence. By default the feature type is used.
$length: Optional: the length of the sequence.
Returns
A string of the format: uniquename|name|type|feature_id or if an alignment: srcfeature_name:fmin..fmax[+-]; alignment of uniquename|name|type|feature_id.
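
The two documented formats can be illustrated with a small stand-alone sketch. This is not Tripal's implementation: sketch_fasta_defline() is a hypothetical helper, the flat field names on the objects (type, srcfeature_name) are assumptions made for the example, and the $notes, $type, and $length parameters are left out. Coordinates are used exactly as given, without any adjustment.

```php
<?php
/**
 * Hypothetical sketch of the documented defline formats (not Tripal code).
 *
 * $feature and $featureloc are plain objects with made-up flat fields.
 */
function sketch_fasta_defline($feature, $featureloc = NULL) {
  // Base format: uniquename|name|type|feature_id
  $base = implode('|', [
    $feature->uniquename,
    $feature->name,
    $feature->type,
    $feature->feature_id,
  ]);
  if ($featureloc === NULL) {
    return $base;
  }
  // Map the strand value (1 or -1) to '+' or '-'; anything else gets no sign.
  $strand = '';
  if ($featureloc->strand == 1) {
    $strand = '+';
  }
  elseif ($featureloc->strand == -1) {
    $strand = '-';
  }
  // Alignment format: srcfeature_name:fmin..fmax[+-]; alignment of <base>
  $location = $featureloc->srcfeature_name . ':' . $featureloc->fmin
    . '..' . $featureloc->fmax . $strand;
  return $location . '; alignment of ' . $base;
}

// Made-up example records.
$feature = (object) [
  'uniquename' => 'gene0001',
  'name' => 'abc1',
  'type' => 'gene',
  'feature_id' => 42,
];
$featureloc = (object) [
  'srcfeature_name' => 'chr1',
  'fmin' => 100,
  'fmax' => 250,
  'strand' => -1,
];

print sketch_fasta_defline($feature) . "\n";
print sketch_fasta_defline($feature, $featureloc) . "\n";
```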

◆ chado_get_feature_sequences()

chado_get_feature_sequences (   $feature,
  $options 
)

Retrieves the sequences for a given feature.

If a feature has multiple alignments or multiple relationships then multiple sequences will be returned.

Parameters
$feature: An associative array describing the feature. Valid keys include:
  • feature_id: The feature_id of the feature for which the sequence will be retrieved.
  • name: The feature name. This will appear on the FASTA definition line.
  • parent_id: (optional) only retrieve a sequence if 'derive_from_parent' is true and the parent matches this ID.
  • featureloc_id: (optional) only retrieve a sequence if 'derive_from_parent' is true and the alignment is defined with this featureloc_id.
$options: An associative array of options. Valid keys include:
  • width: Indicate the number of bases to use per line. A new line will be added after the specified number of bases on each line.
  • is_html: Set to '1' if the sequence is meant to be displayed on a web page. This will cause a <br> tag to separate lines of the FASTA sequence.
  • derive_from_parent: Set to '1' if the sequence should be obtained from the parent to which this feature is aligned.
  • aggregate: Set to '1' if the sequence should only contain sub features, excluding the sequence between sub features. For example, set this option to obtain just the coding sequence of an mRNA.
  • upstream: An integer specifying the number of upstream bases to include in the output.
  • downstream: An integer specifying the number of downstream bases to include in the output.
  • sub_feature_types: Only include sub features (or child features) of the types provided in the array.
  • relationship_type: If a relationship name is provided (e.g. sequence_of) then any sequences that are in relationships of this type with matched sequences are also included.
  • relationship_part: If a relationship type is provided in the preceding option, then relationship_part must be either 'object' or 'subject' to indicate which side of the relationship the matched features belong to.
Returns
An array of matching sequences. Each sequence is an array with the following keys:
  • types: an array of feature types that were used to derive the sequence (e.g. from an aggregated sequence)
  • upstream: the number of upstream bases included in the sequence
  • downstream: the number of downstream bases included in the sequence
  • defline: the definition line used to create a FASTA sequence
  • residues: the residues
  • featureloc_id: the featureloc_id if the sequence is from an alignment
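
To make the effect of the width and is_html options more concrete, here is a minimal stand-alone sketch of wrapping residues at a fixed number of bases per line. sketch_wrap_residues() is a hypothetical helper, not the code Tripal uses, and the exact line separator Tripal emits for plain text is an assumption here.

```php
<?php
/**
 * Hypothetical sketch: break a residue string into lines of $width bases.
 *
 * Uses <br> between lines when $is_html is set, otherwise a newline
 * (the plain-text separator is an assumption).
 */
function sketch_wrap_residues($residues, $width, $is_html = FALSE) {
  // A non-positive width means "do not wrap".
  if ($width <= 0) {
    return $residues;
  }
  $separator = $is_html ? '<br>' : "\n";
  // Cut the string into chunks of $width characters and rejoin them.
  return implode($separator, str_split($residues, $width));
}

print sketch_wrap_residues('ACGTACGTAC', 4) . "\n";
print sketch_wrap_residues('ACGTACGTAC', 4, TRUE) . "\n";
```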

◆ chado_get_location_string()

chado_get_location_string (   $featureloc)

Returns a string representing a feature location in an alignment.

Parameters
$featureloc: A single featureloc object generated using chado_generate_var() that contains a record from the chado.featureloc table.
Returns
A string of the format: uniquename:featurelocmin..featurelocmax.strand

◆ chado_reverse_compliment_sequence()

chado_reverse_compliment_sequence (   $sequence)

Computes the reverse complement of a nucleotide sequence.

Parameters
$sequence: The nucleotide sequence.
Returns
An upper-case, reverse-complemented sequence.
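
For readers unfamiliar with the operation, the sketch below shows one way to compute an upper-case reverse complement in plain PHP. sketch_reverse_complement() is a hypothetical stand-in written for this example, not the Tripal function, and it only handles the four unambiguous DNA bases; any other character is passed through unchanged.

```php
<?php
/**
 * Hypothetical sketch of a reverse complement (not the Tripal function).
 */
function sketch_reverse_complement($sequence) {
  // Upper-case first so only upper-case bases need to be translated.
  $sequence = strtoupper($sequence);
  // Reverse the order of the bases.
  $sequence = strrev($sequence);
  // Swap each base for its complement: A<->T, C<->G.
  return strtr($sequence, 'ACGT', 'TGCA');
}

print sketch_reverse_complement('atgcc') . "\n"; // prints "GGCAT"
```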