Healthcare Headlines
Algorithms for Molecular Biology - Latest Articles
  • MRL and SuperFine+MRL: new supertree methods
    Background: Supertree methods combine trees on subsets of the full taxon set together to produce a tree on the entire set of taxa. Of the many supertree methods, the most popular is MRP (Matrix Representation with Parsimony), a method that operates by first encoding the input set of source trees by a large matrix (the "MRP matrix") over {0,1,?}, and then running maximum parsimony heuristics on the MRP matrix. Experimental studies evaluating MRP in comparison to other supertree methods have established that for large datasets, MRP generally produces trees of equal or greater accuracy than other methods, and can run on larger datasets. A recent development in supertree methods is SuperFine+MRP, a method that combines MRP with a divide-and-conquer approach, and produces more accurate trees in less time than MRP. In this paper we consider a new approach for supertree estimation, called MRL (Matrix Representation with Likelihood). MRL begins with the same MRP matrix, but then analyzes the MRP matrix using heuristics (such as RAxML) for 2-state Maximum Likelihood. Results: We compared MRP and SuperFine+MRP with MRL and SuperFine+MRL on simulated and biological datasets. We examined the MRP and MRL scores of each method on a wide range of datasets, as well as the resulting topological accuracy of the trees. Our experimental results show that MRL, coupled with a very good ML heuristic such as RAxML, produced more accurate trees than MRP, and MRL scores were more strongly correlated with topological accuracy than MRP scores. Conclusions: SuperFine+MRP when based upon a good MRP heuristic such as TNT, produces among the best scores for both MRP and MRL, and is generally faster and more topologically accurate than other supertree methods we tested.

  • A strand specific high resolution normalization method for chip-sequencing data employing multiple experimental control measurements
    Background: High-throughput sequencing is becoming the standard tool for investigating protein-DNA interactions or epigenetic modifications. However, the data generated will always contain noise due to e.g. repetitive regions or non-specific antibody interactions. The noise will appear in the form of a background distribution of reads that must be taken into account in the downstream analysis, for example when detecting enriched regions (peak-calling). Several reported peak-callers can take experimental measurements of background tag distribution into account when analysing a data set. Unfortunately, the background is only used to adjust peak calling and not as a pre-processing step that aims at discerning the signal from the background noise. A normalization procedure that extracts the signal of interest would be of universal use when investigating genomic patterns. Results: We formulated such a normalization method based on linear regression and made a proof-of-concept implementation in R and C++. It was tested on simulated as well as on publicly available ChIP-seq data on binding sites for two transcription factors, MAX and FOXA1 and two control samples, Input and IgG. We applied three different peak-callers to (i) raw (un-normalized) data using statistical background models and (ii) raw data with control samples as background and (iii) normalized data without additional control samples as background. The fraction of called regions containing the expected transcription factor binding motif was largest for the normalized data and evaluation with qPCR data for FOXA1 suggested higher sensitivity and specificity using normalized data over raw data with experimental background. Conclusions: The proposed method can handle several control samples allowing for correction of multiple sources of bias simultaneously. Our evaluation on both synthetic and experimental data suggests that the method is successful in removing background noise.

  • QTL/microarray approach using pathway information
    Background: A combined quantitative trait loci (QTL) and microarray-based approach is commonly used to find differentially expressed genes which are then identified based on the known function of a gene in the biological process governing the trait of interest. However, a low cutoff value in individual gene analyses may result in many genes with moderate but meaningful changes in expression being missed. Results: We modified a gene set analysis to identify intersection sets with significantly affected expression for which the changes in the individual gene sets are less significant. The gene expression profiles in liver tissues of four strains of mice from publicly available microarray sources were analyzed to detect trait-associated pathways using information on the QTL regions of blood concentrations of high density lipoproteins (HDL) cholesterol and insulin-like growth factor 1 (IGF-1). Several metabolic pathways related to HDL levels, including lipid metabolism, ABC transporters and cytochrome P450 pathways were detected for HDL QTL regions. Most of the pathways identified for the IGF-1 phenotype were signal transduction pathways associated with biological processes for IGF-1's regulation. Conclusion: We have developed a method of identifying pathways associated with a quantitative trait using information on QTL. Our approach provides insights into genotype-phenotype relations at the level of biological pathways which may help to elucidate the genetic architecture underlying variation in phenotypic traits.

  • A Partial Least Squares based algorithm for parsimonious variable selection
    Background: In genomics, a commonly encountered problem is to extract a subset of variables out of a large set of explanatory variables associated with one or several quantitative or qualitative response variables. An example is to identify associations between codon-usage and phylogeny based definitions of taxonomic groups at different taxonomic levels. Maximum understandability with the smallest number of selected variables, consistency of the selected variables, as well as variation of model performance on test data, are issues to be addressed for such problems. Results: We present an algorithm balancing the parsimony and the predictive performance of a model. The algorithm is based on variable selection using reduced-rank Partial Least Squares with a regularized elimination. Allowing a marginal decrease in model performance results in a substantial decrease in the number of selected variables. This significantly improves the understandability of the model. Within the approach we have tested and compared three different criteria commonly used in the Partial Least Squares modeling paradigm for variable selection: loading weights, regression coefficients and variable importance on projections. The algorithm is applied to a problem of identifying codon variations discriminating different bacterial taxa, which is of particular interest in classifying metagenomics samples. The results are compared with a classical forward selection algorithm, the much used Lasso algorithm as well as Soft-threshold Partial Least Squares variable selection. Conclusions: A regularized elimination algorithm based on Partial Least Squares produces results that increase understandability and consistency and reduce the classification error on test data compared to standard approaches.

  • ViennaRNA Package 2.0
    Background: Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties. Results: The ViennaRNA Package has been a widely used compilation of RNA secondary structure related computer programs for nearly two decades. Major changes in the structure of the standard energy model, the Turner 2004 parameters, the pervasive use of multi-core CPUs, and an increasing number of algorithmic variants prompted a major technical overhaul of both the underlying RNAlib and the interactive user programs. New features include an expanded repertoire of tools to assess RNA-RNA interactions and restricted ensembles of structures, additional output information such as centroid structures and maximum expected accuracy structures derived from base pairing probabilities, or z-scores for locally stable secondary structures, and support for input in fasta format. Updates were implemented without compromising the computational efficiency of the core algorithms and ensuring compatibility with earlier versions. Conclusions: The ViennaRNA Package 2.0, supporting concurrent computations via OpenMP, can be downloaded from www.tbi.univie.ac.at/RNA

  • Comparative analysis of the quality of a global algorithm and a local algorithm for alignment of two sequences
    Background: Sequence alignment algorithms are the key instruments for computer-assisted studies of biopolymers. It is important to take into account the "quality" of the obtained alignments, i.e. how closely the algorithms manage to restore the "gold standard" alignment (GS-alignment), which superimposes positions originating from the same position in the common ancestor of the compared sequences. A 3D-alignment is commonly used as an approximation of the GS-alignment, though not with full justification. Among the currently used pairwise alignment algorithms, the best quality is achieved by the algorithm of optimal alignment based on affine penalties for deletions (the Smith-Waterman algorithm). Nevertheless, the relative merits of the local and global versions of the algorithm have not been studied. Results: Using model series of amino acid sequence pairs, we studied the relative "quality" of results produced by local and global alignments versus (1) the relative length of similar parts of the sequences (their "cores") and their nonhomologous parts, and (2) relative positions of the core regions in the compared sequences. We obtained numerical values of the average quality (measured as accuracy and confidence) of the global alignment method and the local alignment method for evolutionary distances between homologous sequence parts from 30 to 240 PAM and for core lengths from 10% to 70% of the total length of the sequences, for all possible positions of homologous sequence parts relative to the centers of the sequences. Conclusion: We identified criteria that specify the conditions under which the local or the global alignment algorithm is preferable, depending on the positions and relative lengths of the cores and nonhomologous parts of the sequences to be aligned. When the core part of one sequence was positioned above the core of the other sequence, the global algorithm was more stable at longer evolutionary distances and larger nonhomologous parts than the local algorithm. Conversely, when the cores were positioned asymmetrically, the local algorithm was more stable at longer evolutionary distances and larger nonhomologous parts than the global algorithm. This opens the possibility of creating a combined method that generates more accurate alignments.

  • Random generation of RNA secondary structures according to native distributions
    Background: Random biological sequences are a topic of great interest in genome analysis since, according to a powerful paradigm, they represent the background noise from which the actual biological information must differentiate. Accordingly, the generation of random sequences has been investigated for a long time. Similarly, random objects of a more complicated structure like RNA molecules or proteins are of interest. Results: In this article, we present a new general framework for deriving algorithms for the non-uniform random generation of combinatorial objects according to the encoding and probability distribution implied by a stochastic context-free grammar. Briefly, the framework extends the well-known recursive method for (uniform) random generation and uses the popular framework of admissible specifications of combinatorial classes, introducing weighted combinatorial classes to allow for the non-uniform generation by means of unranking. This framework is used to derive an algorithm for the generation of RNA secondary structures of a given fixed size. We address the random generation of these structures according to a realistic distribution obtained from real-life data by using a very detailed context-free grammar (that models the class of RNA secondary structures by distinguishing between all known motifs in RNA structure). Compared to well-known sampling approaches used in several structure prediction tools (such as SFold), ours has two major advantages: First, after a preprocessing step in time O(n^2) for the computation of all weighted class sizes needed, with our approach a set of m random secondary structures of a given structure size n can be computed in worst-case time complexity O(m n log(n)) while other algorithms typically have a runtime in O(m n^2). Second, our approach works with integer arithmetic only, which is faster and saves us from all the discomforting details of using floating point arithmetic with logarithmized probabilities. Conclusion: A number of experimental results show that our random generation method produces realistic output, at least with respect to the appearance of the different structural motifs. The algorithm is available as a webservice at http://wwwagak.cs.uni-kl.de/NonUniRandGen and can be used for generating random secondary structures of any specified RNA type. A link to download an implementation of our method (in Wolfram Mathematica) can be found there, too.

  • ReCoil - an Algorithm for Compression of Extremely Large Datasets of DNA Data
    The growing volume of generated DNA sequencing data makes the problem of its long-term storage increasingly important. In this work we present ReCoil - an I/O efficient external memory algorithm designed for compression of very large collections of short-read DNA data. Typically each position of a DNA sequence is covered by multiple reads of a short-read dataset, and our algorithm makes use of the resulting redundancy to achieve a high compression rate. While compression based on encoding mismatches between the dataset and a similar reference can yield a high compression rate, a good-quality reference sequence may be unavailable. Instead, ReCoil's compression is based on encoding the differences between similar or overlapping reads. As such reads may appear at large distances from each other in the dataset and since random access memory is a limited resource, ReCoil is designed to work efficiently in external memory, leveraging the high bandwidth of modern hard disk drives.

  • Multi-membership gene regulation in pathway based microarray analysis
    Background: Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be used for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results: We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted Rand indexes and Hamming distance. All algorithms produce highly consistent gene-to-pathway allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions: We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes.

  • Exploiting bounded signal flow for graph orientation based on cause–effect pairs
    Background: We consider the following problem: Given an undirected network and a set of sender–receiver pairs, direct all edges such that the maximum number of "signal flows" defined by the pairs can be routed respecting edge directions. This problem has applications in understanding protein interaction based cell regulation mechanisms. Since this problem is NP-hard, research so far concentrated on polynomial-time approximation algorithms and tractable special cases. Results: We take the viewpoint of parameterized algorithmics and examine several parameters related to the maximum signal flow over vertices or edges. We provide several fixed-parameter tractability results, and in one case a sharp complexity dichotomy between a linear-time solvable case and a slightly more general NP-hard case. We examine the value of these parameters for several real-world network instances. Conclusions: Several biologically relevant special cases of the NP-hard problem can be solved to optimality. In this way, parameterized analysis yields both deeper insight into the computational complexity and practical solving strategies.
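
The first abstract in the list above (MRL and SuperFine+MRL) describes encoding a set of source trees as an "MRP matrix" over {0,1,?}. The following is a minimal sketch of one common way to build such a matrix, not the authors' implementation. It assumes rooted source trees written as nested Python tuples with taxon names as strings; the function names (leaves, clades, mrp_matrix) are hypothetical. Each non-trivial clade of each source tree becomes one column: taxa inside the clade get 1, other taxa of that tree get 0, and taxa missing from that tree get ?.

```python
from typing import Dict, List, Tuple, Union

# Assumed tree representation: a leaf is a taxon name, an internal
# node is a tuple of subtrees, e.g. (("A", "B"), ("C", "D")).
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> frozenset:
    """Return the set of taxon names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree: Tree) -> List[frozenset]:
    """Leaf sets of all internal nodes except the root."""
    found: List[frozenset] = []

    def visit(node: Tree, is_root: bool) -> None:
        if isinstance(node, str):
            return
        if not is_root:
            found.append(leaves(node))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found


def mrp_matrix(trees: List[Tree]) -> Dict[str, str]:
    """Map each taxon to its row of the matrix over {0, 1, ?}."""
    all_taxa = sorted(frozenset().union(*(leaves(t) for t in trees)))
    rows: Dict[str, List[str]] = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):  # one matrix column per clade
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon absent from this tree
                elif taxon in clade:
                    rows[taxon].append("1")
                else:
                    rows[taxon].append("0")
    return {taxon: "".join(chars) for taxon, chars in rows.items()}


if __name__ == "__main__":
    # Two small source trees on overlapping taxon sets.
    source_trees = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
    for taxon, row in mrp_matrix(source_trees).items():
        print(taxon, row)
```

The resulting rows would then be handed to a maximum parsimony heuristic (MRP) or a 2-state maximum likelihood heuristic (MRL), as the abstract describes; that analysis step is outside this sketch.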


Our other Physiatry Related Sites by PM&R Resources R. Wilkerson