Program to Detect
Palindrome Concentrations
in an RNA Sequence
Clemente Aguilar
Bioinformatics
Program, The University of Texas at El Paso
Abstract
Palindromes are
especially interesting words in genomic sequences because of their
varied biological roles: among other functions, they can
serve as replication origins or transcription sites in some
viruses. We are developing a program to estimate the
concentration of palindromes in a given sequence. The program is
expected to be useful for locating concentrations of palindromes
in a given RNA sequence and classifying them with a scoring
system. The program is written in C and has interfaces with
EMBOSS and the R statistical software.
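As a rough illustration of the idea, the sketch below finds palindromes in an RNA string and counts how many start in each window. It is written in Python rather than the C of the actual program, and it assumes that "palindrome" means a reverse-complement palindrome (A pairs with U, G with C). The function names, the minimum length, and the window counting are hypothetical; the scoring system mentioned above is not reproduced.

```python
# Watson-Crick complements for RNA (assumed pairing rule for this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def find_palindromes(rna, min_len=4):
    """Return (start, length) of each maximal reverse-complement palindrome.

    Such palindromes always have even length, so every gap between two
    adjacent bases is tried as a center and expanded outward.
    """
    hits = []
    n = len(rna)
    for center in range(1, n):
        left, right = center - 1, center
        while left >= 0 and right < n and COMPLEMENT.get(rna[left]) == rna[right]:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, length))
    return hits


def counts_per_window(hits, seq_len, window):
    """Count palindromes whose start falls in each non-overlapping window."""
    n_windows = (seq_len + window - 1) // window
    counts = [0] * n_windows
    for start, _length in hits:
        counts[start // window] += 1
    return counts


if __name__ == "__main__":
    seq = "AAAAGGAUCCAAAA"
    hits = find_palindromes(seq)
    print(hits)
    print(counts_per_window(hits, len(seq), window=7))
```

Windows with unusually high counts would be the candidate "concentrations"; how to judge what counts as unusually high is left open here.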
Gene Therapy for
Hemophilia
Ana Betancourt
Bioinformatics
Program, The University of Texas at El Paso
Abstract
Hemophilia A and B
are X-linked bleeding disorders, with Hemophilia A being the most
common hereditary coagulation disorder. The severe form of
Hemophilia is characterized by spontaneous bleeding of the
joints and internal organs. Patients are currently treated
intravenously with plasma-derived or recombinant FVIII
(Hemophilia A) or FIX (Hemophilia B), but protein
replacement therapy has disadvantages: the limited availability
of purified proteins, their high cost, and the development of
antibodies against FVIII or FIX. The development of successful
gene therapy for Hemophilia would transform the lives of patients
facing abnormal bleeding and a shortened life span by producing a
stable level of coagulation factor, eliminating the risk of
infection from contaminated products and the need for frequent
IV injections, and reducing immunogenicity.
Bayesian Models for
Identifying Changes in Gene Expression from Microarray
Experiments
Huiqin Chen
and Stephen Aley
Bioinformatics
Program and Department of Biological Sciences
The University of Texas at El Paso
Abstract
cDNA microarrays
have been extensively and successfully used in many fields, such
as basic scientific studies, drug-discovery, and diagnostic
purposes. cDNA microarray data are characterized by a large
number of measurements, a high degree of measurement noise and
variability, and few replicates. The objective of this project
is to find a Bayesian framework for the analysis of DNA
microarray expression data accommodating the typical
characteristics of microarray data. Three Bayesian models
including the non-informative prior model, the conjugate prior
model and the hierarchical model have been proposed and tested
by simulation. The data set for testing the Bayesian approach is
the high-density array experiment reported by Arfin et al., which
compared Escherichia coli cells that were wild type to cells
that were mutant for the global regulatory protein Integration
Host Factor (IHF). The main advantage of this data set is its
four-fold replication for both wild type and mutant alleles. The
results show that the non-informative prior model is not
suitable for microarray data and the conjugate prior is a better
model. The hierarchical model is the best at removing background
noise.
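To illustrate why a conjugate prior can help when replication is low, here is a minimal sketch of a normal-normal conjugate update for the mean expression change of a single gene. The abstract does not specify the likelihoods or priors of the three models, so the normal likelihood, the known noise variance, and the function name `posterior_normal_mean` are assumptions made only for illustration.

```python
def posterior_normal_mean(observations, noise_var, prior_mean, prior_var):
    """Posterior mean and variance of a normal mean with known noise variance.

    Assumes observations ~ Normal(mu, noise_var) and the conjugate prior
    mu ~ Normal(prior_mean, prior_var).
    """
    n = len(observations)
    sample_mean = sum(observations) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return post_mean, post_var


if __name__ == "__main__":
    # Four replicates, as in a four-fold replicated design (values are toy data).
    print(posterior_normal_mean([1.0, 1.0, 1.0, 1.0], noise_var=1.0,
                                prior_mean=0.0, prior_var=1.0))
```

With only a few replicates the prior term carries real weight, so noisy sample means are pulled toward the prior mean; as the number of replicates grows, the data term dominates.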
The Protein
Information Tool
Anurag Gautam
Bioinformatics
Program, The University of Texas at El Paso
Abstract
The Protein
Information Tool determines several properties of a protein:
amino acid composition, molecular weight, extinction
coefficient, hydrophobicity, the nature of each amino acid in a
given sequence (basic, acidic, or neutral), and the positions of
alpha helices and beta sheets in the secondary structure as
predicted by the Chou-Fasman algorithm. The tool asks the user
to paste a protein sequence in FASTA format and, once the user
selects one of the options, determines the requested property.
Each property is computed by code written separately for it in
Perl (Practical Extraction and Report Language). Some of the
data needed to determine these properties was collected from
bioinformatics resources such as NCBI, SWISS-PROT, and PDB and
was used to write the code for each characteristic of a protein
sequence. To make the Protein Information Tool more powerful and
informative, further Perl code can be written to determine
other properties of a protein sequence. The Perl code can be
combined with HTML (HyperText Markup Language) to make the tool
available to users.
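Two of the simpler properties can be sketched as follows. This is a Python illustration, not the tool's Perl code; the function names are hypothetical, and the grouping of D and E as acidic and K, R, and H as basic is the usual textbook convention, assumed here rather than taken from the tool.

```python
from collections import Counter

ACIDIC = set("DE")   # aspartate, glutamate
BASIC = set("KRH")   # lysine, arginine, histidine


def read_fasta_sequence(text):
    """Join the sequence lines of a single FASTA record, skipping the header."""
    lines = [line.strip() for line in text.strip().splitlines()]
    return "".join(l for l in lines if l and not l.startswith(">")).upper()


def composition(seq):
    """Fraction of the sequence made up by each amino acid present in it."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}


def classify_residues(seq):
    """Label each residue as acidic, basic, or neutral."""
    return ["acidic" if aa in ACIDIC else "basic" if aa in BASIC else "neutral"
            for aa in seq]


if __name__ == "__main__":
    seq = read_fasta_sequence(">demo\nMKDE\nAA\n")
    print(seq)
    print(composition(seq))
    print(classify_residues(seq))
```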
Confirmation of
Computer Predicted Secondary Structure of
RNA 2 of NoV via Site
Directed Mutagenesis
Kimberly S. Hogle,¹,²
John H. Upton,² Abel Licon,³ Ming-Ying Leung,¹,⁴
and Kyle L. Johnson¹,²
(1) Bioinformatics Program, The University of Texas at El Paso
(2) Department of Biological Sciences, The University of Texas
at El Paso
(3) Department of Computer Science, The University of Texas at El
Paso
(4) Department of Mathematical Sciences, The University of Texas
at El Paso
Abstract
Nodaviruses are
small (25-34 nm), icosahedral viruses whose genome consists of
two strands of single-stranded, positive-sense RNA. We
hypothesize that the RNA secondary structure at the 3’ end of
NoV RNA2 is required for viral replication. Computer predictions
of RNA secondary structure were analyzed to identify the
specific bases involved in pairing to form these structures at
the 3’ end of NoV genomes. Plasmids containing the full-length
genomic sequences of the NoV RNAs were constructed; these
plasmids can be transformed into competent cells and transfected
into yeast. Site-directed mutagenesis was performed to delete
bases in the NoV plasmid constructs that are involved in the
computer-predicted structures. The purified plasmid preparations
were sequenced to confirm the intended deletions and to check
for other mutations.
Developing a
PostgreSQL Database for Hyperspectral Data Collected Using an
Automated Robotic Cart
Kuldeep Matharasi,1,2
Santonu Goswami,2 John A. Gamon,3 and
Craig E. Tweedie2
(1) Bioinformatics
Program, The University of Texas at El Paso
(2)
Systems Ecology Laboratory, Department of Biological Sciences, The University of Texas
at El Paso
(3) Center for Environmental Analysis (CEA-CREST) and Department
of Biological Sciences,
California State University, Los Angeles, CA
Abstract
Ecologists and
environmental scientists collect huge amounts of data from
different field sites in pursuit of answers to research
questions. One of their main challenges is the proper storage,
use, and sharing of these data for future work. The data hold
the key to answering different research questions and to
properly monitoring various environmental events.
Spectral data were collected as part of the Biocomplexity
experiment in Barrow, Alaska, in 2005, 2006, and 2007 using a
robotic cart carrying a hyperspectral spectrometer. The
spectrometer collects optical data in 256 bands along
three transects over a dry Arctic lake bed. Each transect is
300 m long, and optical data were collected three times a week
at every meter of the tramline. In one day the cart system
therefore collects 300 optical data files per transect, or
about 900 files over the total length of 900 m. That amounts to
about 2,700 data files a week and, on average, about 30,000
data files for a three-month season. Without a proper storage
system, this volume of data was difficult to handle and process
effectively. This challenge led us to design a PostgreSQL
database system that gives us easy access to the data for
processing and lets us carry out quality checks in minimal time.
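The file counts quoted above follow from simple multiplication, sketched below. The number of weeks in the season is an assumption (12 here, standing in for "three months"), so the seasonal total is only a rough check on the figure of about 30,000.

```python
TRANSECTS = 3            # transects over the lake bed
TRANSECT_LENGTH_M = 300  # one data file per meter of tramline
RUNS_PER_WEEK = 3        # collection days per week
WEEKS_PER_SEASON = 12    # assumption: a three-month season of about 12 weeks


def files_per_day(transects=TRANSECTS, length_m=TRANSECT_LENGTH_M):
    """One file per meter on every transect."""
    return transects * length_m


def files_per_week(runs=RUNS_PER_WEEK):
    return files_per_day() * runs


def files_per_season(weeks=WEEKS_PER_SEASON):
    return files_per_week() * weeks


if __name__ == "__main__":
    print(files_per_day(), files_per_week(), files_per_season())
```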
Identification of
Determinants of Prevalence of
O-glycosylation
Sites in a Protein Sequence
Deepthi P Matta
and Ming-Ying Leung
Bioinformatics
Program and Department of
Mathematical Sciences
The University of Texas at El Paso
Abstract
O-glycosylation is
one of the most important, frequent, and complex
post-translational modifications of proteins. Glycosylation
affects many critical protein functions, including cellular
communication, half-life, and structure (Jenkins and James
1980). Glycosylation also plays an important role in the
pathology of some diseases, and altered glycosylation is
implicated in cancer, mucosal diseases, and pathogen-host
interactions. O-glycosylation is the most common
post-translational modification occurring at serine and
threonine sites in an amino acid sequence. It is more probable
in sequences with a high proportion of serine, threonine, and
proline residues. Identifying and analyzing the determinants of
the prevalence of O-glycosylation sites in a protein sequence is
an essential step towards establishing the roles that these
glycans play in health and disease. Statistical techniques such
as correlation and regression analysis are applied to the
242 glycosylated protein sequences in the O-GLYCBASE database.
This paper presents the analysis and its results.
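One candidate determinant, the proportion of serine, threonine, and proline residues, and its correlation with another per-sequence quantity can be computed as in the sketch below. The function names are hypothetical, the choice of Pearson correlation is an assumption about which correlation measure is meant, and no O-GLYCBASE data are reproduced here.

```python
import math


def stp_fraction(seq):
    """Fraction of residues in seq that are serine, threonine, or proline."""
    seq = seq.upper()
    return sum(seq.count(aa) for aa in "STP") / len(seq)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    print(stp_fraction("SSTPAA"))
    # Toy numbers only, to show the call; they are not real measurements.
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In use, `xs` would hold the S/T/P fraction of each sequence and `ys` the number of annotated O-glycosylation sites per sequence (or per residue).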
A Hybrid
Optimization Approach
for Automated Parameter Estimation Problems
Carlos Quintero,1
Miguel Argaez,1
Hector Klie,1
Leticia Velazquez,1,2
and Mary Wheeler1
(1) Department of
Mathematical Sciences, The University of Texas at El Paso
(2) Bioinformatics
Program, The University of Texas at El Paso
Abstract
We present a hybrid
optimization approach for solving automated parameter estimation
problems that is based on the coupling of the Simultaneous
Perturbation Stochastic Approximation (SPSA) and a globalized
Newton-Krylov Interior Point algorithm (NKIP) presented by
Argáez et al. The procedure generates a surrogate model that
allows first-order information to be used efficiently, and then
applies the NKIP algorithm to find an optimal solution. We
implement the hybrid optimization algorithm on a simple test
case and present some preliminary numerical results.
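For readers unfamiliar with SPSA, the sketch below shows the basic iteration in its standard textbook form: the gradient is estimated from two loss evaluations along one random perturbation of all parameters at once. It covers only the SPSA half of the hybrid; the surrogate model and the NKIP step are not shown. The gain constants, the test function, and the name `spsa_minimize` are illustrative assumptions, not the authors' settings.

```python
import random


def spsa_minimize(loss, theta, iterations=2000, a=0.2, c=0.1,
                  A=10.0, alpha=0.602, gamma=0.101, seed=0):
    """Minimize loss(theta) with a basic SPSA iteration."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Perturb every coordinate simultaneously by +1 or -1.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two loss evaluations give an estimate of the whole gradient.
        grad = [diff / (2.0 * c_k * d) for d in delta]
        theta = [t - a_k * g for t, g in zip(theta, grad)]
    return theta


def quadratic(theta):
    """Toy loss with its minimum at (1, -2)."""
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2


if __name__ == "__main__":
    print(spsa_minimize(quadratic, [0.0, 0.0]))
```

The appeal for parameter estimation is that each iteration costs two loss evaluations regardless of the number of parameters.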
A
User-friendly Database for Pseudoknots - RNAVBase-PK
(http://rnavlab.utep.edu/portal)
Vindhya Shatdarsanam
Bioinformatics
Program, The University of Texas at El Paso
Abstract
Among RNA
secondary structures, pseudoknots are particularly difficult to
understand and decipher. They have many biological functions,
some already known and some still under investigation. To help
researchers who study these structures, we have
developed a database, RNAVBase-PK, similar to Pseudobase, the
database developed at Leiden University, but with many
additional features that make it more
user-friendly. RNAVBase-PK contains all the information
in Pseudobase and adds features that allow
users to search, select, format, and visualize pseudoknots. The
database links each pseudoknot to the GenBank or EMBL record of
the corresponding nucleotide sequence and can invoke
PseudoViewer for automated graphical display of secondary
structures. It also includes a tool for adding new
pseudoknots in a user-friendly format. Our goal is to bring
together information from various sources about a pseudoknot
structure on a single platform and make the information easily
accessible to all users. We shall further develop and expand
this tool so that it continues to meet the requirements
of the growing body of research on RNA structures and functions.
Identification and
Characterization of Small Molecules
That Specifically
Inhibit FKBP52 Regulation of
Steroid Hormone
Receptors
Dedeepya Vaka,1,2
Heather Balsiger,2 and Marc B. Cox2
(1) Bioinformatics
Program, The University of Texas at El Paso
(2) Border
Biomedical Research Center and Department of Biological Sciences, The University of Texas at El Paso
Abstract
Steroid hormone
receptors require the ordered assembly of various chaperone and
cochaperone proteins in order to reach a functional state. The
final stage in the receptor maturation process requires the
formation of a multimeric complex consisting of an Hsp90 dimer,
p23, and one of several large immunophilins (FKBP51, FKBP52,
Cyp40, and PP5). All of the studies conducted by our laboratory
and others suggest that, unlike other Hsp90-associated
cochaperones, the immunophilins associate and/or regulate
preferentially depending upon which client protein is present in
the complex. For example, the large FK506-binding protein FKBP52
preferentially regulates androgen, progesterone, and
glucocorticoid receptor-mediated signal transduction. Consistent
with these findings, male FKBP52 knockout (52KO) mice display
characteristics of partial androgen insensitivity and female
52KO mice have implantation defects related to progesterone
receptor insensitivity. Thus, FKBP52 represents an attractive
therapeutic target for the treatment of diseases that are
dependent upon a functional hormone signaling pathway (e.g.
prostate cancer). We developed a yeast-based screening assay for
use in identifying small molecules that specifically inhibit
FKBP52 regulation of androgen receptor function. We then used
this assay to screen a diversified compound library containing
approximately 2000 compounds of known structure that were
selected to have representative diverse chemical structures.
These screens resulted in the identification of two candidate
compounds that potently and specifically inhibit FKBP52-mediated
potentiation of receptor function in yeast. We are currently
characterizing the inhibitory effects and specificity of
inhibition of the candidate inhibitors in mammalian cells.
Future studies, through the use of predictive modeling and
various experimental approaches, will be aimed at identifying
the inhibitor binding site(s) on FKBP52 and the active sites (pharmacophores)
on the molecules.
Expression of
Cruzain: A Major Cysteine Protease in T. cruzi
Nobish Varghese
Bioinformatics
Program and Department of Biological Sciences
The University of Texas at El Paso
Abstract
Infection by the
protozoan parasite Trypanosoma cruzi (T. cruzi)
results in Chagas disease, which is the principal cause of early
death from heart disease in Latin America. Currently available
treatments are generally unsatisfactory; the drugs used
are highly toxic and ineffective, particularly in chronic stages
of the disease. Thus there is an urgent need to develop novel,
economical, and effective drugs against the parasite. Our study
focuses on the folding pathway of the cysteine protease cruzain,
a key metabolic enzyme that plays a significant role in the
growth and pathogenicity of T. cruzi. Folding studies
provide structural information necessary to develop novel
classes of inhibitors against cruzain. Cruzain is currently
being expressed and purified so that its folding pathway can be
characterized.