David W. Mount, PhD, Director
Ritu Pandey, PhD, Co-Director
The Informatics/Bioinformatics Shared Service at The University of Arizona provides support in the following areas:
Who we are
The Informatics/Bioinformatics Shared Service was founded in 2002 to provide bioinformatics, genomics, and proteomics support to The University of Arizona Cancer Center researchers so that they can fully utilize the power of the Human Genome Project in their research. We collect, store, and make available a variety of molecular and genetic data on cancer genomes. We use established computational tools and develop new tools as needed for analysis of all genome-related data. We also provide a variety of computer-related services for informatics support of research projects, for example web-based databases for storing clinical and pathological, proteomics, and cancer genome data. Through these levels of support, the Informatics/Bioinformatics Shared Service provides an integrated approach that assists researchers in their quest for new biological information about cancer cells and tissues and thus aids in finding new drug targets and preventive methods. Some examples of the types of services we offer are given below.
The Bioinformatics group includes the following staff members
Attn: Person Name room#
The University of Arizona Cancer Center
1515 N. Campbell Ave.
PO Box 245024
Tucson, AZ 85724-5024
Some of our Informatics Initiatives at the Cancer Center
We have been working on several data management and data integration initiatives for the past few years. These are some of the tools that were developed by the group, or adopted and deployed, to support Cancer Center programs, shared services, and researchers. The goal is to leverage existing tools in order to integrate heterogeneous data for research. These tools are currently set up in a testing environment.
Informatics systems that have been discontinued:
caTissue: This is a caBIG biospecimen informatics system for inventory tracking and clinical and pathology annotation. caTissue permits users to track the collection, storage, quality assurance, and distribution of specimens, as well as the derivation and aliquotting of new specimens from existing ones (e.g. for DNA analysis).
IBISS Project Pricing, Turnaround Time & Consulting
Interaction with Cancer Center Members:
The service we offer is our expertise in data analysis and management. If we assist a cancer researcher on a temporary consulting basis, we charge for the time spent on the project. We welcome the opportunity to participate in laboratory meetings and research discussions and to help write papers and grant applications. We can perform a free preliminary data analysis to support the feasibility of an application. We can also contribute our experience and expertise in data management and analysis, which can help increase the fundability of research grant applications. Presently, we are supported by the Cancer Center Core grant, the GI SPORE grant, and contracts with TGen. We usually have time available and are very interested in participating in more projects.
Our areas of expertise span two fields. In computational biology, we cover biological sequence analysis, protein structure analysis, genome analysis, and advanced computational analysis of large data sets such as gene expression, single nucleotide polymorphism (SNP), and proteomics data. In basic biology, we cover population and molecular genetics, molecular and cell biology, biochemistry, and evolutionary biology. Our goal is to provide assistance with data analysis that will lead to testable hypotheses and fundamentally important discoveries in cancer research. We specialize in the biological interpretation of data, leading to a new understanding of cancer biology and to the discovery of new diagnostic markers, genetic risk markers (haplotypes), patterns in data, and drug targets. Our staff is well prepared to perform all of these types of analysis.
How we perform our services
First, we are a group of experienced computer programmers, database designers, and website developers. We program in the languages commonly used in bioinformatics, including Java, Perl, PHP, and R. We use programming tools that have been established by the bioinformatics community in the R programming language and the Bioconductor project (http://www.bioconductor.org) and that are based on sophisticated but sound statistical principles. We routinely utilize several database management systems, including Oracle, Postgres, and MySQL. Second, we keep up with the bioinformatics literature (books and peer-reviewed papers), participate in NIH study sections and NSF panels, and attend meetings and conferences to stay abreast of new technology. We utilize computational tools developed and published by others, as well as public websites of biological data, often through automated scripts. When existing computer programs or commercially available software do not meet the needs of a cancer research project, the Bioinformatics Shared Service develops the needed computer tools and databases locally. We also perform many other types of analysis on high-throughput data, including data smoothing, modeling, and mining. The examples given below illustrate these types of analysis.
Interaction with other Shared Services
We assist users of the Proteomics and Genomics Shared Services by providing storage on web-based databases that we have designed, at http://www.protbase.org and http://azcc-microarray.arl.arizona.edu/index.php respectively. We also provide biological interpretation and modeling of genomic and proteomic results. For example, the Informatics/Bioinformatics Shared Service offers assistance with experimental design and a basic analysis of gene expression experiments. This basic analysis includes a quality control analysis and further analyses that produce a list of varying genes, with information on the biological functions of these genes and their metabolic and regulatory interactions. A similar service is offered to users of the Proteomics Shared Service. We also provide advanced computational modeling of the data in order to find biological patterns that are diagnostic for disease or that point to candidate targets for new drugs. We offer current and powerful computational tools and models to help Cancer Center investigators get the most information from their experiments.
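As an illustration of what "a list of varying genes" can mean in practice, here is a minimal Python sketch. It is not the Shared Service's actual pipeline: the function name, the cutoffs, and the sample values are all hypothetical, and it assumes expression values that are already log2-transformed and quality-checked. It ranks genes by a Welch t statistic and keeps those that also show a large fold change between tumor and normal samples.

```python
from math import sqrt
from statistics import mean, variance


def varying_genes(tumor, normal, min_abs_log2fc=1.0, min_abs_t=3.0):
    """Return (gene, log2 fold change, t statistic) for genes passing both cutoffs.

    tumor, normal: dict mapping gene name -> list of log2 expression values
    (at least two samples per group). Cutoffs are illustrative assumptions.
    """
    selected = []
    for gene, t_vals in tumor.items():
        n_vals = normal.get(gene)
        if n_vals is None or len(t_vals) < 2 or len(n_vals) < 2:
            continue  # need both groups and a variance estimate
        diff = mean(t_vals) - mean(n_vals)  # log2 fold change
        se = sqrt(variance(t_vals) / len(t_vals) + variance(n_vals) / len(n_vals))
        if se == 0:
            continue  # no variability at all; statistic is undefined
        t_stat = diff / se  # Welch's t statistic (unequal variances)
        if abs(diff) >= min_abs_log2fc and abs(t_stat) >= min_abs_t:
            selected.append((gene, diff, t_stat))
    # Strongest evidence first
    return sorted(selected, key=lambda row: abs(row[2]), reverse=True)


# Hypothetical log2 expression values for two placeholder genes
tumor = {"GENE_A": [8.0, 8.2, 7.9], "GENE_B": [5.0, 5.1, 4.9]}
normal = {"GENE_A": [5.0, 5.1, 4.9], "GENE_B": [5.0, 4.9, 5.1]}
for gene, log2fc, t_stat in varying_genes(tumor, normal):
    print(gene, round(log2fc, 2), round(t_stat, 1))
```

A real analysis would normally add multiple-testing correction and a moderated statistic, for example through the Bioconductor packages mentioned above.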
Methods for finding genes that are changing in cancer cells and tissues.
Pathway Analysis: In the first method, a pattern of significantly varying genes is input into our local database, which has collected all of the current information on the human genome that is available, as well as many other genomes (this database is described further below). An output of the Pathway Miner tool shows which genes are in well-known regulatory and metabolic pathways in pancreatic cancer tissues.
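One common way to ask which pathways are over-represented in a list of varying genes is a hypergeometric (one-sided Fisher) test. The sketch below shows that general technique only; it is not a description of how the Pathway Miner tool works internally. The function names, gene names, and pathway memberships are hypothetical placeholders.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def pathway_enrichment(varying_genes, pathways, background):
    """Rank pathways by how unexpectedly many varying genes they contain.

    varying_genes: iterable of gene names found to vary
    pathways: dict mapping pathway name -> iterable of member genes
    background: iterable of all genes measured in the experiment
    """
    background = set(background)
    varying = set(varying_genes) & background
    results = []
    for name, members in pathways.items():
        members = set(members) & background
        hits = varying & members
        p = hypergeom_upper_tail(len(hits), len(members), len(varying), len(background))
        results.append((name, sorted(hits), p))
    return sorted(results, key=lambda row: row[2])  # smallest p-value first


# Hypothetical data: 20 measured genes, one 5-gene pathway, 4 varying genes
background = [f"GENE_{i:02d}" for i in range(20)]
pathways = {"pathway_A": background[:5], "pathway_B": background[10:15]}
varying = ["GENE_00", "GENE_01", "GENE_02", "GENE_10"]
for name, hits, p in pathway_enrichment(varying, pathways, background):
    print(name, hits, round(p, 4))
```

With many pathways tested at once, the p-values would need multiple-testing correction before any pathway is reported as significant.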
Methods for finding new drug targets.
Novel methods are used to determine which genes are suitable drug targets or predictive for disease. Often, targets are identified as products of significantly over-expressed genes in cancer tissues and cells. We also use a sophisticated computational method called graph theory to determine which genes appear to interact, such as one gene regulating another. These types of graphs can be used very effectively to predict gene interactions by combining different data sets and types of analysis (also see http://www.cytoscape.org).
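To make the graph idea concrete, the following Python sketch builds a simple co-expression graph: genes are nodes, and an edge joins two genes whose expression profiles are strongly correlated across samples. This is one elementary way to construct such a graph and is an assumption for illustration, not the group's published method. The threshold and the expression values are hypothetical, and correlation alone does not establish that one gene regulates another.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def coexpression_graph(expression, threshold=0.9):
    """Return an adjacency dict linking genes with |correlation| >= threshold.

    expression: dict mapping gene name -> list of values across the same samples.
    """
    graph = {gene: set() for gene in expression}
    for a, b in combinations(expression, 2):
        if abs(pearson(expression[a], expression[b])) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical expression profiles across five samples
expression = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],  # rises with A
    "C": [5, 4, 3, 2, 1],   # falls as A rises
    "D": [3, 1, 4, 1, 5],   # only weakly related to the others
}
graph = coexpression_graph(expression)
# Genes with the most neighbors are candidate "hubs" worth a closer look
for gene in sorted(graph, key=lambda g: len(graph[g]), reverse=True):
    print(gene, sorted(graph[gene]))
```

An adjacency structure like this can be exported as an edge list for visualization in a tool such as Cytoscape.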
Alternatively, the genes could be a synthetic lethal combination, in that the cell needs one gene or the other for viability, or needs one gene expressed in order to compensate for over-expression of another. Through data analysis, we can determine whether a cancer cell depends entirely on one gene, having lost or over-expressed the other, which makes the remaining gene a sensitive drug target. As an example, we are helping to discover synthetic lethal targets for an over-expressed cell cycle gene in adrenocortical and pancreatic cancer. We are also able to search for genes that are likely to be strongly over-expressed because of a rearrangement in the genome that fuses the gene to a strong promoter, as occurs in chronic myelogenous leukemia and in B and T cell malignancies.
Future role for the Bioinformatics Shared Service as a Resource for Cancer Genome and Population Genetic Data.
The Bioinformatics Shared Service will continue to aid The University of Arizona Cancer Center investigators with access to genome and proteome data, data analysis, and integration of large data sets with clinical and biological information for specific research projects. We will keep abreast of new data sets and analytical tools and will generate new computational tools and methods as needed. Large public data sets on the cancer genome will become available in the next two years. The National Cancer Institute has initiated the Cancer Genome Anatomy Project in which designated Genome Centers will collect high quality gene expression, gene copy number, and DNA methylation array data on thousands of cancer tissues (http://genome.gov/19518624). A related project will identify sequence variations in the major cancer genes in as many tissues. The HapMap project is collecting large data sets of genetic variability in human populations. In most cases, it will be beyond the capability of individual laboratories to utilize these data; the Bioinformatics Shared Service has expertise in this area and is prepared to assist.