A Data Transformation System for Biological Data Sources.

Peter Buneman, Susan B. Davidson, Kyle Hart, G. Christian Overton, Limsoon Wong: A Data Transformation System for Biological Data Sources. VLDB 1995: 158-169
  author    = {Peter Buneman and
               Susan B. Davidson and
               Kyle Hart and
               G. Christian Overton and
               Limsoon Wong},
  editor    = {Umeshwar Dayal and
               Peter M. D. Gray and
               Shojiro Nishio},
  title     = {A Data Transformation System for Biological Data Sources},
  booktitle = {VLDB'95, Proceedings of 21th International Conference on Very
               Large Data Bases, September 11-15, 1995, Zurich, Switzerland},
  publisher = {Morgan Kaufmann},
  year      = {1995},
  isbn      = {1-55860-379-4},
  pages     = {158-169},
  ee        = {db/conf/vldb/BunemanDHOW95.html},
  crossref  = {DBLP:conf/vldb/95},
  bibsource = {DBLP,}


Scientific data of importance to biologists in the Human Genome Project resides not only in conventional databases, but in structured files maintained in a number of different formats (e.g. ASN.1 and ACE) as well as sequence analysis packages (e.g. BLAST and FASTA). These formats and packages contain a number of data types not found in conventional databases, such as lists and variants, and may be deeply nested. We present in this paper techniques for querying and transforming such data, and illustrate their use in a prototype system developed in conjunction with the Human Genome Center for Chromosome 22. We also describe optimizations performed by the system, a crucial issue for bulk data.

Copyright © 1995 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Printed Edition

Umeshwar Dayal, Peter M. D. Gray, Shojiro Nishio (Eds.): VLDB'95, Proceedings of 21th International Conference on Very Large Data Bases, September 11-15, 1995, Zurich, Switzerland. Morgan Kaufmann 1995, ISBN 1-55860-379-4


References

Serge Abiteboul, Richard Hull: IFO: A Formal Semantic Database Model. ACM Trans. Database Syst. 12(4): 525-565(1987)
Carlo Batini, Maurizio Lenzerini, Shamkant B. Navathe: A Comparative Analysis of Methodologies for Database Schema Integration. ACM Comput. Surv. 18(4): 323-364(1986)
Val Tannen, Peter Buneman, Shamim A. Naqvi: Structural Recursion as a Query Language. DBPL 1991: 9-19
Val Tannen, Peter Buneman, Limsoon Wong: Naturally Embedded Query Languages. ICDT 1992: 140-154
Peter Buneman, Leonid Libkin, Dan Suciu, Val Tannen, Limsoon Wong: Comprehension Syntax. SIGMOD Record 23(1): 87-96(1994)
Luca Cardelli: A Semantics of Multiple Inheritance. Inf. Comput. 76(2/3): 138-164(1988)
Ronald Fagin, Jürg Nievergelt, Nicholas Pippenger, H. Raymond Strong: Extendible Hashing - A Fast Access Method for Dynamic Files. ACM Trans. Database Syst. 4(3): 315-344(1979)
Leonidas Fegaras, David Maier: Towards an Effective Calculus for Object Query Languages. SIGMOD Conference 1995: 47-58
Nathan Goodman, Steve Rozen, Lincoln Stein: Requirements for a Deductive Query Language in a Genome-Mapping Database. Workshop on Programming with Logic Databases (Book), ILPS 1993: 259-278
Zhuoan Jiao, Peter M. D. Gray: Optimization of Methods in a Navigational Query Language. DOOD 1991: 22-42
Won Kim: A New Way to Compute the Product and Join of Relations. SIGMOD Conference 1980: 179-187
Witold Litwin, Abdelaziz Abdellatif: Multidatabase Interoperability. IEEE Computer 19(12): 10-18(1986)
David Maier, Bennet Vance: A Call to Order. PODS 1993: 1-16
Masaya Nakayama, Masaru Kitsuregawa, Mikio Takagi: Hash-Partitioned Join Method Using Dynamic Destaging Strategy. VLDB 1988: 468-478
Shamkant B. Navathe, Ramez Elmasri, James A. Larson: Integrating User Views in Database Design. IEEE Computer 19(1): 50-62(1986)
Atsushi Ohori, Peter Buneman, Val Tannen: Database Programming in Machiavelli - a Polymorphic Language with Static Type Inference. SIGMOD Conference 1989: 46-57
Yannis Papakonstantinou, Hector Garcia-Molina, Jennifer Widom: Object Exchange Across Heterogeneous Information Sources. ICDE 1995: 251-260
Amit P. Sheth, James A. Larson: Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases. ACM Comput. Surv. 22(3): 183-236(1990)
Amit P. Sheth, James A. Larson, Aloysius Cornelio, Shamkant B. Navathe: A Tool for Integrating Conceptual Schemas and User Views. ICDE 1988: 176-183
Philip W. Trinder: Comprehensions, a Query Notation for DBPLs. DBPL 1991: 55-68
Philip Wadler: Comprehending Monads. Mathematical Structures in Computer Science 2(4): 461-493(1992)
Limsoon Wong: An Introduction to Remy's Fast Polymorphic Record Projection. SIGMOD Record 24(3): 34-39(1995)
Limsoon Wong: Querying Nested Collections. Ph.D. thesis, Univ. Pennsylvania 1994

Referenced by

  1. Vassilis Christophides, Sophie Cluet, Jérôme Siméon: On Wrapping Query Languages and Efficient XML Integration. SIGMOD Conference 2000: 141-152
  2. Susan B. Davidson, Anthony Kosky: Specifying Database Transformations in WOL. IEEE Data Eng. Bull. 22(1): 25-30(1999)
  3. Anthony Kosky, I-Min A. Chen, Victor M. Markowitz, Ernest Szeto: Exploring Heterogeneous Biological Databases: Tools and Applications. EDBT 1998: 499-513
  4. Judith Bayard Cushing, Justin Laird, Emir Pasalic, Elizabeth Kutter, Tim Hunkapiller, Frank Zucker, David P. Yee: Beyond Interoperability - Tracking and Managing the Results of Computational Applications. SSDBM 1997: 223-236
  5. I-Min A. Chen, Anthony Kosky, Victor M. Markowitz, Ernest Szeto: Constructing and Maintaining Scientific Database Views in the Framework of the Object-Protocol Model. SSDBM 1997: 237-248
  6. Richard Hull: Managing Semantic Heterogeneity in Databases: A Theoretical Perspective. PODS 1997: 51-61
  7. Peter Buneman: Semistructured Data. PODS 1997: 117-121
  8. Serge Abiteboul, Sophie Cluet, Tova Milo: Correspondence and Translation for Heterogeneous Data. ICDT 1997: 351-363
  9. Susan B. Davidson, Anthony Kosky: WOL: A Language for Database Transformations and Constraints. ICDE 1997: 55-65
  10. Philip Wadler: Functional Programming: An Angry Half-Dozen. DBPL 1997: 25-34
  11. Atsuyuki Morishima, Hiroyuki Kitagawa: A Data Modelling and Query Processing Scheme for Integration of Structured Document Repositories and Relational Databases. DASFAA 1997: 145-154
  12. Graham J. L. Kemp, Joel Dupont, Peter M. D. Gray: Using the Functional Data Model to Integrate Distributed Biological Data Sources. SSDBM 1996: 176-185
  13. Leonid Libkin, Rona Machlin, Limsoon Wong: A Query Language for Multidimensional Arrays: Design, Implementation, and Optimization Techniques. SIGMOD Conference 1996: 228-239
  14. Peter Buneman, Susan B. Davidson, Gerd G. Hillebrand, Dan Suciu: A Query Language and Optimization Techniques for Unstructured Data. SIGMOD Conference 1996: 505-516