

Time-series data arises in settings such as observations made by radars and satellites, checkpointing data that captures the state of a system at regular intervals, and analytics that track the evolution of extracted knowledge over time. Galileo is a demonstrably scalable storage framework for managing such time-series data. Key capabilities of the storage framework include:
  • The ability to manage billions of small files with trillions of observations.
  • Support for multiple scientific data formats such as netCDF, HDF, and the Defense Meteorological Satellite Program format.
  • Approximate queries, fuzzy queries, and probabilistic queries.
  • Hypothesis testing, significance evaluations, and kernel density estimations.
  • A scale-out architecture that enables the incremental assimilation of nodes in the system.
  • Accounting for spatiotemporal data characteristics.
  • Support for real-time analytic queries over petascale datasets.
  • Range- and geometry-constrained queries, and proximity-based relevance ranking, over spatiotemporal datasets.
  • Supported queries can be point or continuous.
  • Support for a tunable replication framework.
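
To make the idea of a range-constrained query with proximity-based ranking concrete, here is a minimal in-memory sketch in Python. It is not Galileo's API or implementation: the `Observation` type, the `range_query` function, and all parameter names are hypothetical, and a real deployment would distribute this work across storage nodes rather than scan a list.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical spatiotemporal observation (not a Galileo type)."""
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees
    timestamp: float  # seconds since the epoch
    value: float      # the measured quantity


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def range_query(observations, lat_range, lon_range, time_range, center):
    """Return observations inside the bounding box and time window,
    ordered by distance from `center` (nearest first)."""
    lat_lo, lat_hi = lat_range
    lon_lo, lon_hi = lon_range
    t_lo, t_hi = time_range
    hits = [
        o for o in observations
        if lat_lo <= o.lat <= lat_hi
        and lon_lo <= o.lon <= lon_hi
        and t_lo <= o.timestamp <= t_hi
    ]
    # Proximity-based relevance: closer observations rank higher.
    return sorted(hits, key=lambda o: haversine_km(center[0], center[1], o.lat, o.lon))
```

A caller would pass a latitude range, a longitude range, a time window, and a point of interest, for example `range_query(obs, (39.0, 41.0), (-106.0, -104.0), (0, 200), center=(40.57, -105.08))`. The linear scan is chosen only for readability; it says nothing about how the framework indexes or partitions data.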

     
Project News


Paper on Spatiotemporal Sketches to appear in the IEEE Transactions on Knowledge & Data Engineering.

Paper on Ad Hoc Queries appears in the IEEE Transactions on Cloud Computing.

Paper on Anomaly Detection appears in Concurrency and Computation: Practice & Experience.

Paper on Approximate Queries appears in the IEEE Transactions on Knowledge & Data Engineering.


         

 


© The Galileo Project
Department of Computer Science
Colorado State University