Querying and Exploring Big Scientific Data

Thursday, January 30, 2014 - 15:30
University of Lugano, room SI-006, Informatics building (Via G. Buffi 13)

Today's scientific processes depend heavily on fast and accurate analysis of experimental data. Scientists are routinely overwhelmed by the effort needed to manage the volumes of data produced either by observing phenomena or by running sophisticated simulations. Because database systems have proven inefficient, inadequate, or insufficient for the needs of scientific applications, the scientific community typically relies on special-purpose legacy software. With the exponential growth of dataset size and complexity, however, these application-specific systems no longer scale well enough to analyze the relevant parts of the data efficiently, which slows down the cycle of analyzing data, understanding results, and preparing new experiments.
In this talk, I will illustrate the problem with a challenging application featuring brain simulation data and show how problems from neuroscience translate into interesting data management challenges. Finally, I will use the example of neuroscience to show how novel data management techniques, in particular spatial indexing and navigation, have enabled today's neuroscientists to simulate a meaningful percentage of the human brain.

Thomas Heinis is a post-doctoral researcher in the database group at EPFL. His research focuses on scalable data management algorithms for large-scale scientific applications. Thomas is part of the “Human Brain Project” and currently works with neuroscientists to develop the data management infrastructure needed to scale up brain simulations. Prior to joining EPFL, Thomas completed his Ph.D. in the Systems Group at ETH Zurich, where he pursued research on workflow execution systems and data provenance.