
Dataset Information Resources (DIR) - An Information Framework for Discovering and Learning Datasets (Dr. Yaorong Ge)
In this project, we aim to develop a web-based, and eventually mobile-driven, information framework that helps students and researchers in data science discover and learn about datasets. A prototype of the framework has already been developed, which is accessible via
In recognition of the importance of available data for effective research in the era of big data, many data portals have been developed to provide data of various types for query, analysis, and download by citizens and researchers alike. Most existing data portals provide some instructions and tools for identifying and using the datasets within that individual portal. However, there is a lack of information resources that catalog and curate these datasets across portals in a manner that helps answer research questions. The gap is felt most by students and researchers new to the field, for whom it is frequently a considerable challenge to find out what datasets are available, what data elements may be relevant to their research questions, and where and how they can obtain them. In health care in particular, there are tens of thousands of publicly available datasets, along with numerous proprietary datasets available for purchase, that are critical for health data analytics, but many of these datasets are scattered across different places and often come in very different formats. Thus, there is an important and urgent need to fill this gap by developing an effective information framework for disseminating knowledge about available datasets in data science.
