Data Science Resources
We are in the process of launching a supplement to the sixth edition of our textbook that will have new chapters on Introduction to Data Science, International Informatics and Clinical Decision Support. We thought it might be worthwhile to post the data science resources we uncovered in our research.

Datasets:
University of California, Irvine (UCI) Repository: 325 validated datasets covering many domains, sizes, data types and analytical methods. These datasets are commonly used for machine learning exercises. [1]
KDnuggets: the data sets tab lists 71 datasets from various industries, available for free download. [2]
The Datahub: managed by the Open Knowledge Foundation, this site hosts more than 10,000 datasets from a variety of international contributors, covering most industries. [3]
Kaggle: provides free, interesting datasets for a range of user interests and analyses. Datasets are updated frequently. Examples of downloadable data include horse racing, basketball, and current trends, as well as health care emergency calls data. These datasets offer introductory and advanced users an opportunity to explore data science. [4]

Healthcare data:
HealthData.gov: can be searched by data category and format (.csv, .xls, zip, PDF, rdf, JSON, html, txt and API). [5]
Centers for Disease Control and Prevention: public use files (PUFs) from surveys from multiple branches and agencies within the government. [6]
Expert Health Data Programming: hosts links to about 45 large datasets. [7]
Health Services Research Information Central: extensive health datasets, statistics, international data and data tools. [8]
Vanderbilt Biostatistics Datasets: multiple health-related datasets are available to download as Excel, ASCII, R and S-Plus files. Also includes links to international datasets. [9]
MIMIC III Critical Care database: a repository of de-identified patient-level data for more than 40,000 critical care patients. [10]
CMS Data Navigator: expedites the search for Medicare and Medicaid data. [11]

Free online statistics resources:
Data science textbooks: An Introduction to Statistical Learning with Applications in R [12] and The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition. [13]
Stat Trek: online tutorials guide students through the introductory steps of statistics. There are brief quizzes and calculators to add interest and functionality. [14]
Biostatistics: Open Learning Textbook. University of Florida. [15]
OnlineStatBook: excellent free introductory online reference. The work was done primarily by David Lane from Rice University. There is also a free e-book for Mac or iOS devices, “Introduction to Statistics: An Interactive e-Book.” [16]
OpenIntro: three free PDF download books. One book is associated with about 100 free datasets. [17]
Statistics How To: there is an online textbook as well as a companion e-book for sale. The web site includes calculators and stats tables. [18]
Kaggle: there are forums for those just getting started in data science, as well as information about public datasets. Kaggle also provides job forums for those interested in careers in the data science field. For those interested in learning how to run code, there is also an option to receive community feedback on your work. In addition, Kaggle hosts data science competitions for both health and non-health care data. Prizes and rewards are offered for meeting these challenges and producing the best prediction. [4]
StatPages: a mega-site for essentially any free online statistical calculator you can think of. [19]

Free online data science resources:
School of Data: online course covers data fundamentals, data cleaning, exploring data, extracting data, mapping data and other topics. [20]
Class Central: offers multiple free data science and big data-related courses. [21]
Data Science Academy: aggregates courses from multiple universities. [22]
Udacity: includes data science courses at beginner through advanced levels. [23]
Oregon Health and Science University Free Healthcare Data Analytics Course. [24]
IBM Big Data University: offers multiple free courses related to data science and big data analytics for beginner and intermediate level learners. [25]

Spreadsheet tutorials:
University of California at Berkeley. [26]
Google spreadsheets. [27]

R language tutorials:
Tutorials Point. [28]
R Tutorial by Kelly Black. [29]

Python language tutorials:
The Python Tutorial. [30]
Tutorials Point. [31]

SQL tutorials:
SQLZoo. [32]
Tutorials Point. [33]

Web data extraction tools:
Import.io. [34]
Google Chrome extension scraper. [35]

Geo-coding tools:
Geonames. [36]
QGIS (desktop GIS). [37]

Data science journals:
Data Mining and Knowledge Discovery: six issues published by Springer each year. Available as open access and non-open access articles. [38]
Data Science Journal: peer-reviewed open-access journal. [39]
Journal of Data Science: publishes international research articles on data science. Online access is free. [40]

KDnuggets data science-related content [2]:
Tutorials
Tools
Analytical software
Polls
Webcasts

References:
1. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets.html
2. KDnuggets. www.kdnuggets.com
3. DataHub. https://datahub.io/dataset
4. Kaggle. www.kaggle.com
5. HealthData.gov. http://www.healthdata.gov/content/about
6. Centers for Disease Control and Prevention. Public Use Data Files. http://www.cdc.gov/nchs/data_access/ftp_data.htm
7. Expert Health Data Programming. http://www.ehdp.com/vitalnet/datasets.htm
8. HSRIC. https://www.nlm.nih.gov/hsrinfo/datasites.html#488International
9. Vanderbilt Biostatistics Datasets. http://biostat.mc.vanderbilt.edu/wiki/Main/DataSets
10. MIMIC Critical Care Database. https://mimic.physionet.org
11. CMS Data Navigator. https://dnav.cms.gov/Default.aspx
12. An Introduction to Statistical Learning with Applications in R. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. Free download: http://www-bcf.usc.edu/~gareth/ISL/
13. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition. Hastie T, Tibshirani R, and Friedman J. Springer Series in Statistics, 2009. ISBN 978-0-387-84858-7. http://www-stat.stanford.edu/~tibs/ElemStatLearn/
14. Stat Trek. http://stattrek.com/tutorials/free-online-courses.aspx
15. Biostatistics. University of Florida. http://bolt.mph.ufl.edu
16. OnlineStatBook. http://onlinestatbook.com/2/index.html and iTunes.apple.com
17. OpenIntro Textbooks. www.OpenIntro.org
18. Statistics How To. www.statisticshowto.com
19. Stat Pages. http://statpages.info/
20. School of Data. http://schoolofdata.org/courses/
21. Class Central. https://www.class-central.com/subject/data-science
22. Data Science Academy. http://datascienceacademy.com/free-data-science-courses/
23. Udacity. https://www.udacity.com/courses/data-science
24. Oregon Health and Science University Free Healthcare Data Analytics Course. www.informaticsprofessor.blogspot.com
25. IBM Big Data University. https://bigdatauniversity.com
26. University of California, Berkeley Spreadsheet tutorials. http://multimedia.journalism.berkeley.edu/tutorials/spreadsheets/
27. Google Spreadsheets. https://sites.google.com/a/g.risd.org/training/RISD-Video-Tutorials/google-spreadsheet-tutorials
28. R Language. Tutorials Point. http://www.tutorialspoint.com/r/index.htm
29. R Tutorial by Kelly Black. http://www.cyclismo.org/tutorial/R/
30. The Python Tutorial. https://docs.python.org/2/tutorial/
31. Tutorials Point. Python. http://www.tutorialspoint.com/python/
32. SQL Zoo. http://sqlzoo.net/wiki/SQL_Tutorial
33. Tutorials Point. SQL. http://www.tutorialspoint.com/sql/
34. Import.io. https://www.import.io/
35. Google Chrome Extension Scraper. https://chrome.google.com/webstore/detail/scraper/mbigbapnjcgaffohmbkdlecaccepngjd?hl=en
36. Geonames.
37. QGIS. http://qgis.org/en/site/about/index.html
38. Data Mining and Knowledge Discovery. http://www.springer.com/computer/database+management+%26+information+retrieval/journal/10618
39. Data Science Journal. http://datascience.codata.org/
40. Journal of Data Science. http://www.jds-online.com/
Published on September 20, 2016 11:45


