Free Datasets

If you work with statistical programming long enough, you're going to want to find more data to work with, either to practice on or to augment your own research. Here are a handful of sources for data to work with. All of the datasets listed here are free for download. If you want more, it's easy enough to do a search.

World Bank Data - Literally hundreds of datasets spanning many decades, sortable by topic or country. Data is downloadable in Excel or XML formats, or you can make API calls.
Gapminder - Hundreds of datasets on world health, economics, population, etc. All of it is viewable online within Google Docs, and downloadable as spreadsheets.
Most of these datasets come from the government.
Kaggle - Kaggle is a site that hosts data mining competitions. Each competition provides a data set that's free for download.
SNAP - Stanford's Large Network Dataset Collection. This list has several datasets related to social networking.
Several datasets related to social networking & Wikipedia.
Energy Information Administration - This site offers a number of datasets on energy production, consumption, sources, etc.
Million Song Dataset - This is a collection of audio features and metadata for a million contemporary popular music tracks.
GeoDa Center - This is a collection of geospatial datasets offered by Arizona State University's Center for Geospatial Analysis & Computation.
Quandl - This is a web-based front end to a number of public data sets. What's nice about this website is that it allows for the combination of data from a number of sources, and can export the data in a number of formats.
1,001 Datasets - This is a list of lists of datasets. There's not much organization here, but there really are a LOT of datasets.
Reddit Datasets - This last one isn't a dataset itself, but rather a social news site devoted to datasets. It's updated regularly with news about newly available datasets.
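
The World Bank entry above mentions API calls. As a rough illustration of what that can look like, here is a minimal Python sketch. It is not taken from this page: the base URL, the `format=json` and `per_page` query parameters, the two-element `[metadata, records]` response layout, and the indicator code `SP.POP.TOTL` (total population) are all assumptions about the World Bank's public Indicators API and should be checked against its documentation. The helper names `build_url`, `parse_series`, and `fetch_series` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the World Bank Indicators API (not stated on this page).
BASE_URL = "https://api.worldbank.org/v2"


def build_url(country, indicator, per_page=100):
    """Build a request URL for one country and one indicator (hypothetical helper)."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_series(payload):
    """Convert the assumed [metadata, records] payload into {date: value}.

    Records whose value is missing are skipped. Anything that does not
    look like the assumed layout yields an empty dict.
    """
    if not isinstance(payload, list) or len(payload) < 2 or payload[1] is None:
        return {}
    return {
        row["date"]: row["value"]
        for row in payload[1]
        if row.get("value") is not None
    }


def fetch_series(country, indicator):
    """Download and parse one series. Requires network access."""
    with urlopen(build_url(country, indicator), timeout=30) as response:
        return parse_series(json.load(response))
```

Under those assumptions, `fetch_series("BR", "SP.POP.TOTL")` would return a dictionary keyed by year string. Parsing is kept separate from downloading so the parsing step can be tried on a saved response without touching the network.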