Datasets (22)
Sites:
» DELVE - Data for Evaluating Learning in Valid Experiments 
Data for Evaluating Learning in Valid Experiments: a standardized environment designed to evaluate the performance of methods that learn relationships based primarily on empirical data. Delve makes it possible for users to compare their learning methods with ...
http://www.cs.utoronto.ca/~delve/
» Dataset generator 
Datgen, formerly SCDS, is a computer program that generates data to systematically test programs that consume data. These synthetic datasets can be used to validate learning algorithms.
http://www.datgen.com/
» Face recognition dataset 
A dataset of face images for face recognition algorithms.
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/avrim/www/ML94/face_homework.html
» HS3D - Homo Sapiens Splice Sites Dataset 
HS3D (Homo Sapiens Splice Sites Dataset) is a database of Homo sapiens exon, intron and splice regions extracted from GenBank primate sequences, Rel. 123. The aim of this data set is to give standardized material to train and to assess the prediction accu
http://www.sci.unisannio.it/docenti/rampone/
» NIST Special Database 4
This NIST database of fingerprint images contains 2000 8-bit grayscale fingerprint image pairs.
http://www.nist.gov/srd/nistsd4.htm
» National Space Science Data Center 
Provides access to a wide variety of astrophysics, space physics, solar physics, lunar and planetary data from NASA space flight missions, in addition to selected other data and some models and software.
http://nssdc.gsfc.nasa.gov/
» Penn Treebank Project 
A corpus of parsed sentences, used by many researchers to train data-driven parsing algorithms.
http://www.cis.upenn.edu/~treebank/
» TREC Data 
Text datasets used in information retrieval and learning in text domains.
http://trec.nist.gov/data.html
» The 20 Newsgroups Data Set 
The 20 Newsgroups collection, a widely used dataset for text categorization.
http://www.ai.mit.edu/~jrennie/20_newsgroups/
» The RCSB Protein Data Bank (PDB) 
Archive of experimentally determined 3-D structures of biological macromolecules, from the Brookhaven National Laboratory.
http://www.rcsb.org/pdb/
» Time Series Data Library 
A collection of over 500 time series, maintained by Rob Hyndman. Time series are organized by subject.
http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/
» UCI Machine Learning Repository 
A repository of databases, domain theories and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.
http://www.ics.uci.edu/~mlearn/MLRepository.html
» University of Maryland, INFORUM EconData 
Several hundred thousand economic time series, produced by the U.S. Government and distributed by the government in a variety of formats and media, have been put into a standard, highly efficient, easy-to-use form for personal computers.
http://www.inforum.umd.edu/Econdata.html
» Web->KB dataset 
Web pages partitioned into classes, with hyperlink data. The dataset has been used for text categorization and learning to extract symbolic knowledge from the World Wide Web.
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
» WordSimilarity-353 Test Collection 
Contains 353 English word pairs along with human-assigned similarity judgements.
http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/wordsim353.html
Last Updated: 2006-10-10 05:02:39
The content of this directory is based on the Open Directory and has been modified by GoSearchFor.com