Tutorial by Examples: dataset

The built-in mtcars data frame contains information about 32 cars, including their weight, fuel efficiency (in miles per gallon), speed, etc. (To find out more about the dataset, use help(mtcars).) If we are interested in the relationship between fuel efficiency (mpg) and weight (wt), we may start p...
For ease of testing, sklearn provides some built-in datasets in the sklearn.datasets module. For example, let's load Fisher's iris dataset: import sklearn.datasets iris_dataset = sklearn.datasets.load_iris() iris_dataset.keys() ['target_names', 'data', 'target', 'DESCR', 'feature_names'] You can ...
When it comes to geographic data, R proves to be a powerful tool for data handling, analysis and visualisation. Often, spatial data is available as an XY coordinate data set in tabular form. This example shows how to create a spatial data set from an XY data set. The packages rgdal and sp provi...
Logistic regression is a particular case of the generalized linear model, used to model dichotomous outcomes (probit and complementary log-log models are closely related). The name comes from the link function used, the logit or log-odds function. The inverse function of the logit is called the lo...
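The relationship between the logit link and its inverse can be checked numerically. The following is a minimal sketch in plain Python, not the code of the original example; the function names `logit` and `inverse_logit` are chosen here for illustration. The inverse of the logit is commonly called the logistic (or sigmoid) function.

```python
import math


def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Inverse of the logit: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # Round trip: applying the inverse to the log-odds recovers p.
    for p in (0.1, 0.5, 0.9):
        x = logit(p)
        print(p, x, inverse_logit(x))
```

A probability of 0.5 corresponds to log-odds of 0, which is why a linear predictor of 0 in a logistic regression means that both outcomes are equally likely.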
--dataset schemas must be identical SELECT 'Data1' as 'Column' UNION ALL SELECT 'Data2' as 'Column' UNION ALL SELECT 'Data3' as 'Column' UNION ALL SELECT 'Data4' as 'Column' UNION ALL SELECT 'Data5' as 'Column' EXCEPT SELECT 'Data3' as 'Column' --Returns Data1, Data2, Data4, and Data5
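The query above can be tried without a database server. The sketch below runs an equivalent statement against an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is an assumption made here for convenience (the original does not name its database engine), and the column alias `col` and the function name `run_except_query` are illustrative.

```python
import sqlite3

# Compound SELECTs in SQLite group from left to right, so the EXCEPT
# is applied to the combined result of all the UNION ALL branches.
QUERY = """
SELECT 'Data1' AS col
UNION ALL SELECT 'Data2'
UNION ALL SELECT 'Data3'
UNION ALL SELECT 'Data4'
UNION ALL SELECT 'Data5'
EXCEPT
SELECT 'Data3'
"""


def run_except_query():
    """Run the compound query and return the remaining values, sorted."""
    conn = sqlite3.connect(":memory:")
    try:
        rows = conn.execute(QUERY).fetchall()
    finally:
        conn.close()
    return sorted(row[0] for row in rows)


if __name__ == "__main__":
    print(run_except_query())
```

Note that EXCEPT also removes duplicate rows from its result, which matters when the left-hand side contains repeated values.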
Given below is a simple example to train a Caffe model on the Iris data set in Python, using PyCaffe. It also gives the predicted outputs given some user-defined inputs. iris_tuto.py import subprocess import platform import copy from sklearn.datasets import load_iris import sklearn.metrics ...
Caffe has a built-in input layer tailored for image classification tasks (i.e., a single integer label per input image). This input "Data" layer is built upon an lmdb or leveldb data structure. To use the "Data" layer, one has to construct the data structure with all training d...
From dataframe: mtrdd <- createDataFrame(sqlContext, mtcars) From csv: For CSV files, you need to add the csv package to the environment before initiating the Spark context: Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shel...
Method 1: proc sql; create table foo like sashelp.class; quit; Method 2: proc sql; create table bar as select * from sashelp.class (obs=0); quit; Method 1 should be the preferred option.
To see the built-in data sets of the dplyr package: data(package = "dplyr") There is no need to load the package first.
Import libraries (language dependency: python 2.7) import tensorflow as tf import numpy as np from sklearn.datasets import fetch_mldata from sklearn.model_selection import train_test_split load data, prepare data mnist = fetch_mldata('MNIST original', data_home='./') print "MNIST data,...
In [1]: import pandas as pd import numpy as np In [2]: df = pd.DataFrame(np.random.choice(['foo','bar','baz'], size=(100000,3))) df = df.apply(lambda col: col.astype('category')) In [3]: df.head() Out[3]: 0 1 2 0 bar foo baz 1 baz bar baz 2 foo foo b...
The Iris flower data set is a widely used data set for demonstration purposes. We will load it, inspect it and slightly modify it for later use. import java.io.File; import java.net.URL; import weka.core.Instances; import weka.core.converters.ArffSaver; import weka.core.converters.CSVLoader; i...
The basic use of fit is best explained by a simple example: f(x) = a + b*x + c*x**2 fit [-234:320][0:200] f(x) 'measured.dat' using 1:2 skip 4 via a,b,c plot 'measured.dat' u 1:2, f(x) Ranges may be specified to filter the data used in fitting. Out-of-range data points are ignored. (T. W...
On big data sets, such as crawled websites or millions of tweets, a plain call to grepl("fox", test_sentences) does not perform well. A first speed-up is the perl = TRUE option. Even faster is the option fixed = TRUE. A complete example would be: ...
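The idea behind fixed = TRUE, matching a literal string instead of a regular expression, carries over to other languages. The sketch below is a Python analogue rather than the R example itself; the helper names and the sample sentences are made up for illustration. It returns one boolean per sentence, as grepl does.

```python
import re


def contains_fixed(sentences, needle):
    """Literal substring test per sentence, with no regex engine involved."""
    return [needle in sentence for sentence in sentences]


def contains_regex(sentences, pattern):
    """The same test through the regex engine, with the pattern compiled once."""
    compiled = re.compile(pattern)
    return [compiled.search(sentence) is not None for sentence in sentences]


if __name__ == "__main__":
    # Hypothetical sample data; the original test_sentences are not shown.
    sample = [
        "The quick brown fox jumps over the lazy dog.",
        "No animals are mentioned here.",
        "A foxhound is a kind of dog.",
    ]
    print(contains_fixed(sample, "fox"))
    print(contains_regex(sample, "fox"))
```

For a pattern without metacharacters both functions give the same answer, so the literal version is a safe substitute whenever no regex features are needed.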
data newclass(keep=first_name sex weight yearborn); set sashelp.class(drop=height rename=(name=first_name)); yearborn=year(date())-age; if yearborn >2002; run; Data specifies the target data set. Keep option specifies columns to print to target. Set specifies source data set. Drop s...
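For readers who do not use SAS, the order of operations in the data step above can be mirrored in plain Python. This is a rough sketch under stated assumptions, not a translation tool: rows are represented as dictionaries, the function name `build_newclass` and the sample rows are invented here, and the current year is passed in explicitly so the result does not depend on the day the code runs.

```python
from datetime import date


def build_newclass(rows, current_year=None):
    """Mimic the data step: drop, rename, derive, filter, then keep."""
    if current_year is None:
        current_year = date.today().year  # plays the role of year(date())
    keep = ("first_name", "sex", "weight", "yearborn")
    result = []
    for row in rows:
        # drop=height is applied while reading the source data set
        record = {k: v for k, v in row.items() if k != "height"}
        # rename=(name=first_name)
        record["first_name"] = record.pop("name")
        # yearborn=year(date())-age
        record["yearborn"] = current_year - record["age"]
        # Subsetting IF: only rows that pass the test are written out
        if record["yearborn"] > 2002:
            # keep= limits the columns written to the target data set
            result.append({k: record[k] for k in keep})
    return result


if __name__ == "__main__":
    # Made-up rows with the same column names as sashelp.class
    source = [
        {"name": "Ann", "sex": "F", "age": 12, "height": 60.0, "weight": 90.0},
        {"name": "Bob", "sex": "M", "age": 15, "height": 65.0, "weight": 110.0},
    ]
    print(build_newclass(source, current_year=2016))
```

The age column is available while yearborn is computed but is absent from the output, because it is not listed in keep.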
R has a vast collection of built-in datasets. Usually, they are used for teaching purposes to create quick and easily reproducible examples. There is a nice web page listing the built-in datasets: https://vincentarelbundock.github.io/Rdatasets/datasets.html Example Swiss Fertility and Socioecono...
There are packages that include data or are created specifically to disseminate datasets. When such a package is loaded (library(pkg)), the attached datasets either become available as R objects or need to be loaded with the data() function. Gapminder A nice dataset on the development of c...
