California Housing

| Attribute | Value |
| --- | --- |
| pretty_name | California Housing |
| annotations_creators | |
| language_creators | |
| languages | |
| licenses | CC0 |
| multilinguality | |
| size_categories | 10k<n<100k |
| source_datasets | |
| task_categories | |
| task_ids | |
| paperswithcode_id | |

Dataset Description

Dataset Summary

Tabular data containing California housing prices from the 1990 census. See also the Kaggle description of this dataset.

Download and prepare data

The dataset can be loaded directly via the squirrel Catalog API. Install squirrel-datasets-core with pip, which registers this dataset in the catalog, then load the data with the following code:

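from squirrel.catalog import Catalog

# Build a catalog from all installed squirrel plugins.
plugin_catalog = Catalog.from_plugins()
# Look up the dataset, get its driver, and obtain an iterator over the samples.
it = plugin_catalog["california_housing"].get_driver().get_iter()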
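To inspect a few records without reading the whole dataset, something like the following sketch may help. It assumes the iterator yields one dict per row, shaped like the sample under Data Instances. The `peek` helper and the `stand_in` rows are hypothetical, included only so the snippet runs without squirrel installed; in practice the iterator `it` from the loading code would take the place of `stand_in`.

```python
from itertools import islice


def peek(samples, n=3):
    """Return up to the first n items of an iterable as a list."""
    return list(islice(samples, n))


# Stand-in rows (assumption): the first is the sample shown in this card,
# reduced to three fields; the second is made up purely for illustration.
stand_in = iter([
    {"latitude": 37.81, "housingMedianAge": 52.0, "medianHouseValue": 76100.0},
    {"latitude": 34.0, "housingMedianAge": 30.0, "medianHouseValue": 100000.0},
])

first = peek(stand_in, n=1)
print(first[0]["medianHouseValue"])
```

Because `islice` only pulls the requested number of items, the remaining rows stay available on the iterator for later processing.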

Dataset Structure

Data Instances

A sample from the training set is provided below:

{
    'longitude': -122.3,
    'latitude': 37.81,
    'housingMedianAge': 52.0,
    'totalRooms': 1224.0,
    'totalBedrooms': 237.0,
    'population': 521.0,
    'households': 159.0,
    'medianIncome': 1.191,
    'medianHouseValue': 76100.0
}

Dataset Schema

All features are continuous floating-point values.
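A quick way to check a record against this schema is sketched below. The field names are taken from the sample under Data Instances; the `non_float_fields` helper is hypothetical and not part of squirrel.

```python
# Field names as they appear in the sample record of this card.
EXPECTED_FIELDS = (
    "longitude", "latitude", "housingMedianAge", "totalRooms",
    "totalBedrooms", "population", "households", "medianIncome",
    "medianHouseValue",
)


def non_float_fields(sample):
    """Return the expected fields that are missing or not of type float."""
    return [k for k in EXPECTED_FIELDS if not isinstance(sample.get(k), float)]


sample = {
    "longitude": -122.3, "latitude": 37.81, "housingMedianAge": 52.0,
    "totalRooms": 1224.0, "totalBedrooms": 237.0, "population": 521.0,
    "households": 159.0, "medianIncome": 1.191, "medianHouseValue": 76100.0,
}
print(non_float_fields(sample))  # → []

# A value stored as a string is flagged.
as_string = dict(sample, longitude="-122.300000")
print(non_float_fields(as_string))  # → ['longitude']
```

Running such a check on the first few records can catch values that were serialized as strings before they reach numeric code.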

Data Splits

| name | samples |
| --- | --- |
| California housing | 20,640 |