Dataset

Dataset[data]

represents a structured dataset based on a hierarchy of lists and associations.

Examples

Basic Examples  (1)

Create a simple Dataset with a list of associations:

Get the second row:

Get the second column:

Compute the Total of each column:
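The four steps above can be sketched as follows. This is a minimal illustration rather than the original example: the variable name `ds`, the column names "a" and "b", and the values are made up for the sketch.

```wolfram
(* Hypothetical sample data; column names and values are illustrative *)
ds = Dataset[{
    <|"a" -> 1, "b" -> 10|>,
    <|"a" -> 2, "b" -> 20|>,
    <|"a" -> 3, "b" -> 30|>}];

ds[2]         (* second row, an association-backed Dataset *)
ds[All, 2]    (* second column, addressed by position *)
ds[All, "b"]  (* the same column, addressed by name *)
ds[Total]     (* total of each column *)
```

Wrapping any of these results in Normal returns the underlying lists and associations, for instance `Normal[ds[Total]]` gives `<|"a" -> 6, "b" -> 60|>`.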

Scope  (1)

Create a Dataset object from tabular data:

Take a set of rows:

Take a specific row:

A row is merely an association:

Take a specific element from a specific row:

Take the contents of a specific column:

Take a specific part within a column:

Take a subset of the rows and columns:
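The part-extraction steps above might look like this in practice. The data is a small hypothetical table (the name `ds` and the columns "a", "b", "c" are assumptions made for the sketch), and each query corresponds to one of the captions in order.

```wolfram
(* Hypothetical tabular data: a list of associations with a shared set of keys *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x", "c" -> {1}|>,
    <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
    <|"a" -> 3, "b" -> "z", "c" -> {3}|>,
    <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>,
    <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>,
    <|"a" -> 6, "b" -> "z", "c" -> {}|>}];

ds[2 ;; 4]               (* a set of rows *)
ds[2]                    (* a specific row *)
Normal[ds[2]]            (* the row as a plain association *)
ds[2, "b"]               (* one element of one row: "y" *)
ds[All, "b"]             (* the contents of a column *)
ds[All, "b"][3]          (* a specific part within that column: "z" *)
ds[1 ;; 3, {"a", "b"}]   (* a subset of rows and columns *)
```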

Apply a function to the contents of a specific column:

Partition the dataset based on a column, applying further operators to each group:

Apply a function to each row:

Apply a function both to each row and to the entire result:

Apply a function f to every element in every row:

Apply functions to each column independently:

Construct a new table by specifying operators that will compute each column:

Use the same technique to rename columns:
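A sketch of the function-application steps above, under the assumption of a small hypothetical table (`ds` with columns "a", "b", "c"). The particular functions chosen here (Counts, Total, Head, ToUpperCase and the pure functions) are stand-ins for whatever operators a real query would use.

```wolfram
(* Hypothetical data, chosen only to illustrate the query forms *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x", "c" -> {1}|>,
    <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
    <|"a" -> 3, "b" -> "z", "c" -> {3}|>,
    <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>,
    <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>,
    <|"a" -> 6, "b" -> "z", "c" -> {}|>}];

ds[Counts, "b"]                  (* function applied to a column's contents *)
ds[GroupBy["b"], Total, "a"]     (* group by "b", then total "a" within each group *)
ds[All, Length[#c] &]            (* function applied to each row *)
ds[Total, Length[#c] &]          (* per-row function, then Total over all results: 9 *)
ds[All, All, Head]               (* function applied to every element in every row *)
ds[All, {"a" -> (#^2 &), "b" -> ToUpperCase}]   (* independent functions per column *)

(* New table whose columns are computed from each row *)
ds[All, <|"aSquared" -> (#a^2 &), "cLength" -> (Length[#c] &)|>]

(* Same technique with plain keys on the right-hand side renames columns *)
ds[All, <|"x" -> "a", "y" -> "b"|>]
```

In the named-slot functions, `#a` and `#c` refer to the values under the keys "a" and "c" of the row association.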

Select specific rows based on a criterion:

Take the contents of a column after selecting the rows:

Take a subset of the available columns after selecting the rows:

Take the first row satisfying a criterion:

Take a value from this row:

Sort the rows by a criterion:

Take the rows that give the maximal value of a scoring function:

Give the top 3 rows according to a scoring function:

Delete rows that duplicate a criterion:

Compose an ascending and a descending operator to aggregate values of a column after filtering the rows:

Do the same thing by applying Total after the query:
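The selection, sorting and aggregation steps above can be sketched on a hypothetical table as follows. The name `ds`, its columns, and the criterion `#a > 3` are assumptions made for illustration.

```wolfram
(* Hypothetical data used only for this sketch *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x", "c" -> {1}|>,
    <|"a" -> 2, "b" -> "y", "c" -> {2, 3}|>,
    <|"a" -> 3, "b" -> "z", "c" -> {3}|>,
    <|"a" -> 4, "b" -> "x", "c" -> {4, 5}|>,
    <|"a" -> 5, "b" -> "y", "c" -> {5, 6, 7}|>,
    <|"a" -> 6, "b" -> "z", "c" -> {}|>}];

ds[Select[#a > 3 &]]                   (* rows satisfying a criterion *)
ds[Select[#a > 3 &], "c"]              (* one column of the selected rows *)
ds[Select[#a > 3 &], {"a", "b"}]       (* several columns of the selected rows *)
ds[SelectFirst[#a > 3 &]]              (* first row satisfying the criterion *)
ds[SelectFirst[#a > 3 &]]["c"]         (* a value taken from that row *)
ds[SortBy[#b &]]                       (* rows sorted by a criterion *)
ds[MaximalBy[Length[#c] &]]            (* rows maximizing a scoring function *)
ds[TakeLargestBy[Length[#c] &, 3]]     (* top 3 rows by the same score *)
ds[DeleteDuplicatesBy[#b &]]           (* drop rows that repeat a value of "b" *)

(* Select filters the rows on the way down; Total aggregates "a" on the way up *)
ds[Select[#a > 3 &] /* Total, "a"]     (* 15 *)

(* The same aggregation, with Total applied as a follow-up query *)
ds[Select[#a > 3 &], "a"][Total]       (* 15 *)
```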

Options  (42)

Alignment  (2)

Right-align all items:
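As a minimal sketch of the overall setting, the Alignment option can be passed when the Dataset is constructed. The sample data, including the "name" and "age" columns, is hypothetical, and the formatting options in this section are assumed to require a reasonably recent version of the Wolfram Language.

```wolfram
(* Hypothetical data; Alignment -> Right applies to every item *)
data = {
    <|"name" -> "Ada", "age" -> 36|>,
    <|"name" -> "Grace", "age" -> 45|>};

Dataset[data, Alignment -> Right]
```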

Right-align the "age" column:

Background  (10)

Give all dataset items a pink background:
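A minimal sketch of the overall background setting, using hypothetical sample data. A single color given to Background is applied to all items.

```wolfram
(* Hypothetical data; one color sets the background of every item *)
data = {
    <|"name" -> "Ada", "age" -> 36|>,
    <|"name" -> "Grace", "age" -> 45|>};

Dataset[data, Background -> Pink]
```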

Make the first row pink:

Make the first column pink:

Make one item pink:

Pink and gray backgrounds for the first two rows:

An equivalent syntax:

Alternating pink and gray rows:

Alternating pink and gray columns with the first and last columns yellow:

Alternating pink and gray columns with the third column yellow:

Blending of colors:

Set background color by value:

Set background color by position:

Use patterns in positions:

DatasetTheme  (4)

Use a theme with alternating row backgrounds:

Combine themes for a customized presentation:

Use a theme to emphasize low-level groupings:

Use a theme to stripe long rows and columns to make them easier to follow:

HeaderAlignment  (2)

Right-align row headers:

Center column headers:

HeaderBackground  (2)

Make headers pink:

Pink row headers and cyan column headers:

HeaderDisplayFunction  (1)

Lowercase headers:

HeaderSize  (1)

Make headers a fixed number of characters tall:

Make row headers a fixed number of characters wide:

HeaderStyle  (4)

Set one overall style for headers:

Use Directive to wrap multiple style directives:
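The two header-style settings above might be written as follows. The data and the particular styles (Bold, Blue) are illustrative choices, not taken from the original examples.

```wolfram
(* Hypothetical data for the sketch *)
data = {
    <|"name" -> "Ada", "age" -> 36|>,
    <|"name" -> "Grace", "age" -> 45|>};

Dataset[data, HeaderStyle -> Bold]                    (* one overall header style *)
Dataset[data, HeaderStyle -> Directive[Bold, Blue]]   (* several directives combined *)
```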

Use a style from the current stylesheet:

Style specific headers:

HiddenItems  (4)

Hide a row:

Hide a column:

Hide a row except for one item:

Hide items with a given value:

ItemDisplayFunction  (1)

Frame each item:

Replace strings with symbols:

ItemSize  (3)

Make each item a fixed number of character widths wide:

Make one tall row:

Make one narrow column:

ItemStyle  (6)

Set one overall style for dataset items:

Use Directive to wrap multiple style directives:
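Item styles follow the same pattern as header styles. In this sketch the data and the chosen styles (Italic, Gray) are assumptions made for illustration.

```wolfram
(* Hypothetical data for the sketch *)
data = {
    <|"name" -> "Ada", "age" -> 36|>,
    <|"name" -> "Grace", "age" -> 45|>};

Dataset[data, ItemStyle -> Italic]                    (* one overall item style *)
Dataset[data, ItemStyle -> Directive[Italic, Gray]]   (* several directives combined *)
```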

Use a style from the current stylesheet:

Style specific elements:

Style items by value:

Style items by position:

MaxItems  (2)

Display up to a given number of rows:

Display up to a given number of rows and columns:
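A sketch of both MaxItems forms, on generated data. The table is hypothetical, and the two-element form is assumed to limit rows first and columns second. MaxItems only affects how much is displayed; the underlying data is unchanged.

```wolfram
(* Hypothetical 20-row table with three numeric columns *)
data = Table[<|"n" -> i, "square" -> i^2, "cube" -> i^3|>, {i, 20}];

Dataset[data, MaxItems -> 5]        (* show at most 5 rows *)
Dataset[data, MaxItems -> {5, 2}]   (* show at most 5 rows and 2 columns *)
```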

Applications  (3)

Tables (Lists of Associations)  (1)

Load a dataset of passengers of the Titanic:

Get the number of rows:

Get a random sample of passengers:

The underlying data is a list of associations:

Count the number of passengers with a missing age:

Count the number of passengers in 1st, 2nd and 3rd class:

Get a histogram of passenger ages:

Get a histogram of passenger ages, grouped by passenger class:

Find the age of the oldest passenger:

Calculate the overall survival ratio:

Show the survival ratio against sex and passenger class:

Show the survival ratio as a function of age:
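Several of the Titanic queries above can be sketched as follows. The sketch assumes the dataset is available through ExampleData under the name shown, with columns "class", "age", "sex" and "survived" (a Boolean), and that missing ages have head Missing. The histogram bin specification is an arbitrary choice.

```wolfram
(* Assumes the example data and the column names described above *)
titanic = ExampleData[{"Dataset", "Titanic"}];

Length[titanic]                                  (* number of rows *)
RandomSample[titanic, 5]                         (* random sample of passengers *)
Normal[titanic[1 ;; 2]]                          (* underlying list of associations *)
titanic[Select[MissingQ[#age] &] /* Length]      (* passengers with a missing age *)
titanic[Counts, "class"]                         (* passengers per class *)
titanic[Histogram[#, {0, 80, 4}] &, "age"]       (* histogram of ages *)
titanic[GroupBy["class"], Histogram[#, {0, 80, 4}] &, "age"]   (* one histogram per class *)
titanic[Max, "age"]                              (* age of the oldest passenger *)
titanic[N @* Mean, Boole[#survived] &]           (* overall survival ratio *)

(* Survival ratio broken down by sex, then by passenger class *)
titanic[GroupBy["sex"], GroupBy["class"], N @* Mean, Boole[#survived] &]
```

Converting the Boolean "survived" value to 0 or 1 with Boole lets Mean compute the ratio directly.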

Indexed Tables (Associations of Associations)  (1)

Load a dataset of planets and their properties:

Look up the mass of the Earth:

Get the subtable corresponding to moons of a specific planet:

Produce a dataset of the radii of the planets:

Visualize the radii of the planets:

Produce a dataset of the number of moons of each planet:

Obtain a list of the planets and their masses, sorted by increasing mass:

Find the total mass of each planet's moons:

Obtain a list of only those moons that have a mass larger than half that of Earth's moon:

Find the heaviest moon of each planet:

Obtain a list of all planetary moons:

Make a scatter plot of the mass against radius:

Calculate and make a histogram of the densities:

Compute the mean density for the moons of each planet:

Create a table comparing the density of each planet with the mean density of its moons:
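The first few planet queries above might look like this. The sketch assumes the dataset is available through ExampleData under the name shown, that it is keyed by planet names such as "Earth" and "Jupiter", and that each planet has "Mass", "Radius" and "Moons" entries, where "Moons" is itself an association of moons with their own "Mass" and "Radius".

```wolfram
(* Assumes the example data and the key names described above *)
planets = ExampleData[{"Dataset", "Planets"}];

planets["Earth", "Mass"]               (* mass of the Earth *)
planets["Jupiter", "Moons"]            (* subtable of one planet's moons *)
planets[All, "Radius"]                 (* radii of the planets *)
planets[All, "Moons", Length]          (* number of moons of each planet *)
planets[All, "Moons", Total, "Mass"]   (* total mass of each planet's moons *)
```

Because the outer structure is an association, rows are addressed by key ("Earth") rather than by position.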

Hierarchical Data (Associations of Associations of Other Data)  (1)

Load a dataset associating countries to their administrative divisions to their populations:

The underlying data is an association whose keys are countries and whose values are further associations between administrative divisions and their populations:

Look up the populations for a specific country:

Give the total population (not all countries in the world are included in this dataset):

Count the number of divisions within each country:

Total the number of divisions:

Build a histogram of the number of divisions:

Calculate the total population of each country by adding the populations of each division:

Obtain the five most populous countries:

Obtain the most populous divisions for each country:

Correlate the number of divisions a country has with its population:

The underlying data being passed to ListLogPlot is an association of lists, each of length 2:
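The same query shapes can be tried on a toy hierarchy. The data below is entirely hypothetical (placeholder country and division names with made-up populations) and stands in for the real dataset only to show how operators line up with the levels of a nested association.

```wolfram
(* Toy data: countries -> administrative divisions -> populations (all values made up) *)
pops = Dataset[<|
    "CountryA" -> <|"North" -> 100, "South" -> 250|>,
    "CountryB" -> <|"East" -> 40, "West" -> 60, "Central" -> 300|>|>];

pops["CountryA"]                  (* populations for one country *)
pops[Total, Total]                (* total population: 750 *)
pops[All, Length]                 (* number of divisions in each country *)
pops[Total, Length]               (* total number of divisions: 5 *)
pops[All, Total]                  (* population of each country *)
pops[All, Total][TakeLargest[1]]  (* the most populous country *)
pops[All, TakeLargest[1]]         (* the most populous division of each country *)
```

The first operator acts on the whole association of countries and the second on each country's association of divisions, so `pops[Total, Total]` sums within each country and then across countries.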

Properties & Relations  (4)

Query is the operator form of the query language supported by Dataset:
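As a small illustration of this relationship, the same operators can be given either to a Dataset or to Query, which is then applied to ordinary lists and associations. The sample data is hypothetical.

```wolfram
(* Hypothetical data held as a plain list of associations *)
data = {<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>};

Query[All, "a"][data]             (* {1, 3} *)
Normal[Dataset[data][All, "a"]]   (* the same result via Dataset: {1, 3} *)
Query[Total, "b"][data]           (* 6 *)
```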

Use EntityValue to obtain a Dataset of the properties of Entity objects from the Wolfram Knowledgebase:

Plot the boiling point versus density on a log plot:

Use SemanticImport to import a file as a Dataset:

Calculate the total quantity of sales:

Obtain a small sample of the Titanic passenger dataset:

Export the sample as "JSON" format:

Data with named columns can be more compactly represented if it is first transposed:

Export the sample as "CSV":

Possible Issues  (3)

Data without a consistent structure will not usually format in the same way as structured data:

If a sub-operation of a query is inferred to fail, the entire query will not be performed, and a Failure object will be returned:

By default, if any messages are generated during an operation on a Dataset, a Failure object will be returned:

To specify different behavior, use an explicit Query expression in conjunction with the option FailureAction:
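A hedged sketch of the two behaviors, using a hypothetical dataset in which one element triggers a division-by-zero message. The setting FailureAction -> None is assumed here to mean that messages are ignored and the query proceeds. Consult the Query documentation for the full list of settings.

```wolfram
(* Hypothetical data; 1/0 issues a message during the query *)
ds = Dataset[{1, 0, 2}];

(* Default behavior: the message causes a Failure object to be returned *)
ds[All, 1/# &]

(* Explicit Query with FailureAction; the value None is an assumed setting *)
Query[All, 1/# &, FailureAction -> None][ds]
```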

Neat Examples  (2)

Calculate the survival likelihood of the characters Jack Dawson and Rose DeWitt Bukater from the movie Titanic by matching them with real passengers:

Make an interactive control that plots a heat map of a country's divisions when a country is selected:

Wolfram Research (2014), Dataset, Wolfram Language function, https://reference.wolfram.com/language/ref/Dataset.html (updated 2021).
