titanic is an R package containing data sets providing information on the fate of passengers on the fatal maiden voyage of the ocean liner “Titanic”, with variables such as economic status (class), sex, age and survival. These data sets are often used as an introduction to machine learning on Kaggle. More details about the competition can be found here, and the original data sets can be found here.


You can install the latest released version from CRAN:
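
A minimal sketch of the CRAN route, assuming the package is published on CRAN under the name `titanic`. The final call only lists whatever data sets the installed package ships, so it makes no assumption about their names:

```r
# Install from CRAN (requires network access and a configured CRAN mirror).
install.packages("titanic")

# Attach the package and list the data sets it provides.
library(titanic)
data(package = "titanic")
```

Each data set listed can then be inspected by name with the usual tools, such as `head()` or `str()`.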


Or from GitHub with:

if (packageVersion("devtools") < "1.6") install.packages("devtools")
devtools::install_github("paulhendricks/titanic")

If you encounter a clear bug, please file a minimal reproducible example on GitHub.


To cite package ‘titanic’ in publications use:

Paul Hendricks (2015). titanic: Titanic Passenger Survival Data Set. R package version 0.1.0. https://github.com/paulhendricks/titanic

A BibTeX entry for LaTeX users is

@Manual{,
  title = {titanic: Titanic Passenger Survival Data Set},
  author = {Paul Hendricks},
  year = {2015},
  note = {R package version 0.1.0},
  url = {https://github.com/paulhendricks/titanic},
}