Data Visualization

Given the extent to which statistics classes alienate undergraduates without necessarily providing a tangible benefit, and given the increasing availability of data and open-source tools with which to analyze data, I’ve decided to focus my undergraduate methods teaching on data literacy, data visualization, and exploratory data analysis. The result is my iTunes U course, “Data Literacy and Data Visualization.”

Some of the most useful resources I’ve found while creating the course are listed below.

Overviews

Design: How to Think About Visualization

All-in-One Examples: Data + Visualization

Visualization Tools

  • An easy-to-use wordcloud generator, Wordle (and a thoughtful essay on why wordclouds are bad, and an example of an alternative)
  • A more advanced and style-forward wordcloud generator, Tagxedo
  • The Overview project for visualizing relationships in large numbers of documents, with an example
  • Google’s easy-to-use Google Chart Tools API (helps to know HTML)
  • A basic plug-and-play online chart builder, Hohli
  • The JavaScript-based JSCharts site, which allows easy construction of basic charts and graphs
  • The glorious animated multicolored scatterplot engine, Gapminder.org
  • Google’s similar and equally awesome Public Data Explorer
  • A general data-visualization engine, IBM’s ManyEyes
  • The extremely easy-to-use and attractive Datawrapper reactive-chart website
  • Infogram, an online chart and graph maker. Free version has 30 different kinds of graphs; paid version has more features, including live update from JSON or Google Drive
  • Lyra, a really slick interactive visualization design environment (coming soon)
  • Plotly, a collaborative online visualization and data analysis tool with some handy APIs
  • The Tableau Public data-visualization tool [requires Windows]
  • The easy-to-use GPS Visualizer (requires longitude, latitude data)
  • The Flash-based map- and trend-generation engine, StatSilk
  • The Flash-based (and web-centric, but gorgeous) Flare [requires nontrivial compilation]
  • The cross-platform, open-source Gephi tool for visualizing networks and complex systems
  • The Cytoscape network visualization platform
  • The NodeXL network graphing tool [requires Windows]
  • The dead-simple and very impressive GunnMap world map visualization tool
  • The OpenHeatMap distribution heatmap site
  • The CartoDB site for creating dynamic, data-driven maps quickly and easily
  • The stunning TileMill program at Mapbox for visualizing data on maps, and some examples (tiered pricing includes free option)
  • The “free for now” ChartsBin world map creation tool
  • The easy-to-use, beautiful, and free GeoCommons map tool
  • Chart Chooser, a website for graphs and tables from Excel or PowerPoint templates
  • The Mondrian interactive-graph interface for creating graphs from ASCII, R, or database files
  • The Chartle tool (beta) for creating and exporting a variety of graphs and maps from Excel data
  • The Science of Science meta-tool for data analysis and visualization
  • A blog with relevant resources and links, Visualizing Data
  • Some information on flow maps, with source code (alpha version, far from user-friendly) and a demo
  • A fairly useful-looking web-based general data analysis tool, StatCrunch
  • Stunning graphics for the programming-oriented at processing.org
  • FF Chartwell, a typeface for creating simple graphs

JavaScript Libraries (knowledge of JavaScript required… but wow)

Data Resources

Dataset Archives

Presentation of Results

Resources for Connecting with R

  • The R Project
  • The R Commander graphic interface
  • Tom Short’s R Reference Card, with a great summary of some of the most useful commands in R
  • Tools for making LaTeX tables in R
  • The Rdatasets site, which catalogs all of the datasets available natively in R
  • A useful blog post on importing data of different formats into R
  • The rOpenSci catalog of R packages that interface with data repositories
  • Shiny, an R package for creating interactive graphics with no (non-R) programming required
  • healthvis, an R package for creating D3-enabled versions of some common graph types with no (non-R) programming required
  • rCharts, Ramnath Vaidyanathan’s JavaScript-fueled interactive-graphics package for R
  • The Plotly API for R (see Plotly, above, for full description)
  • ggvis, an R package for creating interactive graphics
  • D3Network, an R package for creating network, tree, dendrogram, and Sankey diagrams in D3
  • Quandl for R, a package for importing time series data from Quandl (above)
  • RExcel, a program that integrates R into Excel
  • Lubridate, an R package for handling dates (this will seem trivial unless you’ve tried to use dates in R)
  • The Mondrian data-visualization interface, which can pull data from R to create interactive graphs
  • The rdatamarket package for pulling data from DataMarket directly into R
  • R datasets on truly random (but generally interesting) topics at reddit
  • The incredible R Graph Gallery, with source code
  • The ggplot2 R library [now in maintenance mode; being phased out in favor of ggvis]
  • The ggmap R library, which plots latitude/longitude data on maps
  • A good example of how to make a heatmap (not geographical) with R’s heatmap function
  • A useful example of how to make 3D maps with R’s persp function
  • Two tutorials (here and here) on combining maps with data
  • A tutorial on how to turn time series data into calendar heatmaps in R
  • “Large Datasets and You,” a primer on big data in R by Matthew Blackwell and Maya Sen

Why Dataviz Isn’t Enough