Data Visualization

Given the extent to which statistics classes alienate undergraduates without necessarily providing a tangible benefit, and given the increasing availability of data and open-source tools with which to analyze data, I’ve decided to focus my undergraduate methods teaching on data literacy, data visualization, and exploratory data analysis. The result is my iTunes U course, “Data Literacy and Data Visualization.”

Some of the most useful resources I’ve found while creating the course are listed below.



Design: How to Think About Visualization

All-in-One Examples: Data + Visualization

Visualization Tools

  • An easy-to-use wordcloud generator, Wordle (and a thoughtful essay on why wordclouds are bad, and an example of an alternative)
  • A more advanced and style-forward wordcloud generator, Tagxedo
  • The Overview project for visualizing relationships in large numbers of documents, with an example
  • Google’s easy-to-use Google Chart Tools API (helps to know HTML)
  • A basic plug-and-play online chart builder, Hohli
  • The JavaScript-based JSCharts site, which allows easy construction of basic charts and graphs
  • The glorious animated multicolored scatterplot engine,
  • Google’s similar and equally awesome Public Data Explorer
  • A general data-visualization engine, IBM’s ManyEyes
  • The extremely easy-to-use and attractive Datawrapper reactive-chart website
  • Lyra, a really slick interactive visualization design environment (coming soon)
  • Plotly, a collaborative online visualization and data analysis tool with some handy APIs
  • The Tableau Public data-visualization tool [requires Windows]
  • The easy-to-use GPS Visualizer (requires longitude, latitude data)
  • The Flash-based map- and trend-generation engine, StatSilk
  • The Flash-based (and web-centric, but gorgeous) Flare [requires nontrivial compilation]
  • The cross-platform, open-source Gephi tool for visualizing networks and complex systems
  • The Cytoscape network visualization platform
  • The NodeXL network graphing tool [requires Windows]
  • The dead-simple and very impressive GunnMap world map visualization tool
  • The OpenHeatMap distribution heatmap site
  • The CartoDB site for creating dynamic, data-driven maps quickly and easily
  • The stunning Tilemill program at Mapbox for visualizing data on maps, and some examples (tiered pricing includes free option)
  • The “free for now” ChartsBin world map creation tool
  • The easy-to-use, beautiful, and free GeoCommons map tool
  • Chart Chooser, a website for graphs and tables from Excel or PowerPoint templates
  • The Mondrian interactive-graph interface for creating graphs from ASCII, R, or database files
  • The Chartle tool (beta) for creating and exporting a variety of graphs and maps from Excel data
  • The Science of Science meta-tool for data analysis and visualization
  • A blog with relevant resources and links, Visualizing Data
  • Some information on flow maps, with source code (alpha version, far from user-friendly) and a demo
  • A fairly useful-looking web-based general data analysis tool, StatCrunch
  • Stunning graphics for the programming-oriented at
  • FF Chartwell, a typeface for creating simple graphs

JavaScript Libraries (knowledge of JavaScript required… but wow)

  • The flat-out-jawdropping Data-Driven Documents (or D3 for short)
  • Raphaël, a simple library for impressive vector graphics
  • Crossfilter, a D3 library for creating dynamic views of different dimensions of a dataset
  • The free amCharts JavaScript bundle
  • Tangle, a library that allows reactive visualization of the results of complex interactions or equations
  • Polymaps, a mapping library designed around data visualization
  • Kartograph, a Python library for really impressive interactive map visualizations

Data Resources

Dataset Archives

Presentation of Results

Resources for Connecting with R

  • The R Project
  • The R Commander graphic interface
  • Tom Short’s R Reference Card, with a great summary of some of the most useful commands in R
  • Tools for making LaTeX tables in R
  • The Rdatasets site, which catalogs all of the datasets available natively in R
  • A useful blog post on importing data of different formats into R
  • The rOpenSci catalog of R packages that interface with data repositories
  • Shiny, an R package for creating interactive graphics with no (non-R) programming required
  • healthvis, an R package for creating D3-enabled versions of some common graph types with no (non-R) programming required
  • rCharts, Ramnath Vaidyanathan’s JavaScript-fueled interactive-graphics package for R
  • Quandl for R, a package for importing time series data from Quandl (above)
  • Lubridate, an R package for handling dates (this will seem trivial unless you’ve tried to use dates in R)
  • The Mondrian data-visualization interface, which can pull data from R to create interactive graphs
  • The rdatamarket package for pulling data from DataMarket directly into R
  • R datasets on truly random (but generally interesting) topics at reddit
  • The incredible R Graph Gallery, with source code
  • The ggplot2 R library [now in maintenance mode; being phased out in favor of ggvis]
  • The ggmap R library, which plots latitude/longitude data on maps
  • A good example of how to make a heatmap (not geographical) with R’s heatmap function
  • A useful example of how to make 3D maps with R’s persp function
  • Two tutorials (here and here) on combining maps with data
  • A tutorial on how to turn time series data into calendar heatmaps in R
  • “Large Datasets and You,” a primer on big data in R by Matthew Blackwell and Maya Sen

Why Dataviz Isn’t Enough