Subject Guides
- Binghamton University Libraries
Digital Scholarship
Data Visualization
Data Visualization is a way to communicate research and create arguments with your data. It is a broad term that sometimes overlaps with other digital scholarship concepts, such as text analysis. Simply put, it means taking organized data, usually (though not always) quantitative, and using tools to create a visual that helps describe it. This can include charts, graphs, maps, infographics, and network graphs. If you have ever used Excel to make a graph, you have done data visualization.
Depending on the tools you use and the platform you wish to share them on, these visuals can be static or interactive. Alongside a grasp of your data and what you hope to communicate, graphic design principles are important to consider when creating a visualization. You want the digital objects you create to be both easily understood and eye-catching, without the viewer being misled or overwhelmed. To see a wide range of examples of well-received data visualization, check out the Data Visualization Society's Information is Beautiful Winners.
Each tab here discusses different categories of visualizations and examples of tools.
An example of a static bar chart visualization created in Canva.
Charts, Graphs, and Infographics
Data visualization is all around us. Many of us already use it or at least interact with it daily. Many tools are available to create charts, graphs, and infographics. Below are three common tools with the pros and cons of using them.
Excel or Google Sheets
Excel and Google Sheets are among the most common tools for making graphs and charts, since the visualizations can be embedded alongside the data you are already using. If you choose to go this route, there is plenty of documentation available through both Microsoft and Google.
- Pros
- Most people are already familiar with and have access to Excel or Sheets
- Visualizations and data are in the same place
- Simple to create and update with available settings
- Visualizations can be updated as you update the data
- Cons
- Not many options
- No interactivity
- Not always visually appealing
Tableau
Tableau is an accessible way to make many different types of visualizations and offers ways to combine them into narrative-driven infographics. Tableau Public is free for anyone to download. There is also a free academic license available to students, instructors, and non-profit academic researchers, but note that it is processed manually and may take up to a couple of weeks. You will need to create an account for either option so you can share your visualizations through their public server (Tableau Public).
- Pros
- Once you understand the different features, the program just involves dragging and dropping your data in their interface.
- It accepts several types of data, including CSV, XLSX, JSON, PDF, and spatial data.
- Tableau allows for interactivity, including clicking on different data points to reveal information, zooming in and out of the visualization, and animating different parts to show change over time.
- Visualizations shared through the Tableau Public server can be linked and embedded in other sites.
- Story and Dashboard functions help to design interactive infographics with many design options.
- Cons
- If you use Tableau Public, your data will be made public when you create a shareable version of your visualization, so make sure there are no privacy issues.
- The academic license version does allow for local exports and more data connection options; however, exporting a visualization locally means giving up its interactive features.
- You cannot edit your data in Tableau, so make sure your data is clean and ready before uploading. Otherwise, you will need to keep updating and re-uploading it.
- There is not much analysis that can be done outside of creating the visualizations.
Canva
Canva is a user-friendly design tool that has some visualization options. It is a graphic design-driven service, not a data one, so it is best utilized when your graphs and charts are secondary to other features, such as text and images.
- Pros
- Very simple to use and entirely cloud-based, so you only need to create an account to get started.
- It relies heavily on templates and suggested design elements with drag-and-drop features and graphic design recommendations.
- It has built-in color palettes, graphics, and font recommendations to help you make cohesive-looking projects
- Charts and graphs can be dragged and dropped onto your existing design, meaning they can be added to many different document formats, such as social media posts, flyers, and posters.
- You can upload spreadsheet files to create your graphs and charts
- Easily exportable into many different document types
- Cons
- Very limited visualization options
- Many features are hidden behind the paywall
- Although you can create moving gif-like graphics that export as videos, there is no interactivity available for data visualizations.
Maps
Maps are one of the most popular forms of data visualization as they help us organize data by place. There are several different tools to help you create maps. Your choice of tool will depend on your data, the level of interactivity you want in your visualization, and how you wish to share your map. The Spatial Humanities Working Group workshops offer examples of how spatial data is used by the humanities on campus.
Choropleth population map of the Netherlands by nerdy.maps from Wikimedia Commons
Licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
Interactive Maps
Tableau is an accessible way to make interactive maps without knowledge of coding or GIS. Once you understand the different features, the program just involves dragging and dropping your data in their interface. It accepts several different types of data, but to make maps successfully, you must make sure that there is a column with locations (countries, states, longitude/latitude, etc.).
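Before loading a spreadsheet into a mapping tool, it can help to confirm that it actually has a usable location column. The following is a minimal Python sketch using only the standard library. The names in `LOCATION_COLUMNS` and the sample data are assumptions made for illustration; they are not a list of what Tableau recognizes.

```python
import csv
import io

# Hypothetical column names that might hold location data; adjust for your file.
LOCATION_COLUMNS = {"country", "state", "city", "latitude", "longitude"}

def find_location_columns(csv_text):
    """Return the header names that look like location columns (case-insensitive)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [name for name in header if name.strip().lower() in LOCATION_COLUMNS]

def rows_missing_values(csv_text, columns):
    """Return 1-based data row numbers where any of the given columns is blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row_number, row in enumerate(reader, start=1):
        if any(not (row.get(col) or "").strip() for col in columns):
            missing.append(row_number)
    return missing

# Made-up sample data for illustration only.
sample = """State,Population
New York,100
,250
Ohio,75
"""

location_cols = find_location_columns(sample)
print("Location columns:", location_cols)
print("Rows with blank locations:", rows_missing_values(sample, location_cols))
```

To check your own file, read its contents with `open("yourfile.csv", newline="").read()` and pass the text to the same functions.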
Tableau Public is free to everyone. There is also a free academic license available to students, instructors, and non-profit academic researchers, but note it is processed manually and may take up to a couple of weeks. You will need to create an account for both so you can share your maps through their public server (Tableau Public), so make sure the data you are using is not private. The academic license version does allow for local exports and more data connection options.
Maps shared through the Tableau Public server can be linked and embedded in other sites. They allow for interactivity, including clicking on different links to reveal information, moving around the map, and animating the maps to show change over time. To see examples of Tableau Public visualizations, you can browse their Viz of the Day archives.
GIS Maps
GIS (Geographic Information System) maps are built from spatial data stored in either vector or raster form.
- Vector data is made up of points, lines, and polygons
- Raster data is made up of pixels.
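To make the difference concrete, here is a small Python sketch of how the two kinds of data might be represented. The coordinates and cell values are made up, and real GIS files also carry extra information (such as a coordinate reference system and attributes) that is left out here.

```python
# Vector data: shapes described by coordinates, written here as (x, y) pairs.
point = (2.0, 3.0)                                            # a single location
line = [(0.0, 0.0), (2.0, 1.0), (4.0, 0.0)]                   # an ordered path
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]    # a closed area

def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]  # wrap around to close the shape
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Raster data: a grid of cells (pixels), each holding one value,
# for example an elevation or a land cover class.
raster = [
    [1, 1, 2],
    [1, 3, 2],
    [4, 3, 3],
]

def raster_mean(grid):
    """Average of all cell values in the grid."""
    cells = [value for row in grid for value in row]
    return sum(cells) / len(cells)

print("Polygon area:", polygon_area(polygon))
print("Mean raster value:", raster_mean(raster))
```

Vector shapes keep exact boundaries, while a raster stores a value for every cell, which is why the two suit different kinds of questions.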
Often the data needed to complete these maps is very granular and specific. You usually need several types of files to build these maps, including shapefiles. GeoDa Data and Lab has examples of sets of GIS data. Depending on your topic, there are many government agencies and public institutions that publish GIS data for public use, such as the United States Geological Survey and the National Park Service. Many other fields create, use, and disseminate this data, including urban planning, public health, environmental science, history, and business.
Tools
- The most popular product for GIS mapping is ArcGIS. The campus has the GIS and Remote Sensing Core Facility that works on projects using ArcGIS. As a member of the Binghamton University community, you have access to ArcGIS and other ESRI products. Information on how to set them up can be found on their Lab Resources page.
- An open-source and free alternative is QGIS.
Both ArcGIS and QGIS use specific terminology, which is helpful to know when trying to create your maps. A basic list can be found in this QGIS tutorial by Antony Scott.
Other Maps
To see more options for incorporating maps with text, image, and video content, see the section on StoryMaps.
Network Analysis
Network or social network analysis involves the evaluation of connections between entities in a network. Often it is focused on the way people connect. For example, a network visualization might show the authors in a field who have published together or the characters in a TV show who have scenes together.
Unlike other types of visualization that aim for readability, networks are often hard to read on a micro level but allow for a macro read of clusters and distance. They also allow for different measurements, including but not limited to:
- Number of nodes - points in the network
- Number of edges - connections between the nodes
- Density - how many ties between nodes exist compared to how many are possible
- Average path length - average number of steps between possible pairs of nodes
- Clustering - how nodes share commonalities
- Degree distribution - how the number of connections each node has is spread across the network
- Centrality
- Closeness - how close a node is to all of the other nodes
- Betweenness - the number of times a node lies on the shortest path between two other nodes
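Several of these measurements can be worked out by hand on a tiny network. The Python sketch below uses only the standard library and a made-up, undirected, connected network of four nodes. It is meant to illustrate the definitions, not to replace a dedicated tool. It skips clustering and betweenness, and it measures closeness as the number of other reachable nodes divided by the total steps to reach them, which is one common convention.

```python
from collections import deque

# Made-up undirected network: each pair is one edge between two nodes.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D")]

def build_adjacency(edge_list):
    """Map each node to the set of nodes it is directly connected to."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def shortest_path_lengths(adjacency, source):
    """Breadth-first search: number of steps from source to each reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

def network_summary(edge_list):
    adjacency = build_adjacency(edge_list)
    n = len(adjacency)
    m = len({frozenset(edge) for edge in edge_list})  # count each tie once
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    closeness = {}
    total_steps, pair_count = 0, 0
    for node in adjacency:
        distances = shortest_path_lengths(adjacency, node)
        steps = [d for target, d in distances.items() if target != node]
        total_steps += sum(steps)
        pair_count += len(steps)
        closeness[node] = len(steps) / sum(steps) if steps else 0.0
    return {
        "nodes": n,
        "edges": m,
        "density": density,
        "degree": degree,
        "average_path_length": total_steps / pair_count if pair_count else 0.0,
        "closeness": closeness,
    }

summary = network_summary(edges)
for name, value in summary.items():
    print(name, value)
```

In this sample, node B connects directly to every other node, so it has the highest degree and the highest closeness.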
Creating Networks
Gephi is free, open-source software for network analysis and visualization. You will need to download the program to your computer.
Before importing data into Gephi, make sure to transform your tables into social network analysis data. An easy way to do this is to upload your CSV file(s) to Table2Net. Choose the type of network (Normal has one type of node). Then select which column is the nodes column. Choose the columns that will create the links and build your network. A GEXF file will be created for use in Gephi.
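To see what this kind of transformation does, here is a minimal Python sketch that turns a table into a weighted edge list. It is not Table2Net's implementation and it writes CSV rather than GEXF. The sample table, the `authors` column, and the semicolon separator are all made up for illustration. The `Source`, `Target`, `Weight` header follows a common edge-list convention, so check Gephi's import documentation for what your version expects.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Made-up table: one row per paper, with authors separated by semicolons.
table = """paper,authors
P1,Ada; Ben; Cy
P2,Ada; Ben
P3,Cy; Dee
"""

def table_to_edges(csv_text, column, separator=";"):
    """Link every pair of values that share a row; weight = number of shared rows."""
    weights = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        names = sorted({n.strip() for n in row[column].split(separator) if n.strip()})
        for pair in combinations(names, 2):
            weights[pair] += 1
    return weights

def edges_to_csv(weights):
    """Write the edges as CSV text with a Source,Target,Weight header."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return out.getvalue()

edge_weights = table_to_edges(table, "authors")
print(edges_to_csv(edge_weights))
```

Here each author becomes a node, and two authors are linked whenever they appear on the same paper, with the weight counting how many papers they share.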
Once your data is in Gephi, you will be able to create visualizations, extract measurements, and export your network.
Graph representing the metadata of thousands of archive documents, documenting the social network of hundreds of League of Nations personals from Wikimedia Commons.
Grandjean, Martin (2014). "La connaissance est un réseau". Les Cahiers du Numérique 10 (3): 37-54. DOI:10.3166/LCN.10.3.37-54.
This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.
Coding
Python and R are the most popular options for using programming to create data visualizations. Most data visualization features come from third-party packages rather than the languages themselves. For Python, one of the most popular packages is matplotlib. For R, ggplot2 is popular.
The advantages of using programming to create visualizations include increased flexibility, transparency, and reproducibility in the creation process. Coding the visualization leaves clear instructions on how you created it. You can also perform different analyses alongside your visualizations in ways you cannot with other tools. However, coding does involve a greater time investment if you are working with the languages for the first time.
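As a taste of what this looks like, here is a minimal sketch of a static bar chart in Python with matplotlib. It assumes matplotlib is installed. The data, labels, and the `bar_chart.png` file name are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # draw to a file without needing a display window
import matplotlib.pyplot as plt

# Made-up data for illustration only.
categories = ["Books", "Articles", "Datasets"]
counts = [12, 30, 7]

def make_bar_chart(labels, values, output_path="bar_chart.png"):
    """Draw a labeled bar chart, save it as an image, and return the bars."""
    fig, ax = plt.subplots(figsize=(6, 4))
    bars = ax.bar(labels, values, color="steelblue")
    ax.set_title("Items by type (sample data)")
    ax.set_xlabel("Item type")
    ax.set_ylabel("Count")
    fig.tight_layout()
    fig.savefig(output_path, dpi=150)
    plt.close(fig)  # free the figure once it has been saved
    return bars

bars = make_bar_chart(categories, counts)
```

Because every step is written down, anyone can rerun the script to recreate the chart or swap in new data. If you are working in a notebook and want the chart to appear inline, you would likely remove the `matplotlib.use("Agg")` line and the `plt.close(fig)` call.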
To get started in Python
- There are many different programs you can download to work with Python, including Jupyter Notebooks, Visual Studio Code, and PyCharm.
- However, to get started without having to install or set anything up, you can use Google Colaboratory.
- Like a Google Doc, Colab documents can be easily shared and edited. You will be able to find notebooks that people have made for different topics. If you are using someone else's notebook, make sure to make a copy so you are not editing the original.
- Recommended tutorial: Once you open Colab, Google offers several tutorials on their Welcome page. Our Text Analysis product Constellate also offers a Data Visualization Tutorial using matplotlib (note: this is offered through Jupyter Notebooks and not Colab, although it is similar)
To get started in R
- Download R
- Download RStudio for an environment to code in
- Recommended tutorial: swirl - this is an interactive tutorial that runs inside R. There are several different lessons you can do at different levels, including ones on data analysis, regression models, and data visualization.
There are countless resources available for learning and using both Python and R. Part of the journey is finding the learning materials that work best for you, so don't be afraid to explore.