journal

wordflow - realtime data visualization of 2016 election activity

Check out a very early/alpha (buggy!) online project that I'd hoped to formally release before today (November 8, 2016: #Election2016):

wordFlow

wordFlow pulls text from real-time data feeds (such as @HillaryClinton and @realDonaldTrump) and displays it as it is published.

The data sources (e.g., CNN) are shown as circles, where the size of each circle is proportional to the number of words that source has published since I started the project (about a year ago?).

A data source has one or more "channels." For instance, CNN has a Twitter feed, a YouTube channel and a plain-old website. A channel is also displayed as a circle, and its size is proportional to the number of words published in only that particular channel. Each channel is "linked" to its source by a line.
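To make the source/channel relationship concrete, here is a minimal sketch in Python. It is not wordFlow's actual code: the class names, the word counts, and the choice to make circle area (rather than radius) track word count are all assumptions for illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str            # e.g. "twitter", "youtube", "website"
    word_count: int = 0  # words published on this channel only


@dataclass
class Source:
    name: str
    channels: list[Channel] = field(default_factory=list)

    @property
    def word_count(self) -> int:
        # A source's total is the sum over all of its channels.
        return sum(c.word_count for c in self.channels)


def radius(word_count: int, scale: float = 1.0) -> float:
    # Assumption: circle area is proportional to word count,
    # so the radius grows with the square root of the count.
    return scale * math.sqrt(word_count)


def links(source: Source) -> list[tuple[str, str]]:
    # One line per channel, joining the channel to its parent source.
    return [(source.name, c.name) for c in source.channels]


# Made-up word counts, purely for illustration.
cnn = Source("CNN", [Channel("twitter", 900),
                     Channel("youtube", 400),
                     Channel("website", 300)])
```

With these invented numbers, `cnn.word_count` is the sum of the three channels, and `links(cnn)` yields one (source, channel) pair per channel.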

As real-time data is published by a source via one of its channels, the text emerges from the channel circle, and its movement, font, font size, color and opacity are determined by various metrics, as well as the "physics" settings of the visualization.

Text font size is based on the number of times a word is used.
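One way this could work is sketched below in Python. The linear scale and the pixel range are assumptions; the post doesn't say which scale wordFlow uses, and `font_size` is a hypothetical helper.

```python
from collections import Counter


def font_size(count: int, max_count: int,
              min_px: float = 10.0, max_px: float = 48.0) -> float:
    # Linear interpolation between min_px and max_px.
    # The count is clamped to the range [1, max_count].
    if max_count <= 1:
        return min_px
    clamped = max(1, min(count, max_count))
    return min_px + (max_px - min_px) * (clamped - 1) / (max_count - 1)


# Count how often each word has appeared so far.
counts = Counter("vote early vote often vote".split())
most_used = max(counts.values())
sizes = {word: font_size(n, most_used) for word, n in counts.items()}
```

Here the most frequent word gets the largest size and words seen only once get the smallest. A logarithmic scale would be a reasonable alternative if a few words dominate the counts.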

Text color is based on keywords that I've selected and loosely categorized as:

  • "positive",
  • "negative",
  • "neutral",
  • "politically-left",
  • "politically-right"

For instance, the words "Hillary" and "Clinton" are in the "politically-left" category. At this time, the categorization is only based on my opinion, and will most likely change.
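A keyword lookup like this can be sketched in a few lines of Python. Only the "Hillary" and "Clinton" entries come from the description above; the other keywords, the color names, and the "neutral" fallback for unlisted words are placeholders I've assumed for illustration.

```python
# Keyword sets per category. Only the "politically-left" entries
# are taken from the post; the rest are hypothetical placeholders.
CATEGORY_KEYWORDS = {
    "positive": {"win"},
    "negative": {"lose"},
    "politically-left": {"hillary", "clinton"},
    "politically-right": set(),
}

# Hypothetical colors, one per category.
CATEGORY_COLORS = {
    "positive": "green",
    "negative": "orange",
    "neutral": "gray",
    "politically-left": "blue",
    "politically-right": "red",
}


def categorize(word: str) -> str:
    # Normalize case and strip surrounding punctuation before the lookup.
    key = word.strip(".,!?:;\"'").lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if key in keywords:
            return category
    # Assumption: anything not listed is treated as neutral.
    return "neutral"


def color_for(word: str) -> str:
    return CATEGORY_COLORS[categorize(word)]
```

Keeping the keyword lists in plain data like this makes it easy to revise the categories later without touching the lookup logic.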

The "physics" can be modified by adjusting the settings for:

  • "gravity"
  • "charge"
  • "velocity decay"
  • "link strength"
  • "link distance"
  • "age"

To adjust the physics settings, click on the "SHOW CONTROL" button that appears in the lower-left corner when the mouse is moved. Again, it's alpha software, so some controls may do nothing, and unpredictable results are likely!
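For a rough feel of what a couple of these settings do, here is a small Python sketch of one simulation step in one dimension. It is not wordFlow's implementation: the default values are placeholders, the meaning given to "age" is a guess, and only "gravity" and "velocity decay" are actually exercised.

```python
from dataclasses import dataclass


@dataclass
class PhysicsSettings:
    # All defaults are placeholder values, not wordFlow's real ones.
    gravity: float = 0.1         # pull toward the center
    charge: float = -30.0        # repulsion between nodes (unused here)
    velocity_decay: float = 0.4  # fraction of velocity lost per step
    link_strength: float = 0.5   # unused here
    link_distance: float = 30.0  # unused here
    age: float = 60.0            # assumed: seconds before text fades out


def step(position: float, velocity: float,
         settings: PhysicsSettings, center: float = 0.0) -> tuple[float, float]:
    # Gravity accelerates the node toward the center...
    velocity += (center - position) * settings.gravity
    # ...then velocity decay damps the motion...
    velocity *= 1.0 - settings.velocity_decay
    # ...and the node moves by its new velocity.
    return position + velocity, velocity


settings = PhysicsSettings()
position, velocity = step(100.0, 0.0, settings)
```

Raising `velocity_decay` toward 1 makes text settle quickly, while lowering it lets words drift for longer. Raising `gravity` pulls everything toward the center more strongly.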

Sources include: Twitter, YouTube closed captions, website RSS feeds, transcripts, and real-time CSPAN data streams.

Future plans include:

  • data analysis (e.g., histograms and social graphs) of the language and language trends used in the campaign
  • ability to replay archived data streams
  • automatic discovery of new data sources (e.g., Twitter followers of @HillaryClinton)
  • selective filtering based on criteria such as:
    • data source
    • word use frequency
    • phrases

Please check it out and let me know what you think, including any problems you run into when viewing the site.

Thanks!