<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vis4.net</title>
	<atom:link href="http://vis4.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://vis4.net</link>
	<description>Random thoughts on information visualization and data journalism by &#60;a href=&#34;http://driven-by-data.net&#34;&#62;Gregor Aisch&#60;/a&#62;</description>
	<lastBuildDate>Sun, 05 May 2013 23:34:31 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Start using databases, today!</title>
		<link>http://vis4.net/blog/posts/introducing-dataset/</link>
		<comments>http://vis4.net/blog/posts/introducing-dataset/#comments</comments>
		<pubDate>Tue, 23 Apr 2013 12:20:29 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3952</guid>
		<description><![CDATA[This post is written to welcome dataset, a new library to simplify working with databases in Python. Let&#8217;s face it. Relational databases, such as MySQL, SQLite and PostgreSQL, are pretty cool – but nobody actually uses them. At least not in the day-to-day work with small to medium scale datasets. But why is that? Why do we [...]]]></description>
				<content:encoded><![CDATA[<div style="font-style: italic; text-align: center; background: #f4eded; padding: 10px;">This post is written to welcome <a href="http://dataset.rtfd.org">dataset</a>, a new library to simplify working with databases in Python.</div>
<p>Let&#8217;s face it. Relational databases, such as MySQL, SQLite and PostgreSQL, are pretty cool – but nobody actually uses them. At least not in the day-to-day work with small to medium scale datasets. But <em>why</em> is that? Why do we see an awful lot of data stored in static files in CSV or JSON format, even though</p>
<ul>
<li>they are <strong>hard to query</strong> (you need to write a custom script every time)</li>
<li>they are <strong>messy</strong>, as they cannot store metadata such as data types</li>
<li>it is a <strong>pain to update</strong> them incrementally, say if some record has changed</li>
</ul>
<p><span id="more-3952"></span></p>
<h2>Programmers are lazy people</h2>
<p>The answer is that <em>programmers are lazy</em>, and thus they tend to prefer the easiest solution they find. And in most programming languages, a database isn’t the simplest solution for storing a bunch of structured data.</p>
<p>At least in Python, things really shouldn&#8217;t be this way.</p>
<p>So, say hello to <a href="http://dataset.rtfd.org">dataset</a>, a new Python library to simplify your everyday work with databases. In a nutshell, <strong>dataset</strong> makes managing databases as simple as reading and writing plain JSON files.</p>
<p>Here&#8217;s a brief list of the key features:</p>
<p><a href="http://dataset.readthedocs.org"><img class="alignleft" alt="" src="http://dataset.readthedocs.org/en/latest/_static/dataset-logo.png" width="200" height="233" /></a></p>
<ul>
<li><strong>Automatic schema</strong>: You never need to worry about the database schema again. If a table or column is written that does not exist in the database, it will be created automatically.</li>
<li><strong>Upserts</strong>: Very handy when running a scraper the second time: Records are either created or updated, depending on whether an existing version can be found.</li>
<li>There are some nice <strong>query helpers</strong> for simple queries such as <a title="dataset.Table.all" href="http://dataset.readthedocs.org/en/latest/api.html#dataset.Table.all"><tt>all</tt></a> rows in a table or all <a title="dataset.Table.distinct" href="http://dataset.readthedocs.org/en/latest/api.html#dataset.Table.distinct"><tt>distinct</tt></a> values across a set of columns.</li>
<li><strong>Compatibility</strong>: Being built on top of <a href="http://www.sqlalchemy.org/">SQLAlchemy</a>, <tt>dataset</tt> seamlessly works with all major databases, such as SQLite, PostgreSQL and MySQL.</li>
<li><strong>Scripted exports</strong>: Data can be exported based on a scripted configuration, making the process easy and replicable.</li>
</ul>
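<p>To make the upsert idea a bit more concrete, here is a minimal sketch in plain Python that uses only the standard-library <tt>sqlite3</tt> module. It is not how dataset is implemented, just an illustration of the update-or-insert logic; the <tt>cities</tt> table, its columns and the <tt>upsert()</tt> helper are made up for this example.</p>

```python
import sqlite3

def upsert(conn, table, row, key):
    """Update the record whose `key` column matches, or insert it if missing.

    Hypothetical helper for illustration only. Table and column names are
    interpolated into the SQL, so they must come from trusted code.
    """
    columns = list(row)
    assignments = ", ".join(f"{col} = ?" for col in columns)
    values = [row[col] for col in columns]
    cur = conn.execute(
        f"UPDATE {table} SET {assignments} WHERE {key} = ?",
        values + [row[key]],
    )
    if cur.rowcount == 0:
        # Nothing was updated, so the record does not exist yet.
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
            values,
        )

conn = sqlite3.connect(":memory:")
# Without a helper library, the schema has to be created by hand.
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")

# First scraper run: the record is created.
upsert(conn, "cities", {"name": "Springfield", "population": 1000}, key="name")
# Second run: the same record is updated instead of duplicated.
upsert(conn, "cities", {"name": "Springfield", "population": 1200}, key="name")

rows = conn.execute("SELECT name, population FROM cities").fetchall()
print(rows)
```

<p>Note how much ceremony is still left in this sketch: the table has to be declared up front and every statement is hand-written SQL. That is exactly the kind of work the features listed above are meant to take off your hands.</p>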
<p lang="python">I hope this comes in handy for some of you, since I cannot live without the library anymore. If you want to read more, check out the full documentation at <a href="http://dataset.readthedocs.org/en/latest/">dataset.readthedocs.org</a>.</p>
<p lang="python">Happy databasing!</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/introducing-dataset/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Playing with CIE Lab Colors in R</title>
		<link>http://vis4.net/blog/posts/playing-with-cie-lab-colors-in-r/</link>
		<comments>http://vis4.net/blog/posts/playing-with-cie-lab-colors-in-r/#comments</comments>
		<pubDate>Thu, 24 Jan 2013 13:13:25 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Color]]></category>
		<category><![CDATA[tutorials]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3888</guid>
		<description><![CDATA[Currently I&#8217;m taking the wonderful course Computing for Data Analysis on Coursera, and in this week&#8217;s lecture I learned how to define custom color palettes in R. You can do this using the colorRampPalette() function that comes with the grDevices package. Calling this function will return another function that you can call to generate [...]]]></description>
				<content:encoded><![CDATA[<p>Currently I&#8217;m taking the wonderful course <a href="https://class.coursera.org/compdata-002/class/index">Computing for Data Analysis</a> on <a href="https://coursera.org/">Coursera</a>, and in this week&#8217;s lecture I learned how to define custom color palettes in R.</p>
<p>You can do this using the <code>colorRampPalette()</code> function that comes with the <strong>grDevices</strong> package. Calling this function will return another function that you can call to generate the color palette.</p>
<p><span id="more-3888"></span></p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="rsplus"><pre class="de1"><span class="sy0">&gt;</span> <span class="kw2">library</span><span class="br0">&#40;</span>grDevices<span class="br0">&#41;</span>
<span class="sy0">&gt;</span> <span class="kw5">palette</span> <span class="sy0">=</span> <span class="kw5">colorRampPalette</span><span class="br0">&#40;</span><span class="kw2">c</span><span class="br0">&#40;</span><span class="st0">&quot;red&quot;</span>, <span class="st0">&quot;yellow&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="sy0">&gt;</span> <span class="kw5">palette</span><span class="br0">&#40;</span><span class="nu0">5</span><span class="br0">&#41;</span>
<span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span> <span class="st0">&quot;#FF0000&quot;</span> <span class="st0">&quot;#FF3F00&quot;</span> <span class="st0">&quot;#FF7F00&quot;</span> <span class="st0">&quot;#FFBF00&quot;</span> <span class="st0">&quot;#FFFF00&quot;</span></pre></div></div></div></div></div></div></div>


<p>By default, R uses the RGB color space to interpolate between colors. As I mentioned in earlier posts, this is not an ideal choice for data visualization. Fortunately, R allows you to interpolate in the CIE Lab color space. Let&#8217;s write a quick function <code>plotColors()</code> to plot the colors. The function takes two arguments: <em>palette</em> is the palette function returned by <code>colorRampPalette()</code> and <em>n</em> defines the number of color steps we want to display. Internally the function simply draws a bar chart without axes and with no spacing between the bars.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="rsplus"><pre class="de1">plotColors <span class="sy0">=</span> <span class="kw2">function</span><span class="br0">&#40;</span><span class="kw5">palette</span>, n<span class="sy0">=</span><span class="nu0">10</span><span class="br0">&#41;</span> <span class="br0">&#123;</span>
   <span class="kw5">colors</span> <span class="sy0">=</span> <span class="kw5">palette</span><span class="br0">&#40;</span>n<span class="br0">&#41;</span>
   <span class="kw4">barplot</span><span class="br0">&#40;</span><span class="kw2">rep</span><span class="br0">&#40;</span><span class="nu0">1</span>,n<span class="br0">&#41;</span>, <span class="kw2">col</span><span class="sy0">=</span><span class="kw5">colors</span>, border<span class="sy0">=</span><span class="kw5">colors</span>, space<span class="sy0">=</span><span class="nu0">0</span>, yaxt<span class="sy0">=</span><span class="st0">'n'</span><span class="br0">&#41;</span>
<span class="br0">&#125;</span></pre></div></div></div></div></div></div></div>


<p>Now let&#8217;s look at the default interpolation between red and yellow in five steps. On my screen the first two colors look almost the same, although they are equally-spaced in RGB.</p>
<p><img src="http://vis4.net/blog/wp-content/uploads/2013/01/screenshot-2013-01-24-um-13.38.47.png" alt="rgb interpolation between red and yellow" width="446" height="101" class="aligncenter size-full wp-image-3899" /></p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="rsplus"><pre class="de1"><span class="sy0">&gt;</span> pal <span class="sy0">=</span> <span class="kw5">colorRampPalette</span><span class="br0">&#40;</span><span class="kw2">c</span><span class="br0">&#40;</span><span class="st0">&quot;red&quot;</span>, <span class="st0">&quot;yellow&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="sy0">&gt;</span> plotColors<span class="br0">&#40;</span>pal, <span class="nu0">5</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>


<p>You can set the color space to CIE Lab by setting the parameter <em>space</em> to &#8220;Lab&#8221;. As you can see, the difference is huge. That&#8217;s not a question of taste.</p>
<p><img src="http://vis4.net/blog/wp-content/uploads/2013/01/screenshot-2013-01-24-um-13.57.03.png" alt="CIE Lab interpolation" width="448" height="106" class="aligncenter size-full wp-image-3902" /></p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="rsplus"><pre class="de1"><span class="sy0">&gt;</span> pal <span class="sy0">=</span> <span class="kw5">colorRampPalette</span><span class="br0">&#40;</span><span class="kw2">c</span><span class="br0">&#40;</span><span class="st0">&quot;red&quot;</span>, <span class="st0">&quot;yellow&quot;</span><span class="br0">&#41;</span>, space<span class="sy0">=</span><span class="st0">&quot;Lab&quot;</span><span class="br0">&#41;</span>
<span class="sy0">&gt;</span> plotColors<span class="br0">&#40;</span>pal, <span class="nu0">5</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>
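<p>If you are curious what the two interpolation modes do under the hood, here is a rough sketch in Python. It is not R&#8217;s actual implementation: it assumes sRGB input, a D65 white point and the commonly used four-digit conversion matrices, and all function names are made up for this example. Truncating the RGB values to integers reproduces the five hex codes listed at the top of this post.</p>

```python
DELTA = 6 / 29
# D65 reference white (an assumption of this sketch)
XN, YN, ZN = 0.95047, 1.0, 1.08883

def hex_to_rgb(code):
    return tuple(int(code[i:i + 2], 16) for i in (1, 3, 5))

def rgb_to_hex(rgb):
    return "#%02X%02X%02X" % tuple(rgb)

def lerp(a, b, t):
    """Linear interpolation between two tuples of equal length."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def rgb_to_lab(rgb):
    # undo the sRGB gamma curve, then go linear RGB -> XYZ -> Lab
    r, g, b = [((c / 255 + 0.055) / 1.055) ** 2.4 if c / 255 > 0.04045
               else c / 255 / 12.92 for c in rgb]
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    f = lambda t: t ** (1 / 3) if t > DELTA ** 3 else t / (3 * DELTA ** 2) + 4 / 29
    fx, fy, fz = f(x / XN), f(y / YN), f(z / ZN)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def lab_to_rgb(lab):
    l, a, b = lab
    fy = (l + 16) / 116
    fx, fz = fy + a / 500, fy - b / 200
    finv = lambda t: t ** 3 if t > DELTA else 3 * DELTA ** 2 * (t - 4 / 29)
    x, y, z = XN * finv(fx), YN * finv(fy), ZN * finv(fz)
    lin = (3.2406 * x - 1.5372 * y - 0.4986 * z,
           -0.9689 * x + 1.8758 * y + 0.0415 * z,
           0.0557 * x - 0.2040 * y + 1.0570 * z)
    out = []
    for c in lin:
        c = min(1.0, max(0.0, c))   # clip colors outside the sRGB gamut
        c = 1.055 * c ** (1 / 2.4) - 0.055 if c > 0.0031308 else 12.92 * c
        out.append(int(round(c * 255)))
    return tuple(out)

def ramp_rgb(start, end, n):
    """n colors, interpolated channel by channel in RGB."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    steps = [lerp(a, b, i / (n - 1)) for i in range(n)]
    return [rgb_to_hex(int(c) for c in s) for s in steps]

def ramp_lab(start, end, n):
    """n colors, interpolated in CIE Lab and converted back to RGB."""
    a, b = rgb_to_lab(hex_to_rgb(start)), rgb_to_lab(hex_to_rgb(end))
    return [rgb_to_hex(lab_to_rgb(lerp(a, b, i / (n - 1)))) for i in range(n)]

print(ramp_rgb("#FF0000", "#FFFF00", 5))
print(ramp_lab("#FF0000", "#FFFF00", 5))
```

<p>Both ramps share the same end points, but the steps in between differ because Lab distances are meant to follow perceived differences rather than raw channel values.</p>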


<p>Feel free to play around with these code snippets yourself. Instead of the named colors you can also create palettes between arbitrary colors by providing hexadecimal codes:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="rsplus"><pre class="de1"><span class="sy0">&gt;</span> pal <span class="sy0">=</span> <span class="kw5">colorRampPalette</span><span class="br0">&#40;</span><span class="kw2">c</span><span class="br0">&#40;</span><span class="st0">&quot;#FF0000&quot;</span>, <span class="st0">&quot;#0000FF&quot;</span><span class="br0">&#41;</span>, space<span class="sy0">=</span><span class="st0">&quot;Lab&quot;</span><span class="br0">&#41;</span>
<span class="sy0">&gt;</span> plotColors<span class="br0">&#40;</span>pal, <span class="nu0">5</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>


<p>You can find the <a href="https://gist.github.com/4621398">full code for this example on GitHub</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/playing-with-cie-lab-colors-in-r/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Forget About Parties, Visualize the Coalitions!</title>
		<link>http://vis4.net/blog/posts/forget-about-parties-show-the-coalitions/</link>
		<comments>http://vis4.net/blog/posts/forget-about-parties-show-the-coalitions/#comments</comments>
		<pubDate>Mon, 21 Jan 2013 11:30:42 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Advice]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3870</guid>
		<description><![CDATA[If I was asked for the golden rule of information visualization, it would be: “Show the most important thing first!” Not second or third, but first! And what is the most important thing to show about the outcome of an election? Who actually won. In political systems like Germany’s, where we have no party getting [...]]]></description>
				<content:encoded><![CDATA[<p>If I were asked for the <em>golden rule of information visualization</em>, it would be:</p>
<p dir="ltr"><strong>“Show the most important thing first!”</strong></p>
<p>Not second or third, but first! And what is the most important thing to show about the outcome of an election? Who actually won.</p>
<p>In political systems like Germany’s, where we have no party getting anywhere near 50% of the vote, the usual one-bar-per-party bar charts totally fail to answer this most important question.</p>
<p>For example, in the following chart we can see the number of seats won by different political parties &#8211; but this does not tell us who won the election.<br />
<span id="more-3870"></span><br />
<img class="aligncenter size-medium wp-image-3871" alt="wahlergebnis-balken" src="http://vis4.net/blog/wp-content/uploads/2013/01/wahlergebnis-balken-522x376.png" width="522" height="376" /></p>
<p>This is because the elections are not won by the party with the most votes, but by the party that manages to get a majority of seats in parliament. And, with the exception of Bavaria and Hamburg<sup class='footnote'><a href='#fn-3870-1' id='fnref-3870-1'>1</a></sup>, in Germany there&#8217;s no way to take government without forming a coalition.</p>
<p>The bar chart above makes it tremendously difficult for readers to figure out which coalition has won. To do so, one must calculate the total number of seats for each coalition and compare it to the number of seats needed for a majority (which itself is more than half of the sum of all parties&#8217; seats).</p>
<p>Humans aren&#8217;t particularly good at calculating and weighing up these different possibilities on the fly. That&#8217;s why most election reporting websites show an additional coalition view. But where is it? Right &#8211; often it is the last thing that they show, such as in <a href="http://www.zeit.de/politik/deutschland/2013-01/Niedersachsen-wahl-situation">this recent example</a> from Zeit Online:</p>
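<p>The arithmetic a reader would have to do in their head can be sketched in a few lines of Python. The party names and seat counts below are invented for illustration and are not actual election results.</p>

```python
from itertools import combinations

# Hypothetical seat counts, not actual election results.
seats = {"A": 50, "B": 45, "C": 25, "D": 15}

total = sum(seats.values())
majority = total // 2 + 1   # more than half of all seats

def coalitions(seats, max_size=2):
    """Yield (parties, seat_total) for every combination of 1..max_size parties."""
    for size in range(1, max_size + 1):
        for parties in combinations(sorted(seats), size):
            yield parties, sum(seats[p] for p in parties)

all_coalitions = dict(coalitions(seats))
winning = [p for p, s in all_coalitions.items() if s >= majority]

print("seats needed:", majority)
print("winning coalitions:", winning)
```

<p>Even with only four parties there are ten single-party and two-party options to check, which is a lot to ask of someone glancing at a bar chart.</p>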
<p><img class="aligncenter size-medium wp-image-3872" alt="mgl-koalitionen" src="http://vis4.net/blog/wp-content/uploads/2013/01/mgl-koalitionen-522x302.png" width="522" height="302" /></p>
<h2>How coalitions have been visualized in the past</h2>
<p>In past elections in Germany coalitions have been visualized in two different ways: either as a simple horizontal bar chart or as an interactive coalition calculator.</p>
<p><img class="aligncenter size-medium wp-image-3874" alt="mgl-koalitionen-2" src="http://vis4.net/blog/wp-content/uploads/2013/01/mgl-koalitionen-2-522x301.png" width="522" height="301" /></p>
<p>The simple bar chart as seen above usually shows a limited selection of two or three coalitions having a majority. One problem is that sometimes it would be interesting to compare those coalitions with other possible, but politically unlikely coalitions (such as CDU-GREENs in Germany).</p>
<p>The second problem is that excluding the coalitions that fail to have a majority eliminates valuable context information.</p>
<h2>Do it yourself: the coalition calculator</h2>
<p>An alternative approach is the coalition calculator. The main idea is to let the users select parties and show them whether or not a coalition of them would have a majority.</p>
<p>&nbsp;</p>
<p style="text-align: center;"><img class="aligncenter size-medium wp-image-3875" alt="ltwnds-spon-koalitionsrechner" src="http://vis4.net/blog/wp-content/uploads/2013/01/ltwnds-spon-koalitionsrechner-522x263.png" width="522" height="263" /></p>
<p>&nbsp;</p>
<p>However, this demands some effort from the users, who might be checking back several times during election nights. Also the calculator only shows one or two coalitions at a time, so it&#8217;s hard to actually compare different coalitions.</p>
<h2>The new approach: extended coalition charts</h2>
<p>Actually this isn&#8217;t really groundbreaking, but for some reason nobody has ever visualized elections this way. The idea is to show as many coalitions as possible side by side, including the politically unlikely ones and those that fall short of a majority.</p>
<p>To visually separate the winning coalitions from the rest I finally decided to simply pull them apart. Since I need some more space I went for vertical bars instead.</p>
<p><a href="http://nds2013.vis4.net/koalitionen/#activate"><img class="aligncenter size-medium wp-image-3876" alt="koalitionen-940" src="http://vis4.net/blog/wp-content/uploads/2013/01/koalitionen-940-522x264.png" width="522" height="264" /></a></p>
<p>There&#8217;s a nice side effect of showing all coalitions: When new preliminary results are coming in during election nights, the visualization doesn&#8217;t show an entirely different picture, but some coalitions simply &#8216;changed sides&#8217;.</p>
<p>Since the actual total number of seats depends on the election results, I decided to label the coalitions&#8217; seats with relative numbers. This means that instead of saying <em>coalition X has 70 seats</em> we say <em>3 seats are missing for majority</em>.</p>
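<p>As a rough sketch, such a relative label could be derived like this in Python. The function name, the wording of the labels and the numbers in the example calls are assumptions made for illustration.</p>

```python
def relative_label(coalition_seats, total_seats):
    """Describe a coalition by its distance to the majority threshold."""
    majority = total_seats // 2 + 1   # more than half of all seats
    diff = coalition_seats - majority
    if diff == 0:
        return "exactly the majority"
    noun = "seat" if abs(diff) == 1 else "seats"
    if diff > 0:
        return f"{diff} {noun} above majority"
    return f"{-diff} {noun} missing for majority"

# Hypothetical parliament with 135 seats in total
print(relative_label(65, 135))
print(relative_label(70, 135))
```

<p>The label stays meaningful even when the size of the parliament changes between preliminary results, because it never mentions the absolute threshold.</p>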
<h2>Coalition maps</h2>
<p>It was a small step from here to <strong>coalition maps</strong>. A coalition map shows in which election districts a coalition holds the majority of votes. To indicate the coalition I decided to go for diagonal stripes, although I don&#8217;t recommend looking at them for too long :)</p>
<p><a href="http://nds2013.vis4.net/koalitionskarten"><img class="aligncenter size-medium wp-image-3877" alt="koalitionskarten1" src="http://vis4.net/blog/wp-content/uploads/2013/01/koalitionskarten1-522x477.png" width="522" height="477" /></a></p>
<p>What is really interesting is to compare the coalition maps with each other. For instance, comparing the two most preferred coalitions reveals a clear divide within the state of Lower Saxony.</p>
<p><a href="http://vis4.net/blog/wp-content/uploads/2013/01/koalitionskarten-2.png"><img class="aligncenter size-medium wp-image-3878" alt="koalitionskarten-2" src="http://vis4.net/blog/wp-content/uploads/2013/01/koalitionskarten-2-522x227.png" width="522" height="227" /></a></p>
<p>&nbsp;</p>
<p>Note: this post is a translation of <a href="http://visualisiert.net/artikel/vergesst-die-parteien-visualisiert-die-koalitionen/">this one</a>.</p>
<div class='footnotes'>
<div class='footnotedivider'></div>
<ol>
<li id='fn-3870-1'>Thanks to <a href="https://twitter.com/giereow">@giereow</a> for correcting me <span class='footnotereverse'><a href='#fnref-3870-1'>&#8617;</a></span></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/forget-about-parties-show-the-coalitions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Map Symbol Clustering — k-Means vs. noverlap</title>
		<link>http://vis4.net/blog/posts/map-symbol-clustering-k-means-vs-noverlap/</link>
		<comments>http://vis4.net/blog/posts/map-symbol-clustering-k-means-vs-noverlap/#comments</comments>
		<pubDate>Wed, 05 Dec 2012 11:00:07 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Cartography]]></category>
		<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3848</guid>
		<description><![CDATA[While working on the soon-to-be-released map widget for Piwik (heck, it&#8217;s been over two years since the first sketches!) I implemented two map symbol clustering algorithms into Kartograph.js. Last year I wrote about why this is a good idea, and now I turned that advice into re-usable code. In this post I want to share my findings [...]]]></description>
				<content:encoded><![CDATA[<p>While working on the soon-to-be-released <a href="/blog/posts/piwik-maps-2/">map widget</a> for <a href="http://piwik.org">Piwik</a> (heck, it&#8217;s been over two years since the first sketches!) I implemented two map symbol clustering algorithms into <a href="http://kartograph.org">Kartograph.js</a>. Last year <a href="http://vis4.net/blog/posts/clean-your-symbol-maps/">I wrote about</a> why this is a good idea, and now I turned that advice into re-usable code.</p>
<p>In this post I want to share my findings after experimenting with different clustering techniques.<br />
<span id="more-3848"></span></p>
<h2><em>k</em>-Means</h2>
<p>Inspired by an <a href="http://polymaps.org/ex/cluster.html">example on the Polymaps website</a> the first thing I tried was <a href="https://en.wikipedia.org/wiki/K-means_clustering"><em>k</em>-Means clustering</a>. The code provided with the demo worked really well and was easy to integrate with the symbol API in Kartograph. The only parameter k-Means needs is the desired number of clusters.</p>
<p>Below you can see an interactive demo of k-Means clustering. You can change the number of clusters using the slider.</p>
<p><iframe src="http://vis4.net/labs/clustering/kmeans.html" width="522" height="400" frameborder="0"></iframe></p>
<p>The main problem with k-Means is that it doesn&#8217;t fix the overlapping symbols. However, since it reduces the number of displayed symbols it does improve the readability of the map. The tricky part is to find the ideal number of clusters. The fewer clusters we use, the more detail we lose in the less &#8220;populated&#8221; places.</p>
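<p>For reference, the core of k-Means fits in a few lines. The following Python sketch uses Lloyd&#8217;s algorithm on plain (x, y) tuples; it is neither the Polymaps demo code nor the Kartograph.js implementation, and the function signature is made up for this example.</p>

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2D points into k groups with Lloyd's algorithm.

    Returns a list of (centroid, members) pairs.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k random input points
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid
            nearest = min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                                 + (p[1] - centroids[i][1]) ** 2)
            groups[nearest].append(p)
        for i, members in enumerate(groups):
            if members:   # keep the old centroid if a cluster ran empty
                centroids[i] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return list(zip(centroids, groups))

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = kmeans(points, 2)
for centroid, members in clusters:
    print(centroid, members)
```

<p>Note that nothing in this procedure knows about symbol radii, which is why the resulting cluster symbols can still overlap.</p>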
<p>Instead of optimizing this, I decided to go back to my original idea, which is to simply cluster the overlapping symbols.</p>
<h2>noverlap</h2>
<p>The technique is described in <a href="http://vis4.net/blog/posts/clean-your-symbol-maps/">this post</a>. It takes two parameters. The <em>tolerance</em> controls how much overlap is accepted: a value of 0.1 means we tolerate 10% overlap between adjacent symbols. The parameter <em>maxRatio</em> lets you prevent similarly sized symbols from being grouped: a value of 0.8 means that no symbols are grouped if the radius of the smaller symbol is larger than 80% of the radius of the larger symbol.</p>
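<p>To illustrate how the two parameters could interact, here is a small Python sketch of the pairwise test. It is not the Kartograph.js code: the function name is invented, and in particular the reading of &#8220;10% overlap&#8221; as &#8220;the centers may be up to 10% closer than the sum of both radii&#8221; is an assumption of this sketch.</p>

```python
from math import hypot

def should_group(a, b, tolerance=0.1, max_ratio=0.8):
    """Decide whether two circular symbols, given as (x, y, r), should be merged.

    Assumption for this sketch: the tolerance allows the centers to be up to
    `tolerance` (as a fraction) closer than the sum of both radii.
    """
    (ax, ay, ar), (bx, by, br) = a, b
    distance = hypot(ax - bx, ay - by)
    overlaps = (ar + br) * (1 - tolerance) > distance
    similar = min(ar, br) > max_ratio * max(ar, br)
    return overlaps and not similar

print(should_group((0, 0, 10), (8, 0, 3)))      # small symbol sits inside a big one
print(should_group((0, 0, 10), (8, 0, 10)))     # equally sized symbols are kept apart
print(should_group((0, 0, 10), (12.5, 0, 3)))   # slight overlap within the tolerance
```

<p>A full implementation would repeat this test over neighboring symbols and replace each grouped pair by a single symbol, but how exactly positions and radii are combined is beyond this sketch.</p>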
<p><iframe src="http://vis4.net/labs/clustering/noverlap.html" width="522" height="400" frameborder="0"></iframe></p>
<p>The resulting clustering looks much better to me. What&#8217;s nice about it is that this algorithm doesn&#8217;t affect non-overlapping symbols, so we don&#8217;t lose details outside the big cities. The name of this technique is inspired by a <a href="https://gephi.org/plugins/noverlap/">Gephi plugin</a>.</p>
<p><em>Update:</em> I just wanted to note that for the noverlap clustering the maximum radius of the symbols is a crucial parameter as well. The larger the symbols are, the more of their neighborhood they &#8220;occupy&#8221;. Therefore I added a third slider to both demos.</p>
<h2>How to use it</h2>
<p>You can see a larger comparison of both techniques side-by-side in <a href="http://kartograph.org/showcase/clustering/">this example</a>. There you will also find instructions on how to use the clustering in your own maps.</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/map-symbol-clustering-k-means-vs-noverlap/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Making Pictographs? Choose Your Icons Wisely!</title>
		<link>http://vis4.net/blog/posts/choose-your-icons-wisely/</link>
		<comments>http://vis4.net/blog/posts/choose-your-icons-wisely/#comments</comments>
		<pubDate>Wed, 24 Oct 2012 10:43:41 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Advice]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3778</guid>
		<description><![CDATA[Icons are widely used in infographics such as maps and pictographs. So as a visualization designer, you&#8217;ll get to the point where you must choose which icons or pictograms to use. But please, choose wisely. The reason I got to this topic is a recent post by Naomi Robbins about two opinions on the usefulness [...]]]></description>
				<content:encoded><![CDATA[<p>Icons are widely used in infographics such as maps and pictographs. So as a visualization designer, you&#8217;ll get to the point where you must choose which icons or pictograms to use. But please, <a href="http://www.youtube.com/watch?v=oF2UrYSDb3k">choose wisely</a>.</p>
<p>The reason I got to this topic is a recent post by Naomi Robbins about <a href="http://www.forbes.com/sites/naomirobbins/2012/10/10/two-opinions-on-the-usefulness-of-pictographs/">two opinions on the usefulness of pictographs</a>. She reminded me of a critical article by Stephen Few, who stated that unit charts (another term for pictographs) <a href="http://www.perceptualedge.com/articles/visual_business_intelligence/unit_charts_are_for_kids.pdf">are for kids</a>, but not for serious information displays. My biggest complaint about his article is that he picked some of the worst imaginable examples to back up his arguments.</p>
<p><span id="more-3778"></span></p>
<p>Let&#8217;s take a closer look at two of them:</p>
<p><img class="aligncenter size-full wp-image-3824" title="self-reported drinking" src="http://vis4.net/blog/wp-content/uploads/2012/10/self-reported-dringing.png" alt="" width="391" height="212" /></p>
<p>In the first chart, each circle represents one percent of the students who were asked about their drinking behaviour. The color indicates which category the student belongs to. In another example, each stick figure represents one million Americans, and the color shows whether these people would have health insurance.</p>
<p><img class="aligncenter" title="example" src="http://vis4.net/blog/wp-content/uploads/2012/10/example.png" alt="" width="439" height="300" /></p>
<p>Now, both examples are really terrible infographics for many, many reasons, but for now let&#8217;s focus on the quality of the pictograms. In both cases, the authors picked icons that hardly represent their data. The whole point of using pictograms is to allow the reader to understand the subject <em>without</em> reading the legend or the accompanying text. This is what enables kids and people from different cultures or educational backgrounds to read the chart; this is what makes pictograms universal.</p>
<p>So if you want to represent drunk students as icons, why not choose an icon of a drunk student instead of a meaningless circle? And instead of neutral stick figures (which, by the way, could easily be mistaken for Republicans and Democrats due to the color choice), one could use icons that represent people with and without health insurance.</p>
<p>As you probably know, there are plenty of resources where you can find icons (and I&#8217;ll add a list at the end of this post), but what if you can&#8217;t find a drunken student?</p>
<h2>Can&#8217;t find the right icon? Ask a graphic designer!</h2>
<p>That was exactly the problem Otto Neurath had in 1926, when he planned to create a new system for representing statistics using pictograms: <a href="http://en.wikipedia.org/wiki/Isotype_%28picture_language%29">ISOTYPE</a>. And do you know what he did? He looked for a talented graphic designer. At an exhibition in Düsseldorf, he found Gerd Arntz, who was showing woodcuts such as <a href="http://www.gerdarntz.org/content/1925">Metropa</a>, and asked him the legendary question: &#8216;<em>How much do you cost?</em>&#8217;. They ended up working together for the next 20 years. During that period, Arntz drew an impressive number of icons that influence graphic designers to this day.</p>
<p><a href="http://www.gerdarntz.org/isotype/people"><img class="aligncenter" title="arntz-icons" src="http://vis4.net/blog/wp-content/uploads/2012/10/arntz-icons-522x368.png" alt="" width="522" height="368" /></a></p>
<p>So seriously, if you want to represent data using pictograms/icons, you want to work with graphic designers. Unlike Neurath, you <a href="http://thenounproject.com">don&#8217;t need to start from scratch</a>, but chances are that you&#8217;ll need new or at least slightly modified pictograms.</p>
<h2>Pay attention to the arrangement</h2>
<p>Besides the choice of icons, you need to pay attention to how you actually arrange them. For instance, in the student-drinking example shown above, the circles are arranged in a grid of 10 times 10. In the insurance example the arrangement is even worse, as it shows 12 icons per row. In order to find out the exact numbers, you have no other option but to count the icons one by one. That&#8217;s why both authors finally decided to write down the numbers (which is, by the way, the perfect indicator of a poor pictograph).</p>
<p>Instead, what you can do is to group the icons in a meaningful way. In the following chart by Neurath, each icon represents 25,000 unemployed in Berlin. They are arranged into groups of four which makes it easier to decode the numbers. In an earlier version they used one icon to represent 10,000 unemployed, but this ended up with too many icons, so they went for 25,000.</p>
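<p>The grouping idea is easy to try out, even in plain text. Here is a tiny Python sketch that turns a value into a row of icons bundled into groups of four. The function name and the input values are invented for illustration and are not Neurath&#8217;s figures; only the unit of 25,000 and the group size of four are taken from the chart described above.</p>

```python
def pictograph_row(value, unit=25000, group=4, icon="i"):
    """Render one row of a unit chart as text, with icons bundled into groups.

    `icon` is expected to be a single character.
    """
    count = round(value / unit)   # number of whole icons to draw
    icons = icon * count
    groups = [icons[i:i + group] for i in range(0, count, group)]
    return " ".join(groups)

# Hypothetical values, one row per year
print(pictograph_row(300000))
print(pictograph_row(150000))
```

<p>With the gaps in place, twelve icons read as &#8220;three groups of four&#8221; at a glance, so nobody has to count them one by one.</p>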
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/10/unemployed-neurath-remake1.png"><img class="aligncenter" title="unemployed-neurath-remake" src="http://vis4.net/blog/wp-content/uploads/2012/10/unemployed-neurath-remake1-484x600.png" alt="" width="484" height="600" /></a></p>
<p>By the way, have you noticed that the unemployed are facing to the right, which makes them look as if they were standing in a queue? That kind of graphical storytelling is the unique power of pictographs.</p>
<h2>Let the pictograms tell the story</h2>
<p>Let&#8217;s have a look at one of my favorite Isotype charts by Neurath and Arntz. It shows statistics about soldiers in the First World War. Each figure represents 1 million soldiers, grouped into the ones returning home, the wounded and the killed. The icons are facing opposite directions, which makes clear that they are opponents returning home after the war (or not, in the case of the killed).</p>
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/10/isotype-great-war1.png"><img class="aligncenter size-medium wp-image-3800" title="isotype-great-war" src="http://vis4.net/blog/wp-content/uploads/2012/10/isotype-great-war1-522x353.png" alt="" width="522" height="353" /></a></p>
<p>Now take a closer look at the pictograms. Have you noticed that Neurath used different icons for the two sides, reflecting the different clothing and habits of the armies? These are the little details that make the difference!</p>
<p><img class="aligncenter size-full wp-image-3819" title="soldiers" src="http://vis4.net/blog/wp-content/uploads/2012/10/soldiers.png" alt="" width="400" height="153" /></p>
<p>&nbsp;</p>
<p>Note that the arrangement of the groups was an important design choice, too. As shown in the following image, the chart would have been less readable if the authors had used the straightforward arrangement (left) instead of the &#8216;squarified&#8217; version (right). You don&#8217;t really need to &#8216;count&#8217; the gravestones if they&#8217;re shown in two rows of four.</p>
<p>&nbsp;</p>
<p><img class="aligncenter size-full wp-image-3821" title="arrangement-matters" src="http://vis4.net/blog/wp-content/uploads/2012/10/arrangement-matters.png" alt="" width="520" height="246" /></p>
<h2>The choice depends on your goals</h2>
<p>It&#8217;s not hard to guess what Stephen Few would recommend for visualizing the WW1 soldiers: a bar chart. So let&#8217;s compare the Isotype chart to a bar chart. The bar chart tells a different story and represents the data in a more neutral way. Yes, it makes it easier to actually compare the killed and wounded for each side. But I&#8217;d say that it&#8217;s a lot harder to recall any of the numbers the day after seeing the chart.</p>
<p><img class="aligncenter" title="great-war-bar-chart" src="http://vis4.net/blog/wp-content/uploads/2012/10/great-war-bar-chart.png" alt="" width="511" height="332" /></p>
<p>So, of course the choice of visualization technique largely depends on the specific goals you&#8217;re trying to achieve with the chart. As <a href="http://complexdiagrams.com/2011/09/new-book-designing-data-visualizations/">explained in depth</a> by visualization expert Noah Iliinsky, we need to differentiate at least between exploration and explanation, or in other words between analysing data and communicating data. One is needed for business intelligence, which is where Stephen Few comes from, the other is needed for education, which is what Otto Neurath dedicated his life to.</p>
<p>And here&#8217;s one thing we always need to keep in mind: if you want to criticise a visualization technique in general, you need to argue against the best-executed examples you can find. You can&#8217;t blame the bar chart for <a href="http://24.media.tumblr.com/tumblr_m83kqrwxS81qa0uujo1_500.jpg">Fox News</a>.</p>
<h2>Further reading and icon resources</h2>
<ul>
<li><a href="http://www.amazon.com/Gerd-Arntz-Designer-Max-Bruinsma/dp/9064507635">Gerd Arntz Graphic Designer</a>: A highly recommended book about the work of Gerd Arntz, including lots of his woodcuts, pictograms and Isotype charts.</li>
<li><a href="http://www.amazon.com/Transformer-Principles-Making-Isotype-Charts/dp/0907259405">The Transformer: Principles of Making Isotype Charts</a>: A small book with more details and first-hand explanations of the process of transforming numbers into Isotype charts, written by Neurath&#8217;s wife Marie Neurath.</li>
<li><a href="http://thenounproject.com/">The Noun Project</a>: A website that collects free pictograms. In a way this project is dedicated to continuing the work of Neurath and Arntz, which is to develop a visual language that can be understood around the world. The Noun Project goes beyond simply collecting icons by organizing decentralized Iconathons, events where people come together to draw icons about specific topics. A great resource!</li>
<li><a href="http://iconmonstr.com/">Iconmonstr</a>: Another pictogram website, very similar to The Noun Project.</li>
<li>And of course, please check out the <a href="http://gerdarntz.com">Gerd Arntz website</a>, which shows many of his pictograms and additional resources about his work.</li>
</ul>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/choose-your-icons-wisely/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Let&#8217;s Break some Rules</title>
		<link>http://vis4.net/blog/posts/lets-break-the-rules/</link>
		<comments>http://vis4.net/blog/posts/lets-break-the-rules/#comments</comments>
		<pubDate>Fri, 13 Jul 2012 12:58:23 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Advice]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3709</guid>
		<description><![CDATA[A few weeks ago, while I was driving for several hours to our camping holiday, I suddenly noticed this beautiful piece of data visualization right in front of me. Actually, I found it so beautiful that I had to remake it in Illustrator: I was completely stunned by the clean and simple layout of this gauge [...]]]></description>
				<content:encoded><![CDATA[<p>A few weeks ago, while I was driving for several hours to our camping holiday, I suddenly noticed this beautiful piece of data visualization right in front of me. Actually, I found it so beautiful that I had to remake it in Illustrator:</p>
<p><img class="aligncenter size-full wp-image-3717" title="tacho" src="http://vis4.net/blog/wp-content/uploads/2012/07/tacho2.png" alt="" width="335" height="300" /></p>
<p>I was completely stunned by the clean and simple layout of this <a href="https://developers.google.com/chart/interactive/docs/gallery/gauge">gauge chart</a>. It shows everything I, as the driver, need to know: how fast am I driving at the moment, how far have I driven in total, and when is it time to get a new car.</p>
<p>But then I realized that this tiny chart, despite its useful, intelligent design, violates some of the common rules of data visualization, so I thought it would be a good idea to write about it.</p>
<p><span id="more-3709"></span></p>
<h2>Why use a radial gauge chart at all?</h2>
<p>Let&#8217;s face it, the current driving speed is not periodic data, so there should be no reason to use a radial layout. As we all know, humans are not as good at comparing angles as we are at comparing lengths. So why shouldn&#8217;t we use bar charts for displaying the speed?</p>
<p>One obvious answer is that it makes a lot of sense to re-use established metaphors. Everyone who drives a car knows how to read the driving speed on a gauge chart, and probably nobody ever needs to be taught how to read one. Another possible answer is that we connect certain semantics to the different regions of the radial gauge layout. No need to tint the last quarter of the gauge in red, everybody knows that this is the <em>danger zone</em>. We don&#8217;t have similar embedded semantics in a bar chart.</p>
<h2>Not more than three data points</h2>
<p>According to Jorge Camões&#8217;s <a href="http://www.excelcharts.com/blog/thats-data-visualization/">recently published rules of data visualization</a>, the speedometer I fell in love with isn&#8217;t data visualization at all, because it doesn&#8217;t show more than three data points. That&#8217;s true: there are only three data points I can read in the display, and if we subtract the LCD that was added at the bottom, there&#8217;s only one number left. One single number? Do we really need a big gauge display for one number? Of course we do, because a chart never shows just a single number. More importantly, a chart puts that number into context. If you ever had to drive a car with a <a title="Dashboard of the 1998–2000 Renault Twingo I" href="http://upload.wikimedia.org/wikipedia/commons/thumb/3/34/Renault_twingo_interior.jpg/800px-Renault_twingo_interior.jpg">dashboard like the Renault Twingo I</a>, you&#8217;ll never forget that difference.</p>
<p>For instance, in the gauge chart there are two red tick marks for important speed limits. Sure, I know those limits from driving school, but I find it really helpful to see how my current speed compares to them. The other context is the capability of my car. If I were driving at 200 km/h (which I never do), I&#8217;d instantly see that I had almost reached the limit of my car.</p>
<h2>Non-continuous scale</h2>
<p>But now it gets even worse. As you might have noticed, the speed scale breaks at 120 km/h, which makes comparing angles even harder. Without any doubt, that cannot be good data visualization! For comparison, here&#8217;s a different version of the same chart using a continuous scale:</p>
<p><img class="aligncenter size-full wp-image-3731" title="screenshot 2012-07-13 um 13.16.54" src="http://vis4.net/blog/wp-content/uploads/2012/07/screenshot-2012-07-13-um-13.16.54.png" alt="" width="325" height="299" /></p>
<p>Now we have lost half of the magic of the original design. With the non-continuous scale, the tick for 100 km/h, the speed limit on primary highways, sits right at the 12 o&#8217;clock position. Can you imagine driving for hours and hours watching the gauge needle standing at 11:30, as shown in the above image? It just doesn&#8217;t feel right, and it would always push you to drive faster.</p>
<p>The non-continuous scale also &#8216;stretches&#8217; the speeds below 120 km/h and thus makes them easier to read. The difference between 30 and 50 km/h in a populated area is more important than the difference between 130 and 150 km/h on a German Autobahn. My car is a family car, not a sports vehicle. The designers probably wanted to emphasize the reasonable driving speeds, and that&#8217;s what I like most about the visualization.</p>
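<p>A scale with a break like this is just a piecewise-linear mapping from speed to needle angle. Here is a minimal Python sketch of the idea. The breakpoint angles and the top speed of 220 km/h are my own assumptions, picked so that 100 km/h lands exactly at 12 o&#8217;clock; they are not measured from the real speedometer.</p>

```python
# Assumed breakpoints as (speed in km/h, needle angle in degrees).
# 0 degrees is the 12 o'clock position, negative angles point to the left.
BREAKPOINTS = [(0, -120.0), (120, 24.0), (220, 120.0)]


def speed_to_angle(speed, breakpoints=BREAKPOINTS):
    """Map a speed to a needle angle by piecewise-linear interpolation."""
    first_speed, first_angle = breakpoints[0]
    if speed <= first_speed:
        return first_angle  # clamp below the start of the scale
    for (v0, a0), (v1, a1) in zip(breakpoints, breakpoints[1:]):
        if speed <= v1:
            return a0 + (a1 - a0) * (speed - v0) / (v1 - v0)
    return breakpoints[-1][1]  # clamp above the top of the scale


for kmh in (30, 50, 100, 130, 150):
    print(kmh, "km/h at", round(speed_to_angle(kmh), 1), "degrees")
```

<p>With these assumed breakpoints, the step from 30 to 50 km/h sweeps a wider angle than the step from 130 to 150 km/h, which is exactly the &#8216;stretching&#8217; effect described above.</p>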
<p><img class="aligncenter size-full wp-image-3750" title="tacho-areas" src="http://vis4.net/blog/wp-content/uploads/2012/07/tacho-areas.gif" alt="" width="319" height="305" /></p>
<h2>When to break the rules?</h2>
<p>So, you know, the rules are the rules, but you should know when and why to break them. Of course it is allowed to display just a single data point in a chart. And of course it is allowed to add fancy annotations to the chart wherever you like. And, yes, it is even allowed to use radial charts for non-periodic data, as it is allowed to reject a certain chart type, just because the audience is not familiar with it.</p>
<p>Whenever it makes sense.</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/lets-break-the-rules/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Doing the Line Charts Right</title>
		<link>http://vis4.net/blog/posts/doing-the-line-charts-right/</link>
		<comments>http://vis4.net/blog/posts/doing-the-line-charts-right/#comments</comments>
		<pubDate>Wed, 20 Jun 2012 16:10:52 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Advice]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3633</guid>
		<description><![CDATA[I recently joined Datawrapper, an open source project that aims to provide simple, embeddable charts for journalists. Really, no fancy stuff here, we&#8217;re just talking about line charts and bar charts. Limiting ourselves to those types gave us a good opportunity to think about the best way of doing them. So this week [...]]]></description>
				<content:encoded><![CDATA[<p>I recently joined <a href="http://datawrapper.de/">Datawrapper</a>, an open source project that aims to <a href="http://datadrivenjournalism.net/news_and_analysis/datawrapper_breaks_down_barriers_in_visualising_and_publishing_data">provide simple, embeddable charts for journalists</a>. Really, no fancy stuff here, we&#8217;re just talking about line charts and bar charts. Limiting ourselves to those types gave us a good opportunity to think about the best way of doing them. So this week I was thinking a bit about the perfect line chart.</p>
<p><span id="more-3633"></span></p>
<h2>Listen to Tufte and keep it simple</h2>
<p>Of course you cannot talk about perfect charts without mentioning the great books of Edward Tufte. Especially in <a href="http://www.edwardtufte.com/tufte/books_vdqi">The Visual Display of Quantitative Information</a> he summed up a lot of good advice for line charts. He argued that it&#8217;s a good idea to look at what he called the <em>data-ink ratio</em> and showed how removing certain chart elements can increase a chart&#8217;s readability. For instance, you don&#8217;t need to draw a box around the chart area. You can also use the ends of the axis lines to display the minimum and maximum values in the data.</p>
<p><img class="aligncenter  wp-image-3639" title="line-charts-tufte" src="http://vis4.net/blog/wp-content/uploads/2012/06/line-charts-tufte.png" alt="" width="514" height="330" /></p>
<h2>Forget about the separate legend</h2>
<p>Separate legends are the worst-case scenario in the line chart world. Often one can find the legend below the chart, or in an arbitrary order. You want to allow instant identification of the lines, but forcing the viewers to look them up in a legend takes way too much time. Instead you should put the labels somewhere close to the lines.</p>
<p><img class="aligncenter  wp-image-3655" title="labeling" src="http://vis4.net/blog/wp-content/uploads/2012/06/labeling.png" alt="" width="517" height="406" /></p>
<p>The great side effect of putting the labels next to the lines is that you no longer depend on fancy colors or disturbing symbols to identify individual lines. Extra points for simplicity.</p>
<h2>Highlight what&#8217;s important</h2>
<p>Although it is possible to tell a hundred stories using a single line chart, it makes a lot of sense to keep the focus on just one story. Therefore you should highlight just one or two important lines in the chart, but keep the others as context in the background.</p>
<p><img class="aligncenter size-full wp-image-3644" title="screenshot 2012-06-20 um 13.32.38" src="http://vis4.net/blog/wp-content/uploads/2012/06/screenshot-2012-06-20-um-13.32.38.png" alt="" width="520" height="321" /></p>
<h2>Baseline zero or not?</h2>
<p>Sometimes you hear the advice that every (line) chart should have a baseline of zero, otherwise it would be &#8220;lying&#8221;. As a counter-example, here&#8217;s the (approximate) intraday stock quote data of the Facebook IPO day using baseline zero. The reason why nobody shows stock charts this way is obvious.   <img class="aligncenter size-full wp-image-3666" title="screenshot 2012-06-20 um 16.59.52" src="http://vis4.net/blog/wp-content/uploads/2012/06/screenshot-2012-06-20-um-16.59.52.png" alt="" width="482" height="326" /></p>
<p>It&#8217;s almost impossible to see the ups and downs of the first day of the Facebook stock. Without the zero-baseline the chart reveals much more of the data.</p>
<p><img class="aligncenter size-full wp-image-3667" title="screenshot 2012-06-20 um 17.02.54" src="http://vis4.net/blog/wp-content/uploads/2012/06/screenshot-2012-06-20-um-17.02.54.png" alt="" width="483" height="326" /></p>
<p>However, to minimize the risk of confusing readers with a non-zero baseline chart, I suggest not drawing the axes as connected lines. This way the y-axis doesn&#8217;t visually &#8216;touch&#8217; the &#8216;ground&#8217;.</p>
<p><img class="aligncenter size-full wp-image-3672" title="non-connected axes" src="http://vis4.net/blog/wp-content/uploads/2012/06/non-connected-axes.png" alt="" width="520" height="273" /></p>
<h2>Finding a nice aspect ratio</h2>
<p>The big advantage of line charts is that they enable the comparison of slopes, which is not easily possible in a bar chart, for instance. The problem, however, is that the perceived slopes depend heavily on the aspect ratio of the chart. The Facebook stock data would have looked much more dramatic in a taller chart. So which aspect ratio should you choose? Some years ago, William Cleveland suggested a technique called <em>banking</em> to solve this problem. The core idea is that the slopes in a line chart are most readable if they average to 45°. In 2006, <a href="http://hci.stanford.edu/jheer/">Jeffrey Heer</a> and <a href="http://vis.berkeley.edu/~maneesh/">Maneesh Agrawala</a> continued the work of Cleveland and <a href="http://vis.berkeley.edu/papers/banking/">described 12 different banking algorithms</a>. I used one of the simplest of them, <em>median-absolute-slope banking</em>.</p>
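<p>To make the idea concrete: a line segment with data slope s is drawn with slope s &#183; Rx / (Ry &#183; a), where Rx and Ry are the data ranges on both axes and a is the width/height ratio of the chart. Asking for the median absolute slope to be drawn at 45° (a drawn slope of 1) gives a = median(|s|) &#183; Rx / Ry. The following Python sketch is my own reading of that rule, not the code used in Datawrapper, and the sample values are made up rather than the real Facebook quotes.</p>

```python
from statistics import median


def bank_to_45(xs, ys):
    """Return the width/height ratio at which the median absolute
    segment slope is drawn at 45 degrees."""
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
        if x1 != x0  # skip vertical segments, their slope is undefined
    ]
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if not slopes or x_range == 0 or y_range == 0:
        raise ValueError("need at least two distinct x and y values")
    return median(slopes) * x_range / y_range


# Made-up series for illustration only.
hours = [0, 1, 2, 3, 4]
price = [38, 42, 40, 41, 38]
print("width / height =", bank_to_45(hours, price))
```

<p>A result above 1 asks for a landscape chart, a result below 1 for a portrait one.</p>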
<p>Finally, here&#8217;s what the Facebook stock chart looks like after banking. The curve looks less dramatic now, but is still easy to read.</p>
<p><img class="aligncenter size-full wp-image-3668" title="screenshot 2012-06-20 um 16.59.25" src="http://vis4.net/blog/wp-content/uploads/2012/06/screenshot-2012-06-20-um-16.59.25.png" alt="" width="486" height="166" /></p>
<p>The problem with banking is that sometimes you need the chart in a certain aspect ratio to fit into a page layout, especially if banking produces portrait-sized charts. But why not let the optimal chart ratio define your layout? For instance, you can put the additional information to the side of the chart. Remember that the main goal of banking is to increase the readability of the line slopes. In the following example, the slopes for Nuclear and Renewables would have been much more difficult to see if the chart had been &#8216;squeezed&#8217; into a landscape aspect.</p>
<p><img class="aligncenter  wp-image-3650" title="banked-portrait" src="http://vis4.net/blog/wp-content/uploads/2012/06/banked-portrait.png" alt="" width="512" height="338" /></p>
<h2>Turning best practices into actual tools</h2>
<p>Finally, I am very happy to say that these best practices won&#8217;t remain gray theory in research papers. Everything I mentioned will be integrated in the upcoming release of Datawrapper, which I already used to produce most of the examples in this post. Please follow <a href="https://twitter.com/#!/datawrapper">@datawrapper</a> if you want to keep up to date with the project.</p>
<p>If you have further suggestions or recommendations for line charts, I&#8217;m looking forward to reading your comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/doing-the-line-charts-right/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Using Twitter as Social Bookmarking Service</title>
		<link>http://vis4.net/blog/posts/twitter-bookmarks/</link>
		<comments>http://vis4.net/blog/posts/twitter-bookmarks/#comments</comments>
		<pubDate>Tue, 10 Apr 2012 08:59:31 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[tutorials]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3566</guid>
		<description><![CDATA[A while ago I realized that I totally stopped using social bookmarking services since I started tweeting. Whenever I find an interesting link I share it on Twitter. If I find interesting links tweeted by the people I follow I&#8217;m most likely to favorite that tweet. I guess that&#8217;s the way most people use Twitter. [...]]]></description>
				<content:encoded><![CDATA[<p>A while ago I realized that I totally stopped using social bookmarking services since I started tweeting. Whenever I find an interesting link I share it on Twitter. If I find interesting links tweeted by the people I follow I&#8217;m most likely to favorite that tweet. I guess that&#8217;s the way most people use Twitter. How often did you check someone&#8217;s public links on delicious? I rarely did.</p>
<p><span id="more-3566"></span></p>
<p>With Twitter we now have a social bookmarking system that works perfectly on the input side. Not only do I tag my links, but I also describe in a few words what the website is about and why it is worth sharing. But when it comes to getting the information out of Twitter, things start to get ugly. Have you ever tried to find a link you retweeted months ago? I did. The Twitter search won&#8217;t help you since it is limited to the most recent tweets. The only thing you can do is scroll down your timeline and click the &#8216;load more&#8217; link until you find what you are looking for.</p>
<p>Enough of the problem, let&#8217;s come to the solution. I wrote a small Python script that stores my tweets and favorites in a local SQLite database. It does so using Mike Verdone&#8217;s wonderful <a href="http://mike.verdone.ca/twitter/">Twitter API for Python</a>. I installed the script as a local <a href="http://en.wikipedia.org/wiki/Cron">cronjob</a> that runs every hour, so my database stays up to date all the time.</p>
<p>Then a second script allows me to perform simple searches on that database and writes the output to an HTML file. Finally, I integrated the search into <a href="http://www.alfredapp.com/">Alfred</a>, a Spotlight-like tool that makes my life easier. Now I just type &#8220;tweets foo&#8221; in Alfred and the search result HTML is opened instantly.</p>
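<p>The storage part of such a script can be very small. The following is a simplified sketch using only Python&#8217;s built-in <code>sqlite3</code> module, not the actual script from the gist. The table layout, the helper names and the sample tweet are assumptions for illustration, and the download from Twitter is left out entirely.</p>

```python
import sqlite3


def open_db(path):
    """Open (or create) the local tweet archive."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        " id INTEGER PRIMARY KEY,"   # tweet id, used to detect re-imports
        " created_at TEXT,"          # ISO date such as 2012-04-10
        " text TEXT,"
        " favorite INTEGER)"         # 1 for favorites, 0 for own tweets
    )
    return db


def store(db, tweets, favorite=False):
    """Insert tweets; ids that are already in the table are skipped."""
    rows = [(t["id"], t["created_at"], t["text"], int(favorite)) for t in tweets]
    db.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)", rows)
    db.commit()


# Made-up sample data standing in for what the Twitter API would return.
sample = [{"id": 1, "created_at": "2012-04-10", "text": "a made-up tweet with a link"}]

db = open_db(":memory:")  # use a file path instead to keep the archive
store(db, sample)
store(db, sample)  # a second run, as with an hourly cronjob, adds nothing
print(db.execute("SELECT COUNT(*) FROM tweets").fetchone()[0])
```

<p>Using the tweet id as primary key together with <code>INSERT OR IGNORE</code> is what makes it safe to run the import over and over again.</p>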
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/04/tweet-search.png"><img src="http://vis4.net/blog/wp-content/uploads/2012/04/tweet-search-522x276.png" alt="" title="tweet-search" width="522" height="276" class="aligncenter size-medium wp-image-3571" /></a></p>
<p>Here&#8217;s a demo video:</p>
<p><iframe width="520" height="315" src="http://www.youtube.com/embed/IHRlle6LoYY" frameborder="0" allowfullscreen></iframe></p>
<p>While hacking on this I also built two special modes: typing &#8220;tweets favs&#8221; will simply output all tweets that I marked as favorites, and typing &#8220;tweets March 2010&#8221; will return, well, all tweets from that month.</p>
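<p>Here is a rough sketch of how the plain search and the two special modes could be told apart. Again, this is not the code from the gist: the table layout, the function names and the sample rows are assumptions, and writing the HTML file is left out. Note that parsing month names with <code>%B</code> depends on the locale, so &#8220;March 2010&#8221; assumes an English one.</p>

```python
import sqlite3
from datetime import datetime


def build_query(terms):
    """Translate the words typed after 'tweets' into SQL plus parameters."""
    sql = "SELECT created_at, text FROM tweets WHERE "
    if terms == "favs":
        return sql + "favorite = 1 ORDER BY created_at", ()
    try:
        month = datetime.strptime(terms, "%B %Y")
    except ValueError:
        # Not a month name plus a year, so treat it as a plain text search.
        return sql + "text LIKE ? ORDER BY created_at", ("%" + terms + "%",)
    prefix = month.strftime("%Y-%m") + "%"
    return sql + "created_at LIKE ? ORDER BY created_at", (prefix,)


def search(db, terms):
    """Run the query for the given search terms and return all rows."""
    sql, params = build_query(terms)
    return db.execute(sql, params).fetchall()


# Tiny in-memory archive with made-up rows, just to try the three modes.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, created_at TEXT, text TEXT, favorite INTEGER)")
db.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?)", [
    (1, "2010-03-05", "made-up tweet about maps", 0),
    (2, "2012-04-10", "made-up tweet about charts", 1),
])

for terms in ("favs", "March 2010", "maps"):
    print(terms, search(db, terms))
```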
<p>Here&#8217;s the source code of the two scripts and the Alfred script. Just make sure to enter your own Twitter name and <a href="https://dev.twitter.com/docs/auth/using-oauth">create some OAuth keys</a> for the script.</p>
<p><a href="https://gist.github.com/2349453">https://gist.github.com/2349453</a></p>
<p>Last remark: I&#8217;m pretty sure that there&#8217;s some amazing web service out there that does the same for you without the hassle of Python scripts. However, I haven&#8217;t checked them out since I like to access my tweets offline as well (for instance when sitting in a plane or train).</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/twitter-bookmarks/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Rendering High Resolution Maps in Kartograph</title>
		<link>http://vis4.net/blog/posts/high-res-maps-in-kartograph/</link>
		<comments>http://vis4.net/blog/posts/high-res-maps-in-kartograph/#comments</comments>
		<pubDate>Tue, 03 Apr 2012 12:53:44 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Cartography]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[tutorials]]></category>

		<guid isPermaLink="false">http://vis4.net/?p=3543</guid>
		<description><![CDATA[This is going to be a quick run-through of the creation of the latest Kartograph showcase, which is a high-res vector map. Select your map projection I really like the idea of starting the map creation process with choosing a map projection. As mentioned in my last post, the projection can be seen as a [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://vis4.net/blog/wp-content/uploads/2012/04/eastcoast-90dpi_export.png"><img class="aligncenter size-medium wp-image-3558" title="eastcoast-90dpi_export" src="http://vis4.net/blog/wp-content/uploads/2012/04/eastcoast-90dpi_export-e1333457462704.png" alt="" width="519" height="248" /></a></p>
<p>This is going to be a quick run-through of the creation of the latest <a href="http://kartograph.org">Kartograph</a> showcase, which is a <a href="http://kartograph.org/showcase/eastcoast/">high-res vector map</a>.</p>
<p><span id="more-3543"></span></p>
<h2>Select your map projection</h2>
<p>I really like the idea of starting the map creation process with choosing a map projection. As mentioned in my <a href="http://vis4.net/blog/posts/introducing-kartograph/">last post</a>, the projection can be seen as a very crucial point of every map. It allows you to define the perspective on the geography. In this showcase I used the <a href="http://kartograph.org/showcase/projections/#satellite">tilted perspective projection</a>, looking from Florida to New York.</p>
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/04/1-editor.png"><img class="aligncenter size-medium wp-image-3544" title="Playing around with map projections" src="http://vis4.net/blog/wp-content/uploads/2012/04/1-editor-522x418.png" alt="" width="522" height="418" /></a></p>
<p>The new <a href="http://kartograph.org/showcase/editor/#satellite">visual map configurator</a> shown in the above picture not only gives you a preview of what your map will look like, but also gives you a nice template for the JSON configuration Kartograph will use to generate the map.</p>
<p>The next step is to tell Kartograph where to crop the map view. This time I will use the <strong>points</strong> mode, which allows me to give Kartograph a set of coordinates that I definitely want to be included in the map. You can add some padding, too.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="javascript"><pre class="de1"><span class="st0">&quot;bounds&quot;</span><span class="sy0">:</span> <span class="br0">&#123;</span>
  <span class="st0">&quot;mode&quot;</span><span class="sy0">:</span> <span class="st0">&quot;points&quot;</span><span class="sy0">,</span>
  <span class="st0">&quot;data&quot;</span><span class="sy0">:</span> <span class="br0">&#91;</span><span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">70.5</span><span class="sy0">,</span><span class="nu0">41.2</span><span class="br0">&#93;</span><span class="sy0">,</span><span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">90.1</span><span class="sy0">,</span><span class="nu0">32.1</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">72.8</span><span class="sy0">,</span> <span class="nu0">34</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">70.8</span><span class="sy0">,</span> <span class="nu0">31.5</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="sy0">,</span>
  <span class="st0">&quot;padding&quot;</span><span class="sy0">:</span> <span class="nu0">0.02</span>  <span class="co1">// defined as ratio of total width</span>
<span class="br0">&#125;</span></pre></div></div></div></div></div></div></div>
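<p>To get a feeling for what the padding does, here is a small Python sketch of a padded bounding box around a set of points. It is not Kartograph&#8217;s implementation: it assumes the points are already projected to x/y coordinates, and that the padding is a fraction of the unpadded width added on every side. The coordinates are made up.</p>

```python
def padded_bbox(points, padding=0.0):
    """Bounding box (left, top, right, bottom) around projected points,
    grown on every side by padding * width of the unpadded box."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    margin = (right - left) * padding
    return (left - margin, top - margin, right + margin, bottom + margin)


# Made-up x/y positions of four points after projection.
projected = [(100.0, 50.0), (500.0, 420.0), (300.0, 200.0), (480.0, 60.0)]
print(padded_bbox(projected, padding=0.02))
```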


<p>Since this feature is not included in the visual configurator (yet?), it&#8217;s up to you to experiment with some coordinates to find a nice bounding box. I finally ended up with something like this:</p>
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/04/test-bounds2.png"><img class="aligncenter size-full wp-image-3555" title="test-bounds" src="http://vis4.net/blog/wp-content/uploads/2012/04/test-bounds2.png" alt="" width="474" height="468" /></a></p>
<h2>Define outer Latitude/Longitude range</h2>
<p>Since our map only shows a small part of the world, we can speed up the map creation process by forcing Kartograph to skip every geographic feature that lies outside a given lat/lon bounding box. To select the bounding box I used the nice <a href="http://www.openstreetmap.org/export?lat=36.4&amp;lon=-75.4&amp;zoom=5&amp;layers=M">OpenStreetMap export feature</a>.</p>
<p style="text-align: center;"><a href="http://vis4.net/blog/wp-content/uploads/2012/04/2-osm.png"><img class="aligncenter size-medium wp-image-3546" title="selecting a lat/lon bounding box using OSM" src="http://vis4.net/blog/wp-content/uploads/2012/04/2-osm-522x378.png" alt="" width="522" height="378" /></a></p>
<p>The limits for latitudes and longitudes are then added in the bounds.<strong>crop</strong> property. The format is [minLon, minLat, maxLon, maxLat]:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="javascript"><pre class="de1"><span class="st0">&quot;bounds&quot;</span><span class="sy0">:</span> <span class="br0">&#123;</span>
 <span class="st0">&quot;mode&quot;</span><span class="sy0">:</span> <span class="st0">&quot;points&quot;</span><span class="sy0">,</span>
 <span class="st0">&quot;data&quot;</span><span class="sy0">:</span> <span class="br0">&#91;</span><span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">70.5</span><span class="sy0">,</span><span class="nu0">41.2</span><span class="br0">&#93;</span><span class="sy0">,</span><span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">90.1</span><span class="sy0">,</span><span class="nu0">32.1</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">72.8</span><span class="sy0">,</span> <span class="nu0">34</span><span class="br0">&#93;</span><span class="sy0">,</span> <span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">70.8</span><span class="sy0">,</span> <span class="nu0">31.5</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="sy0">,</span>
 <span class="st0">&quot;padding&quot;</span><span class="sy0">:</span> <span class="nu0">0.02</span><span class="sy0">,</span>
 <span class="st0">&quot;crop&quot;</span><span class="sy0">:</span> <span class="br0">&#91;</span><span class="sy0">-</span><span class="nu0">90.85</span><span class="sy0">,</span> <span class="nu0">28.12</span><span class="sy0">,</span> <span class="sy0">-</span><span class="nu0">65.69</span><span class="sy0">,</span> <span class="nu0">51.9</span><span class="br0">&#93;</span>
<span class="br0">&#125;</span></pre></div></div></div></div></div></div></div>
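<p>Conceptually, the crop is a cheap bounding box test that runs before any expensive work is done on a feature. The sketch below shows that kind of test in Python. It only illustrates the idea and is not taken from Kartograph: the two features are made-up outlines, and boxes that cross the 180° meridian are not handled.</p>

```python
def bbox_of(ring):
    """Lon/lat bounding box [minLon, minLat, maxLon, maxLat] of an outline."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return [min(lons), min(lats), max(lons), max(lats)]


def overlaps(a, b):
    """True if two [minLon, minLat, maxLon, maxLat] boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


CROP = [-90.85, 28.12, -65.69, 51.9]  # same values as bounds.crop above

# Made-up features, each just a name and a rough lon/lat outline.
features = {
    "near New York": [(-74.3, 40.5), (-73.7, 40.5), (-73.7, 40.9), (-74.3, 40.9)],
    "near Berlin": [(13.1, 52.3), (13.8, 52.3), (13.8, 52.7), (13.1, 52.7)],
}

kept = [name for name, ring in features.items() if overlaps(bbox_of(ring), CROP)]
print(kept)
```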


<p>Of course, to find out which features lie inside or outside the selected bounding box, Kartograph still needs to run through all the features in the shapefiles. If you deal with really huge shapefiles (like this <a href="http://aspmap.net/vmap/Trees.zip">100 MB shapefile</a> containing 167,000 forest polygons), a better idea is to filter the data directly in the shapefile. The open source GIS software <a href="http://www.qgis.org/">QuantumGIS</a> makes this really easy:</p>
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/04/3-qgis-export.png"><img class="aligncenter size-medium wp-image-3547" title="exporting a subset of a shapefile using QGis" src="http://vis4.net/blog/wp-content/uploads/2012/04/3-qgis-export-522x421.png" alt="" width="522" height="421" /></a></p>
<h2>Add some nice layers</h2>
<p>At this point all that&#8217;s left to do is to select a nice set of layers that you want to include in your map. I ended up with about ten layers coming from <a href="http://www.naturalearthdata.com/downloads/">Natural Earth</a> and <a href="http://www.vdstech.com/world-data.aspx">VMAP</a>. You can give the layers some basic styling to make it easier to work with the raw map output in Illustrator.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="javascript"><pre class="de1"><span class="st0">&quot;layers&quot;</span><span class="sy0">:</span> <span class="br0">&#91;</span><span class="br0">&#123;</span>
  <span class="st0">&quot;id&quot;</span><span class="sy0">:</span> <span class="st0">&quot;trees&quot;</span><span class="sy0">,</span>
  <span class="st0">&quot;src&quot;</span><span class="sy0">:</span> <span class="st0">&quot;shp/custom/useast/trees.shp&quot;</span><span class="sy0">,</span>
  <span class="st0">&quot;styles&quot;</span><span class="sy0">:</span> <span class="br0">&#123;</span>
    <span class="st0">&quot;fill&quot;</span><span class="sy0">:</span> <span class="st0">&quot;#b1bfb1&quot;</span><span class="sy0">,</span>
    <span class="st0">&quot;fill-opacity&quot;</span><span class="sy0">:</span> <span class="nu0">0.4</span><span class="sy0">,</span>
    <span class="st0">&quot;stroke&quot;</span><span class="sy0">:</span> <span class="st0">&quot;none&quot;</span>
  <span class="br0">&#125;</span>
<span class="br0">&#125;</span><span class="br0">&#93;</span></pre></div></div></div></div></div></div></div>


<p>Here&#8217;s what the <a href="http://kartograph.org/showcase/eastcoast/eastcoast.json">final map configuration</a> looked like.</p>
<h2>Refine in Illustrator</h2>
<p>After finishing the map configuration I loaded the generated SVG (7 megabytes) into Illustrator to add some labels and refine the colors.</p>
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/04/eastcoast-illustrator.png"><img class="aligncenter size-medium wp-image-3556" title="eastcoast-illustrator" src="http://vis4.net/blog/wp-content/uploads/2012/04/eastcoast-illustrator-522x337.png" alt="" width="522" height="337" /></a></p>
<p>And here&#8217;s the final result:</p>
<p><a href="http://vis4.net/blog/wp-content/uploads/2012/04/eastcoast-90dpi_export.png"><img class="aligncenter size-medium wp-image-3558" title="eastcoast-90dpi_export" src="http://vis4.net/blog/wp-content/uploads/2012/04/eastcoast-90dpi_export-522x565.png" alt="" width="522" height="565" /></a></p>
<h3>Related links</h3>
<ul>
<li><a href="https://github.com/kartograph/kartograph.py/wiki/API">Kartograph.py API docs</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/high-res-maps-in-kartograph/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Why We Need Another Mapping Framework</title>
		<link>http://vis4.net/blog/posts/introducing-kartograph/</link>
		<comments>http://vis4.net/blog/posts/introducing-kartograph/#comments</comments>
		<pubDate>Wed, 07 Mar 2012 13:14:11 +0000</pubDate>
		<dc:creator>Gregor Aisch</dc:creator>
				<category><![CDATA[Cartography]]></category>
		<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://vis4.net/blog/posts/introducing-kartograh/</guid>
		<description><![CDATA[Over the last two years, cartography has drawn my attention from time to time. In 2009 I started my work in the field by porting the PROJ.4 library to ActionScript. My first notable interactive map application was a world map widget for the Piwik Analytics project, which is still in use today. It was born [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://vis4.net/blog/posts/introducing-kartograh/"><img class="aligncenter size-full wp-image-3522" title="kartograph-screenshot" src="http://vis4.net/blog/wp-content/uploads/2012/03/screenshot-2012-03-07-um-14.21.17.png" alt="" width="520" height="308" /></a></p>
<p>Over the last two years, cartography has drawn my attention from time to time. In 2009 I started my work in the field by porting the PROJ.4 library <a href="http://vis4.net/blog/posts/as3-proj/">to ActionScript</a>. My first notable interactive map application was a <a href="http://vis4.net/blog/posts/piwik-maps/">world map widget</a> for the <a href="http://piwik.org">Piwik Analytics</a> project, which is still in use today. It was born from the need for a simple world map that is lightweight, easy to use and completely independent of external map services like Google Maps.</p>
<p><span id="more-3346"></span></p>
<p>A lot has happened in the field since then, and some cool mapping frameworks have been released. Google Maps keeps innovating and gives people lots of reasons to use their service, even on a commercial basis. But one thing never seems to change: the <a href="http://en.wikipedia.org/wiki/Mercator_projection">Spherical Mercator projection</a>.</p>
<h2>Why should we care about map projections at all?</h2>
<div>So why should we care about map projections at all? To answer that question, I like to draw an analogy to photography. As a photographer, you would probably never consider working with just one camera and a single ancient and highly distorted lens. Of course, it might be fun to work with that lens, but once you face its limits you get tired of it. Instead, you carry your big bag of equipment with you, knowing that for some shots you will need the telephoto lens, while for others you need a wide angle. Some lenses work well at night, others give you the best results in sunlight, and so on. Good photojournalists know exactly how to use their equipment to put their subjects into the exact perspective and context they need to underline their stories.</div>
<p>In contrast, data journalists are nowadays limited to just one projection. To illustrate the journalistic dimension of this limitation I will now show you three maps of the same geographic area. The first uses the well-known Mercator projection:</p>
<p>&nbsp;</p>
<p style="text-align: center;"><a href="http://vis4.net/blog/wp-content/uploads/2012/03/screenshot-2012-03-07-um-13.15.06.png"><img class="aligncenter size-medium wp-image-3512" style="border-radius: 10px;" title="mercator map of spain + uk" src="http://vis4.net/blog/wp-content/uploads/2012/03/screenshot-2012-03-07-um-13.15.06-522x498.png" alt="" width="522" height="498" /></a></p>
<p>The Mercator map looks neutral, clean and&#8230; boring. In fact, the projection shows the earth as a cylinder shot from an infinite distance. You really can&#8217;t distort a view of our planet much more. Note that the area of Spain looks as big as (and maybe even slightly smaller than) Germany. In contrast, the next map uses an equal area projection.</p>
<p>&nbsp;</p>
<p style="text-align: center;"><a href="http://vis4.net/blog/wp-content/uploads/2012/03/screenshot-2012-03-07-um-13.22.23.png"><img class="aligncenter size-medium wp-image-3514" style="border-radius: 10px;" title="lambert equal area projection" src="http://vis4.net/blog/wp-content/uploads/2012/03/screenshot-2012-03-07-um-13.22.23-522x454.png" alt="" width="522" height="454" /></a></p>
<p>The curved grid lines indicate that we&#8217;re living on a spherical planet, but we are still looking at it from an infinite distance. Looking at the equal-area projection, one would now say that Spain is slightly bigger than Germany (in fact, it is 141% the size of Germany). You can easily imagine scenarios where true areas are really important, say if you want to show the extent of the devastation caused by the latest environmental disaster.</p>
<p>And finally, here&#8217;s the last shot, which shows the same geography using the tilted satellite projection:</p>
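<p>To put rough numbers on the size comparison above, here is a minimal Python sketch. It assumes approximate land areas and a single, hand-picked central latitude for each country (these figures are illustrative assumptions, not values taken from Kartograph), and it uses the fact that Mercator stretches both axes by sec(latitude), so areas are inflated by sec(latitude) squared.</p>

```python
import math

# Approximate land areas in km^2 and rough central latitudes in degrees.
# Illustrative assumptions only; they do not come from Kartograph.
COUNTRIES = {
    "Spain": {"area_km2": 505_990, "lat": 40.0},
    "Germany": {"area_km2": 357_022, "lat": 51.0},
}


def mercator_area_scale(lat_deg):
    """Area inflation of the Mercator projection at a given latitude.

    Mercator stretches both axes by sec(lat), so areas grow by sec(lat)^2.
    """
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


def apparent_area(country):
    """True area multiplied by the Mercator inflation at the central latitude."""
    return country["area_km2"] * mercator_area_scale(country["lat"])


spain, germany = COUNTRIES["Spain"], COUNTRIES["Germany"]
true_ratio = spain["area_km2"] / germany["area_km2"]
mercator_ratio = apparent_area(spain) / apparent_area(germany)

print(f"true area ratio Spain/Germany:     {true_ratio:.2f}")
print(f"Mercator area ratio Spain/Germany: {mercator_ratio:.2f}")
```

<p>Under these assumptions the true ratio comes out at roughly 1.4, while the Mercator ratio drops just below 1, which matches the impression that Spain looks no bigger than Germany on the Mercator map. Treating each country as a single latitude is a crude simplification; a more careful estimate would integrate the scale factor over the actual country outlines.</p>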
<p>&nbsp;</p>
<div><a href="http://vis4.net/blog/wp-content/uploads/2012/03/screenshot-2012-03-07-um-13.12.12.png"><img class="aligncenter size-medium wp-image-3511" style="border-radius: 10px;" title="tilted map of spain + uk" src="http://vis4.net/blog/wp-content/uploads/2012/03/screenshot-2012-03-07-um-13.12.12-522x407.png" alt="" width="522" height="407" /></a></div>
<div></div>
<div>Now the projection looks more like a photograph taken from a point closer to the surface (in fact, it was taken 4464 km above Morocco). It immediately places the viewer into the map and inevitably forces us to look in a specific direction. Using satellite projections gives us a completely different sense of scale and can shift the context of map-driven stories dramatically.</div>
<div></div>
<div>Now tell me, which data journalist would want to do without that?</div>
<div></div>
<h2>Hello Kartograph</h2>
<p>Since I know that it&#8217;s kind of unfair to propose awesome new features while knowing that almost nobody is able to use them, I decided to turn my ideas into code. For four months or so I have been working on a new mapping framework, which I finally named <a href="http://kartograph.org">Kartograph</a>.</p>
<p>Here&#8217;s a bit of the rationale behind Kartograph. Most notably Kartograph..</p>
<ul>
<li>..allows you to <a href="http://kartograph.org/showcase/projections/#satellite">select and fine-tune the map projection</a>, which is the fundamental equipment for telling stories with maps</li>
<li>..ships with the tools needed to easily convert open geo data (shapefiles, KML, GeoJSON) into interactive data-driven maps</li>
<li>..gives its users complete control over what to include in their maps and what to leave out</li>
<li>..puts a focus on reliable data visualization techniques</li>
<li>..does not rely on hosted (and costly) mapping services like Google Maps or MapBox</li>
</ul>
<p>Of course, there is a lot of <a href="https://github.com/kartograph/kartograph.py/issues/8">work left</a>, and I&#8217;m currently looking for a way to continue working on Kartograph for another two or three months. In the meantime I also plan to release a new version of the world map widget for the Piwik dashboard, which was kind of the initial reason for starting to work on the library.</p>
<p>Now, I&#8217;m really curious about your feedback on Kartograph. Do you think it is worth the effort? Do you know of any existing library that can do the same? What are the killer features you would like to see in a modern vector mapping framework?</p>
]]></content:encoded>
			<wfw:commentRss>http://vis4.net/blog/posts/introducing-kartograph/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
	</channel>
</rss>
