people.gnome.org_federico.atom.xml - sfeed_tests - sfeed tests and RSS and Atom files
---
people.gnome.org_federico.atom.xml (1402111B)
---
1 <?xml version="1.0" encoding="utf-8"?>
2 <feed xmlns="http://www.w3.org/2005/Atom"><title>Federico's Blog</title><link href="https://people.gnome.org/~federico/blog/" rel="alternate"></link><link href="https://people.gnome.org/~federico/blog/feeds/atom.xml" rel="self"></link><id>https://people.gnome.org/~federico/blog/</id><updated>2021-09-01T13:57:47-05:00</updated><subtitle></subtitle><entry><title>GNOME themes, an incomplete status report, and how you can help</title><link href="https://people.gnome.org/~federico/blog/gnome-themes.html" rel="alternate"></link><published>2021-09-01T13:57:47-05:00</published><updated>2021-09-01T13:57:47-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2021-09-01:/~federico/blog/gnome-themes.html</id><summary type="html"><p>"Themes in GNOME" is a complicated topic in technical and social
3 terms. Technically there are a lot of incomplete moving parts;
4 socially there is a lot of missing documentation to be written, a lot
5 of miscommunication and mismatched expectations.</p>
6 <p>The following is a brief and incomplete, but hopefully encouraging,
7 summary …</p></summary><content type="html"><p>"Themes in GNOME" is a complicated topic in technical and social
8 terms. Technically there are a lot of incomplete moving parts;
9 socially there is a lot of missing documentation to be written, a lot
10 of miscommunication and mismatched expectations.</p>
11 <p>The following is a brief and incomplete, but hopefully encouraging,
12 summary of the status of themes in GNOME. I want to give you an
13 overall picture of the status of things, and more importantly, an idea
14 of how you can help. This is not a problem that can be solved by a
15 small team of platform developers.</p>
16 <p>I wish to thank Alexander Mikhaylenko for providing most of the
17 knowledge in this post.</p>
18 <h1>Frame of reference</h1>
19 <p>First, I urge you to read Cassidy James Blaede's comprehensive "<a href="https://blog.elementary.io/the-need-for-a-freedesktop-dark-style-preference/">The
20 Need for a FreeDesktop Dark Style
21 Preference</a>".
22 That gives an excellent, well-researched introduction to the "dark
23 style" problem, the status quo on other platforms, and exploratory
24 plans for GNOME and Elementary from 2019.</p>
25 <p>Go ahead, read it. It's very good.</p>
26 <p>There is also a <a href="https://www.youtube.com/watch?v=gi_3b81eBUE&amp;list=PLkmRdYgttscEuv9v2-H9P5FBj8-td_Nri&amp;index=31">GUADEC talk about Cassidy's
27 research</a>
28 if you prefer to watch a video.</p>
29 <p>Two key take-aways from this: First, about this being a
30 <strong>preference</strong>, not a system-enforced setting:</p>
31 <blockquote>
32 <p>I’m explicitly using the language “Dark Style Preference” for a
33 reason! As you’ll read further on, it’s important that this is
34 treated as a user “preference,” not an explicit “mode” or
35 strictly-enforced “setting.” It’s also not a “theme” in the sense
36 that it just swaps out some assets, but is a way for the OS to
37 support a user expressing a preference, and apps to respond to that
38 preference.</p>
39 </blockquote>
40 <p>Second, about the <strong>accessibility</strong> implications:</p>
41 <blockquote>
42 <p>Clearly there’s an accessibility and usability angle here. And as
43 with other accessibility efforts, it’s important to not relegate a
44 dark style preference to a buried “Universal Access” or
45 “Accessibility” feature, as that makes it less discoverable, less
46 tested, and less likely to be used by folks who could greatly
47 benefit, but don’t consider themselves “disabled.”</p>
48 </blockquote>
49 <h1>Libadwaita and the rest of the ecosystem</h1>
50 <p>Read the <a href="https://discourse.gnome.org/t/libadwaita-1-0-roadmap/7415">libadwaita
51 roadmap</a>;
52 it is very short, but links to very interesting issues on GitLab.</p>
53 <p>For example, this merge request is for an <a href="https://gitlab.gnome.org/GNOME/libadwaita/-/merge_requests/224">API to query the dark style
54 and high-contrast
55 preferences</a>.
56 It has links to pending work in other parts of the platform: libhandy,
57 gsettings schemas, portals so that containerized applications can
58 query those preferences.</p>
59 <p>As far as I understand it, applications that just use GTK3 or libhandy
60 can opt in to supporting the dark style preference — it is opt-in
61 because doing that unconditionally in GTK/libhandy right now would
62 break existing applications. If your app uses libadwaita, it is
63 assumed that you have opted into supporting that preference, since
64 libadwaita's widgets already make that assumption, and it is not
65 API-stable yet — so it can make that assumption from the beginning.</p>
66 <p>There is discussion of the accessibility implications in <a href="https://gitlab.gnome.org/Teams/Design/settings-mockups/-/issues/27#note_1257700">the design
67 mockups</a>.</p>
68 <h1>CSS parity across implementations</h1>
69 <p>In GNOME we have three implementations of CSS:</p>
70 <ul>
71 <li>
72 <p>librsvg uses servo's engine for CSS selector matching, and micro-parsers for CSS values based on servo's cssparser.</p>
73 </li>
74 <li>
75 <p>GTK has its own CSS parser and processor.</p>
76 </li>
77 <li>
78 <p>Gnome-shell uses an embedded version of libcroco for parsing, but it
79 does most of the selector matching and cascading with gnome-shell's
80 own Shell Toolkit code.</p>
81 </li>
82 </ul>
83 <p>None of those implementations supports <code>@media</code> queries or custom
84 properties with <code>var()</code>. That is, unlike in the web platform, GNOME
85 applications cannot have this in their CSS:</p>
86 <div class="highlight"><pre><span></span><code><span class="p">@</span><span class="k">media</span> <span class="o">(</span><span class="nt">prefers-color-scheme</span><span class="o">:</span> <span class="nt">dark</span><span class="o">)</span> <span class="p">{</span>
87 <span class="c">/* styles for dark style */</span>
88 <span class="p">}</span>
89
90 <span class="p">@</span><span class="k">media</span> <span class="o">(</span><span class="nt">prefers-color-scheme</span><span class="o">:</span> <span class="nt">light</span><span class="o">)</span> <span class="p">{</span>
91 <span class="c">/* styles for light style */</span>
92 <span class="p">}</span>
93 </code></pre></div>
94
95 <p>Or even declaring colors in a civilized fashion:</p>
96 <div class="highlight"><pre><span></span><code><span class="p">:</span><span class="nd">root</span> <span class="p">{</span>
97 <span class="nv">--main-bg-color</span><span class="p">:</span> <span class="kc">pink</span><span class="p">;</span>
98 <span class="p">}</span>
99
100 <span class="nt">some_widget</span> <span class="p">{</span>
101 <span class="k">background-color</span><span class="p">:</span> <span class="nf">var</span><span class="p">(</span><span class="nv">--main-bg-color</span><span class="p">);</span>
102 <span class="p">}</span>
103 </code></pre></div>
104
105 <p>Or combining the two:</p>
106 <div class="highlight"><pre><span></span><code><span class="p">@</span><span class="k">media</span> <span class="o">(</span><span class="nt">prefers-color-scheme</span><span class="o">:</span> <span class="nt">dark</span><span class="o">)</span> <span class="p">{</span>
107 <span class="p">:</span><span class="nd">root</span> <span class="p">{</span>
108 <span class="nv">--main-bg-color</span><span class="p">:</span> <span class="c">/* some nice dark background color */</span><span class="p">;</span>
109 <span class="nv">--main-fg-color</span><span class="p">:</span> <span class="c">/* a contrasty light foreground */</span><span class="p">;</span>
110 <span class="p">}</span>
111 <span class="p">}</span>
112
113 <span class="p">@</span><span class="k">media</span> <span class="o">(</span><span class="nt">prefers-color-scheme</span><span class="o">:</span> <span class="nt">light</span><span class="o">)</span> <span class="p">{</span>
114 <span class="p">:</span><span class="nd">root</span> <span class="p">{</span>
115 <span class="nv">--main-bg-color</span><span class="p">:</span> <span class="c">/* some nice light background color */</span><span class="p">;</span>
116 <span class="nv">--main-fg-color</span><span class="p">:</span> <span class="c">/* a contrasty dark foreground */</span><span class="p">;</span>
117 <span class="p">}</span>
118 <span class="p">}</span>
119
120 <span class="nt">some_widget</span> <span class="p">{</span>
121 <span class="k">background-color</span><span class="p">:</span> <span class="nf">var</span><span class="p">(</span><span class="nv">--main-bg-color</span><span class="p">);</span>
122 <span class="p">}</span>
123 </code></pre></div>
124
125 <p>Boom. I think this would remove some workarounds we have right now:</p>
126 <ul>
127 <li>
128 <p>Just like GTK, libadwaita generates four variants of the system's
129 stylesheet using scss (regular, dark, high-contrast,
130 high-contrast-dark). This would be obviated with <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/@media"><code>@media</code>
131 queries</a>
132 for <code>prefers-color-scheme</code>, <code>prefers-contrast</code>, <code>inverted-colors</code> as
133 in the web platform.</p>
134 </li>
135 <li>
136 <p>GTK has a custom <code>@define-color</code> keyword, but neither gnome-shell
137 nor librsvg support that. This would be obviated with <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/--*">CSS custom
138 properties</a> -
139 the <code>var()</code> mechanism. (I don't know if some "environmental" stuff
140 would be better done as
141 <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/env()"><code>env()</code></a>,
142 but none of the three implementations support that, either.)</p>
143 </li>
144 </ul>
145 <h1>Accent colors</h1>
146 <p>They are currently implemented with GTK's <code>@define-color</code>, which is
147 not ideal if the colors have to trickle down from GTK to SVG icons,
148 since librsvg doesn't do <code>@define-color</code> - it would rather have
149 <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/459"><code>var()</code>
150 instead</a>.</p>
151 <p>Of course, gnome-shell's libcroco doesn't do <code>@define-color</code> either.</p>
152 <p>Look for <code>@accent_color</code>, <code>@accent_bg_color</code>, <code>@warning_color</code>, etc. in the <a href="https://gitlab.gnome.org/GNOME/libadwaita/-/blob/main/src/stylesheet/_defaults.scss">default
153 stylesheet</a>,
154 or better yet, <strong>write documentation!</strong></p>
155 <p>The default style:</p>
156 <p><img alt="Default blue style" src="https://people.gnome.org/~federico/blog/images/adwaita-default.png"></p>
157 <p>Accent color set to orange (e.g. tweak it in GTK's CSS inspector):</p>
158 <p><img alt="Orange accents for widgets" src="https://people.gnome.org/~federico/blog/images/adwaita-accent-orange.png"></p>
159 <div class="highlight"><pre><span></span><code><span class="c">/* Standalone, e.g. the &quot;Page 1&quot; label */</span>
160 <span class="p">@</span><span class="k">define-color</span> <span class="nt">accent_color</span> <span class="p">@</span><span class="k">orange_5</span><span class="p">;</span>
161
162 <span class="c">/* background+text pair */</span>
163 <span class="p">@</span><span class="k">define-color</span> <span class="nt">accent_bg_color</span> <span class="p">@</span><span class="k">orange_4</span><span class="p">;</span>
164 <span class="p">@</span><span class="k">define-color</span> <span class="nt">accent_fg_color</span> <span class="nt">white</span><span class="p">;</span>
165 </code></pre></div>
166
167 <h1>Custom widgets</h1>
168 <p>Again, your app's custom stylesheet for its custom widgets can use the
169 colors defined through <code>@define-color</code> from the system's stylesheet.</p>
170 <h1>Recoloring styles</h1>
171 <p>You will be able to do this after it gets merged into the main branch,
172 e.g. recolor everything to sepia:</p>
173 <p><img alt="Adwaita recolored to sepia" src="https://people.gnome.org/~federico/blog/images/adwaita-recolored.png"></p>
174 <div class="highlight"><pre><span></span><code><span class="p">@</span><span class="k">define-color</span> <span class="nt">headerbar_bg_color</span> <span class="p">#</span><span class="nn">eedcbf</span><span class="p">;</span>
175 <span class="p">@</span><span class="k">define-color</span> <span class="nt">headerbar_fg_color</span> <span class="p">#</span><span class="nn">483a22</span><span class="p">;</span>
176
177 <span class="p">@</span><span class="k">define-color</span> <span class="nt">bg_color</span> <span class="p">#</span><span class="nn">f9f3e9</span><span class="p">;</span>
178 <span class="p">@</span><span class="k">define-color</span> <span class="nt">fg_color</span> <span class="p">#</span><span class="nn">483a22</span><span class="p">;</span>
179
180 <span class="p">@</span><span class="k">define-color</span> <span class="nt">dark_fill_color</span> <span class="nt">shade</span><span class="o">(</span><span class="p">#</span><span class="nn">f9f3e9</span><span class="o">,</span> <span class="p">.</span><span class="nc">95</span><span class="o">)</span><span class="p">;</span>
181
182 <span class="p">@</span><span class="k">define-color</span> <span class="nt">accent_bg_color</span> <span class="p">@</span><span class="k">orange_4</span><span class="p">;</span>
183 <span class="p">@</span><span class="k">define-color</span> <span class="nt">accent_color</span> <span class="p">@</span><span class="k">orange_5</span><span class="p">;</span>
184 </code></pre></div>
185
186 <p>Of course <code>shade()</code> is not web-platform CSS, either. We could keep
187 it, or redo it by implementing the <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/calc()"><code>calc()</code>
188 function</a> for
189 color values.</p>
190 <h1>Recoloring icons</h1>
191 <p>Currently GTK takes some defined colors and <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/539#note_669487">creates a chunk of CSS to
192 inject into SVG for
193 icons</a>.
194 This has <a href="https://gitlab.gnome.org/GNOME/gtk/-/issues/2314">some
195 problems</a>.</p>
196 <p>There is also some discussion about <a href="https://gitlab.gnome.org/GNOME/gtk/-/issues/1762">standardizing recolorable
197 icons</a> across
198 desktop environments.</p>
199 <h1>How you can help</h1>
200 <p>Implement support for <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Media_Queries/Using_media_queries"><code>@media</code>
201 queries</a>
202 in our three CSS implementations (librsvg, gnome-shell, GTK). Decide
203 how CSS media features like <code>prefers-color-scheme</code>,
204 <code>prefers-contrast</code>, <code>inverted-colors</code> should interact with GNOME's
205 themes and accessibility, and decide if we should use them for
206 familiarity with the web platform, or if we need media features with
207 different names.</p>
208 <p>Implement support for <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Using_CSS_custom_properties">CSS custom properties -
209 <code>var()</code></a>
210 in our three CSS implementations. Decide if we should replace the
211 current <code>@define-color</code> with that (note that <code>@define-color</code> is only
212 in GTK, but not in librsvg or gnome-shell).</p>
213 <p>See the <a href="https://discourse.gnome.org/t/libadwaita-1-0-roadmap/7415">libadwaita
214 roadmap</a>
215 and help out!</p>
216 <p>Port applications to use the proposed APIs for querying the dark style
217 preference. There are a bunch of hacky ways of doing it right now;
218 they need to be migrated to the new system.</p>
219 <p>Personally I would love help finishing the work to <a href="https://gitlab.gnome.org/GNOME/gnome-shell/-/merge_requests/1355">port gnome-shell's
220 styles to
221 Rust</a> -
222 this is part of unifying librsvg's and gnome-shell's CSS machinery.</p></content><category term="misc"></category><category term="gnome"></category></entry><entry><title>Bzip2's experimental repository is changing maintainership</title><link href="https://people.gnome.org/~federico/blog/bzip2-changing-maintainership.html" rel="alternate"></link><published>2021-06-03T19:21:04-05:00</published><updated>2021-06-03T19:21:04-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2021-06-03:/~federico/blog/bzip2-changing-maintainership.html</id><summary type="html"><p>Bzip2's stable repository is maintained <a href="https://sourceware.org/git/?p=bzip2.git;a=summary">at Sourceware</a> by
223 Mark Wielaard. In 2019 I started maintaining an <a href="https://gitlab.com/bzip2/bzip2/">experimental
224 repository in GitLab</a>, with the intention of updating
225 the build system and starting a Rust port of bzip2. Unfortunately I
226 have let this project slip by.</p>
227 <p>The new maintainer of the <a href="https://gitlab.com/bzip2/bzip2/">experimental repository …</a></p></summary><content type="html"><p>Bzip2's stable repository is maintained <a href="https://sourceware.org/git/?p=bzip2.git;a=summary">at Sourceware</a> by
228 Mark Wielaard. In 2019 I started maintaining an <a href="https://gitlab.com/bzip2/bzip2/">experimental
229 repository in GitLab</a>, with the intention of updating
230 the build system and starting a Rust port of bzip2. Unfortunately I
231 have let this project slip by.</p>
232 <p>The new maintainer of the <a href="https://gitlab.com/bzip2/bzip2/">experimental repository</a> for
233 Bzip2 is Micah Snyder. Thanks, Micah, for picking it up!</p></content><category term="misc"></category><category term="bzip2"></category><category term="rust"></category></entry><entry><title>Librsvg, Rust, and non-mainstream architectures</title><link href="https://people.gnome.org/~federico/blog/librsvg-rust-and-non-mainstream-architectures.html" rel="alternate"></link><published>2021-02-24T19:14:55-06:00</published><updated>2021-02-24T19:14:55-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2021-02-24:/~federico/blog/librsvg-rust-and-non-mainstream-architectures.html</id><summary type="html"><p>Almost five years ago <a href="https://people.gnome.org/~federico/news-2016-10.html#25">librsvg introduced Rust into its source
234 code</a>. Around the same time, Linux distributions
235 started shipping the first versions of Firefox that also required
236 Rust. I unashamedly wanted to ride that wave: distros would <em>have</em> to
237 integrate a new language in their build infrastructure, or they would …</p></summary><content type="html"><p>Almost five years ago <a href="https://people.gnome.org/~federico/news-2016-10.html#25">librsvg introduced Rust into its source
238 code</a>. Around the same time, Linux distributions
239 started shipping the first versions of Firefox that also required
240 Rust. I unashamedly wanted to ride that wave: distros would <em>have</em> to
241 integrate a new language in their build infrastructure, or they would
242 be left without Firefox. I was hoping that having a working Rust
243 toolchain would make it easier for the rustified librsvg to get into
244 distros.</p>
245 <p>Two years after that, <a href="https://lwn.net/Articles/771355/">someone from Debian complained</a>
246 that this made it hard or impossible to build librsvg (and all the
247 software that depends on it, which is A Lot) on all the architectures
248 that Debian builds on — specifically, on things like HP PA-RISC or
249 Alpha, which <a href="https://www.debian.org/ports/">even Debian marks as "discontinued" now</a>.</p>
250 <p>Recently there was a similar kerfuffle, this time from <a href="https://lwn.net/Articles/845535/">someone from
251 Gentoo</a>, specifically about how Python's cryptography
252 package now requires Rust. So, it doesn't build for platforms that
253 Rust/LLVM don't support, like hppa, alpha, and Itanium. It also
254 doesn't build for platforms for which there are no Rust packages from
255 Gentoo yet (mips, s390x, riscv among them).</p>
256 <h2>Memories of discontinued architectures</h2>
257 <p>Let me reminisce about a couple of discontinued architectures. If I'm
258 reading <a href="https://en.wikipedia.org/wiki/DEC_Alpha">Wikipedia</a> correctly, the DEC Alpha ceased to be
259 developed in 2001, and HP, who purchased Compaq, who purchased DEC,
260 stopped selling Alpha systems in 2007. Notably, Compaq phased out the
261 Alpha in favor of the Itanium, which stopped being developed in 2017.</p>
262 <p>I <em>used</em> an Alpha machine in 1997-1998, back at the University.
263 <a href="https://twitter.com/migueldeicaza/">Miguel</a> kindly let me program and learn from him at the Institute
264 where he worked, and the computer lab there got an Alpha box to let
265 the scientists run mathematical models on a machine with really fast
266 floating-point. This was a time when people actually regularly ssh'ed
267 into machines to run X11 applications remotely — in their case, I
268 think it was Matlab and Mathematica. Good times.</p>
269 <p>The Alpha had fast floating point, much faster than Intel x86 CPUs,
270 and I was delighted to do graphics work on it. That was the first
271 64-bit machine I used, and it let me learn how to fix code that only
272 assumed 32 bits. It had a really picky floating-point unit. Whereas
273 x86 would happily throw you a NaN if you used uninitialized memory as
274 floats, the Alpha would properly fault and crash the program. I fixed
275 so many bugs thanks to that!</p>
276 <p>I also have fond memories of the 32-bit SPARC
277 boxes at the University and their flat-screen fixed-frequency CRT
278 displays, but you know, I haven't <em>seen</em> one of those machines since
279 1998. Because I was doing graphics work, I used the single SPARC
280 machine in the computer lab at the Institute that had 24-bit graphics,
281 with a humongous 21" CRT display. PCs at the time still had 8-bit video
282 cards and shitty little monitors.</p>
283 <p>At about the same time that the Institute got its Alpha, it also got
284 one of the first 64-bit UltraSPARCs from Sun — a very expensive
285 machine definitely not targeted to hobbyists. I think it had two CPUs!
286 Multicore did not exist!</p>
287 <p>I think I saw a single Itanium machine in my life, probably around
288 2002-2005. The Ximian/Novell office in Mexico City got one, for QA
289 purposes — an incredibly loud and unstable machine. I don't think we
290 ever did any actual development on that box; it was a "can you
291 reproduce this bug there" kind of thing. I think Ximian/Novell had a
292 contract with HP to test the distro there, I don't remember.</p>
293 <h2>Unsupported architectures at the LLVM level</h2>
294 <p>Platforms like the Alpha and Itanium that Rust/LLVM don't support —
295 those platforms are dead in the water. The compiler cannot target
296 them, as Rust generates machine code via LLVM, and LLVM doesn't
297 support them.</p>
298 <p>I don't know why distributions maintained by volunteers give
299 themselves the responsibility to keep their software running on
300 platforms that have not been manufactured for years, and that were
301 never even hobbyist machines.</p>
302 <p>I read the other day, and now I regret not keeping the link, something
303 like this: don't assume that your hobby computing entitles you to free
304 labor on the part of compiler writers, software maintainers, and
305 distro volunteers. (If someone helps me find the source, I'll happily
306 link to it and quote it properly.)</p>
307 <h2>Non-tier-1 platforms and "$distro does not build Rust there yet"</h2>
308 <p>I think people are discovering these once again:</p>
309 <ul>
310 <li>
311 <p>Writing and supporting a compiler for a certain architecture takes Real Work.</p>
312 </li>
313 <li>
314 <p>Supporting a distro for a certain architecture takes Real Work.</p>
315 </li>
316 <li>
317 <p>Fixing software to work on a certain architecture takes Real Work.</p>
318 </li>
319 </ul>
320 <p>Rust divides its support for different platforms into <a href="https://doc.rust-lang.org/nightly/rustc/platform-support.html">tiers</a>, going
321 from tier 1, the most supported, to tier 3, the least supported. Or,
322 I should say, <em>taken care of</em>, which is a combination of people who
323 actually have the hardware in question, and whether the general CI and
324 build tooling is prepared to deal with them as effectively as it does
325 for tier 1 platforms.</p>
326 <p>In other words: there are more people capable of paying attention to, and
327 testing things on, x86_64 PCs than there are for
328 <code>sparc-unknown-linux-gnu</code>.</p>
329 <h2>Some anecdotes from Suse</h2>
330 <p>At Suse we actually support IBM's s390x big iron; those mainframes run
331 Suse Linux Enterprise Server. You have to pay a lot of money to get a
332 machine like that and support for it. It's a room-sized beast that
333 requires professional babysitting.</p>
334 <p>When librsvg and Firefox started getting rustified, there was of
335 course concern about getting Rust to work properly on the s390x.
336 I worked sporadically with the people who made the distro work there,
337 and who had to deal with building Rust and Firefox on it (librsvg
338 was a non-issue after getting Rust and Firefox to work).</p>
339 <p>I think all the LLVM work for the s390x was done at IBM. There were
340 probably a couple of miscompilations that affected Firefox; they got fixed.</p>
341 <p>One would expect bugs in software for IBM mainframes to be fixed by
342 IBM or its contractors, not by volunteers maintaining a distro in
343 their spare time.</p>
344 <p>Giving computing time on mainframes to volunteers in distros could seem
345 like a good samaritan move, or a trap to extract free labor from
346 unsuspecting people.</p>
347 <h3>Endianness bugs</h3>
348 <p>Firefox's problems on the s390x were more around big-endian bugs than
349 anything. You see, all the common architectures these days (x86_64
350 and arm64) are little-endian. However, s390x is <a href="https://en.wikipedia.org/wiki/Endianness">big-endian</a>,
351 which means that all multi-byte numbers in memory are stored backwards
352 from what most software expects.</p>
353 <p>It is not a problem to write software that assumes little-endian or
354 big-endian all the time, but it takes a little care to write software
355 that works on either.</p>
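<p>As a rough illustration of what "stored backwards" means, here is a minimal Python sketch. It is not code from Firefox or librsvg, and the value <code>0x11223344</code> is an arbitrary example. It lays out the same 32-bit number in both byte orders and shows what happens when the bytes are read back with the wrong assumption.</p>

```python
value = 0x11223344  # an arbitrary 32-bit number, chosen for illustration

little = value.to_bytes(4, "little")  # memory layout on x86_64 and arm64
big = value.to_bytes(4, "big")        # memory layout on s390x

print(little.hex())  # 44332211
print(big.hex())     # 11223344

# Reading big-endian bytes while assuming little-endian gives a different number.
misread = int.from_bytes(big, "little")
print(hex(misread))  # 0x44332211

# Naming the byte order explicitly round-trips correctly on any CPU.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```

<p>Code that always names the byte order it expects, as in the last two lines, behaves the same on either kind of CPU; code that silently assumes one layout is where the bugs come from.</p>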
356 <p>Most of the software that volunteers and paid people write assumes
357 little-endian CPUs, because that is likely what they are targeting.
358 It is a pain in the ass to encounter code that works incorrectly on
359 big-endian — a pain because <em>knowing where to look</em> for evidence of
360 bugs is tricky, and <em>fixing existing code</em> to work with either
361 endianness can be either very simple, or a major adventure in
362 refactoring and testing.</p>
363 <p>Two cases in point:</p>
364 <p><strong>Firefox.</strong> When Suse started dealing with Rust and Firefox in the
365 s390x, there were endianness bugs in the graphics code in Firefox that
366 deals with pixel formats. Whether pixels get stored in memory as
367 ARGB/ABGR/RGBA/etc. is a platform-specific thing, and is generally a
368 combination of the graphics hardware for that platform, plus the
369 actual CPU architecture. At that time, it looked like the C++ code in
370 Firefox that deals with pixels had been rewritten/refactored, and had
371 lost big-endian support along the way. I don't know the current
372 status (not a single big-endian CPU in my vicinity), but I haven't
373 seen related bugs come in the Suse bug tracker? Maybe it's fixed now?</p>
374 <p><strong>Librsvg</strong> had <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues?scope=all&amp;utf8=%E2%9C%93&amp;state=closed&amp;search=s390">two root causes of bugs for
375 big-endian</a>. One was in the old code for SVG
376 filter effects that was written in C; <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/195">it never supported big-endian</a>. The
377 initial port to Rust inherited the same bug (think of a line-by-line
378 port, although it wasn't exactly like that), but it got fixed when my
379 Summer of Code intern Ivan Molodetskikh refactored the code to have a
380 <code>Pixel</code> abstraction that works for little-endian and big-endian, and
381 wraps Cairo's funky requirements.</p>
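<p>To give an idea of what such an abstraction buys you, here is a hypothetical Python sketch. It is not librsvg's actual <code>Pixel</code> type, and the method names <code>from_argb32</code> and <code>to_argb32</code> are invented for this example. It relies on one documented fact: Cairo's ARGB32 format stores each pixel as a native-endian 32-bit word with alpha in the top 8 bits, then red, green, and blue.</p>

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Pixel:
    """Hypothetical channel view of one ARGB32 pixel (not librsvg's real type)."""

    a: int
    r: int
    g: int
    b: int

    @classmethod
    def from_argb32(cls, raw: bytes, byteorder: str = sys.byteorder) -> "Pixel":
        # Decode the 4 bytes as one 32-bit word in the given memory order...
        word = int.from_bytes(raw, byteorder)
        # ...then split the word from its most significant byte down: A, R, G, B.
        a, r, g, b = word.to_bytes(4, "big")
        return cls(a, r, g, b)

    def to_argb32(self, byteorder: str = sys.byteorder) -> bytes:
        # Build the 32-bit word with alpha on top, then store it in memory order.
        word = int.from_bytes(bytes([self.a, self.r, self.g, self.b]), "big")
        return word.to_bytes(4, byteorder)


opaque_orange = Pixel(a=0xFF, r=0xFF, g=0x78, b=0x00)

# The same pixel has two different memory layouts.
print(opaque_orange.to_argb32("little").hex())  # 0078ffff
print(opaque_orange.to_argb32("big").hex())     # ffff7800
```

<p>Code that goes through the channel names never indexes raw bytes, so it does not care which layout the machine uses. Code that grabs "byte 0" and calls it blue only works on little-endian.</p>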
382 <p>The other endian-related bug in librsvg was when <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/328">computing
383 masks</a>. Again, a little refactoring with that <code>Pixel</code>
384 abstraction fixed it.</p>
385 <p>I knew that the original C code for SVG filter effects didn't work on
386 big-endian. But even back then, at Suse we never got
387 reports of it producing incorrect results on the s390x... maybe people don't use
388 their mainframes to run <code>rsvg-convert</code>? I was hoping that the port to
389 Rust of that code would automatically fix that bug, and it kind of
390 happened that way through Ivan's careful work.</p>
391 <p>And the code for masks? There were two bugs reported with that same
392 root cause: <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/328">one from
393 Debian</a> as a
394 failure in librsvg's test suite (yay, it caught that bug!), and one
395 from <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/322">someone running an Apple PowerBook
396 G4</a> with a MATE
397 desktop and seeing incorrectly-rendered SVG icons.</p>
398 <p>And you know what? I am <strong>delighted</strong> to see people trying to keep
399 those lovely machines alive. A laptop that doesn't get warm enough to
400 burn your thighs, what a concept. A perfectly serviceable 32-bit
401 laptop with a maximum of about 1 GB of RAM and a 40 GB hard drive (it
402 didn't have HDMI!)... But you know, it's the same kind of delight I
403 feel when people talk about doing film photography on a
404 <a href="https://en.wikipedia.org/wiki/Rollei_35">Rollei 35</a>. A lot of
405 nostalgia for hardware of days past, and a lot of mixed feelings about
406 not throwing out working things and creating more trash.</p>
407 <p>As a graphics programmer I feel the responsibility to write code that
408 works on little-endian and big-endian, but you know, it's not exactly
409 an everyday concern anymore. The last big-endian machines I used on an
410 everyday basis were the SPARCs at the university, more than 20 years
411 ago.</p>
412 <h3>Who gets paid to fix this?</h3>
413 <p>That's the question. Suse got paid to support Firefox on the s390x; I
414 suppose IBM has an interest in fixing LLVM there; both actually have
415 people and hardware and money to that effect.</p>
416 <p>Within Suse, I am by default responsible for keeping librsvg
417 working for the s390x as well — it gets built as part of the distro,
418 after all. I have never gotten an endianness bug report from the Suse
419 side of things.</p>
420 <p>Which leads me to suspect that, probably similar to Debian and Gentoo,
421 we <em>build</em> a lot of software because it's in the build chain, but we
422 don't <em>run</em> it to its fullest extent. Do people run GNOME desktops on
423 s390x virtual machines? Maybe? Did they not notice endianness bugs
424 <strong>because they were not in the code path that most GNOME icons
425 actually use</strong>? Who knows!</p>
426 <p>I'm thankful to Simon from the Debian bug for pointing out
427 the failure in librsvg's test case for masks, and to Mingcong for
428 actually showing a screenshot of a MATE desktop running on a PPC
429 PowerBook. Those were useful things for them to do.</p>
430 <p>Also — they were kind about it. It was a pleasure to interact with them.</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category></entry><entry><title>Do not use librsvg 2.40.x</title><link href="https://people.gnome.org/~federico/blog/do-not-use-librsvg-2.40.x.html" rel="alternate"></link><published>2020-11-27T10:28:46-06:00</published><updated>2020-11-27T10:28:46-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-11-27:/~federico/blog/do-not-use-librsvg-2.40.x.html</id><summary type="html"><p>Please do not use librsvg 2.40.x; <strong>it cannot render recent Adwaita icon themes correctly</strong>.</p>
431 <p>The librsvg 2.40.x series is the last "C only" version of the library;
432 it was deprecated in 2017.</p>
433 <p>During the port to Rust, I rewrote the path parser to be
434 spec-compliant, and …</p></summary><content type="html"><p>Please do not use librsvg 2.40.x; <strong>it cannot render recent Adwaita icon themes correctly</strong>.</p>
435 <p>The librsvg 2.40.x series is the last "C only" version of the library;
436 it was deprecated in 2017.</p>
437 <p>During the port to Rust, I rewrote the path parser to be
438 spec-compliant, and fixed a few cases that the C version did not
439 handle. One of these cases is for compact Arc data.</p>
440 <p>The <a href="https://www.w3.org/TR/SVG11/paths.html#PathDataBNF">SVG path grammar</a> allows
441 one to remove whitespace between numbers if the next number starts
442 with a sign. For example, <code>23-45</code> gets parsed as two numbers <code>23
443 -45</code>.</p>
444 <p>In addition, the arguments of the Arc commands have two flags in the
445 middle of a bunch of numbers. The flags can be <code>0</code> or <code>1</code>, and there
446 may be no whitespace between the flags and the next number. For
447 example, <code>A1.98 1.98 0 0015 13.96</code> gets parsed as <code>A1.98 1.98 0 0 0 15
448 13.96</code> — note the two <code>0 0</code> flags before the <code>15</code>.</p>
449 <p>Librsvg 2.40.x does not parse this correctly.
450 Adwaita-icon-theme-3.36, and possibly earlier versions, uses minimized
451 SVG files with compressed whitespace, and will not render correctly
452 with the C-only version of librsvg.</p>
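<p>The compact Arc syntax described above can be sketched as a tiny parser. This is not librsvg's actual path parser; the function names are made up for illustration, and the number parser is deliberately simplified (no exponents). The key point is that the two flags are read as exactly one character each, so no whitespace is needed after them.</p>

```rust
/// Parses the seven arguments of one SVG arc segment:
/// rx ry x-axis-rotation large-arc-flag sweep-flag x y
/// (Hypothetical helper, not part of librsvg.)
fn parse_arc_args(input: &str) -> Option<[f64; 7]> {
    let bytes = input.as_bytes();
    let mut pos = 0;
    let mut out = [0.0; 7];
    for (i, slot) in out.iter_mut().enumerate() {
        skip_separators(bytes, &mut pos);
        // Arguments 3 and 4 are the flags; everything else is a number.
        *slot = if i == 3 || i == 4 {
            parse_flag(bytes, &mut pos)?
        } else {
            parse_number(bytes, &mut pos)?
        };
    }
    Some(out)
}

/// Skips optional whitespace and commas between arguments.
fn skip_separators(bytes: &[u8], pos: &mut usize) {
    while bytes
        .get(*pos)
        .map_or(false, |c| c.is_ascii_whitespace() || *c == b',')
    {
        *pos += 1;
    }
}

/// A flag is exactly one character, '0' or '1'.
fn parse_flag(bytes: &[u8], pos: &mut usize) -> Option<f64> {
    let flag = match bytes.get(*pos).copied()? {
        b'0' => 0.0,
        b'1' => 1.0,
        _ => return None,
    };
    *pos += 1;
    Some(flag)
}

/// Simplified number: optional sign, digits, optional fraction.
/// A sign always starts a new number, so "23-45" yields 23 and then -45.
fn parse_number(bytes: &[u8], pos: &mut usize) -> Option<f64> {
    let start = *pos;
    if matches!(bytes.get(*pos).copied(), Some(b'+') | Some(b'-')) {
        *pos += 1;
    }
    while bytes.get(*pos).map_or(false, |c| c.is_ascii_digit()) {
        *pos += 1;
    }
    if bytes.get(*pos).copied() == Some(b'.') {
        *pos += 1;
        while bytes.get(*pos).map_or(false, |c| c.is_ascii_digit()) {
            *pos += 1;
        }
    }
    std::str::from_utf8(&bytes[start..*pos]).ok()?.parse().ok()
}
```

<p>Calling <code>parse_arc_args("1.98 1.98 0 0015 13.96")</code> reads <code>0015</code> as flag, flag, and then the number 15, which matches the expansion shown earlier.</p>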
453 <p>This is <code>help-contents-symbolic.svg</code> rendered with librsvg 2.40.21:</p>
454 <p><img alt="icon rendered incorrectly" src="https://people.gnome.org/~federico/blog/images/help-contents-symbolic-2.40.21.png"></p>
455 <p>And this is <code>help-contents-symbolic.svg</code> rendered with librsvg 2.50.2:</p>
456 <p><img alt="icon rendered correctly" src="https://people.gnome.org/~federico/blog/images/help-contents-symbolic-2.50.2.png"></p>
457 <p>This is not the only icon with compact Arc commands; there are many
458 others that will also be mis-rendered in 2.40.x.</p>
459 <p>I don't know when Adwaita started using SVGs with compressed
460 whitespace; probably it didn't when librsvg 2.40.x was the latest
461 version, or everyone would have noticed mis-rendered icons.</p>
462 <p><strong>Background:</strong> Someone recently filed a
463 <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/654">bug</a> about
464 memory unsafety in librsvg 2.40.x's path parser, which mysteriously
465 enough only manifests itself on big-endian platforms. I wouldn't be
466 surprised if this had latent bugs on little-endian as well.</p>
467 <p>Please use at least librsvg 2.48.x; any earlier versions are not
468 supported. Generally I keep an eye on the last two stable release
469 sets (2.48.x and 2.50.x as of this writing), but only commit fixes to
470 the latest stable series (2.50.x currently).</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category></entry><entry><title>Librsvg's test suite is now in Rust</title><link href="https://people.gnome.org/~federico/blog/librsvg-test-suite-is-now-in-rust.html" rel="alternate"></link><published>2020-10-26T10:38:12-06:00</published><updated>2020-10-26T10:38:12-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-10-26:/~federico/blog/librsvg-test-suite-is-now-in-rust.html</id><summary type="html"><p>Some important changes are afoot in librsvg.</p>
471 <h2>Changes to continuous integration</h2>
472 <p>Some days ago, <a href="https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/398">Dunja Lalic rewrote the continuous integration
473 scripts</a> to be much faster. A complete pipeline used to take
474 about 90 minutes to run; now it takes about 15 minutes on average.</p>
475 <p><img alt="Graph with pipeline timings, which shrink drastically" src="https://people.gnome.org/~federico/blog/images/librsvg-fast-ci.png" title="Guess when the CI changed"></p>
476 <p>The <a href="https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/398">description of changes</a> is interesting …</p></summary><content type="html"><p>Some important changes are afoot in librsvg.</p>
477 <h2>Changes to continuous integration</h2>
478 <p>Some days ago, <a href="https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/398">Dunja Lalic rewrote the continuous integration
479 scripts</a> to be much faster. A complete pipeline used to take
480 about 90 minutes to run; now it takes about 15 minutes on average.</p>
481 <p><img alt="Graph with pipeline timings, which shrink drastically" src="https://people.gnome.org/~federico/blog/images/librsvg-fast-ci.png" title="Guess when the CI changed"></p>
482 <p>The <a href="https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/398">description of changes</a> is interesting. The idea is to make tests fail as fast as possible, close to the beginning of the pipeline. To speed up the whole pipeline, Dunja did the following:</p>
483 <ul>
484 <li>
485 <p>Move the <code>cargo check</code> stage to the beginning. This test means,
486 "does this even have a chance of compiling?".</p>
487 </li>
488 <li>
489 <p>Have the code style and formatting tests, <code>cargo clippy</code> and <code>cargo
490 fmt</code>, run in parallel with the unit tests. These lints can fail,
491 but they are easy to fix after one is finished modifying the main
492 code.</p>
493 </li>
494 <li>
495 <p>Run the unit tests and the smoke tests in debug mode, so they compile
496 quickly.</p>
497 </li>
498 <li>
499 <p>Run the complete integration test suite in release mode. This takes
500 longer to compile, but there are some slow tests that benefit a lot
501 from faster execution.</p>
502 </li>
503 <li>
504 <p>Move the release tests to the end, and only run them once a week
505 — or whenever, by hand. These take a good amount of time to run,
506 because they do a full <code>make distcheck</code> and autotools is slow. Even
507 then, now these tests take 30-40 minutes, instead of the 90 from
508 before.</p>
509 </li>
510 <li>
511 <p>Between each stage of the pipeline, don't cache what doesn't help
512 reduce compilation time. It seems that keeping around a big cache,
513 with the whole build <code>target</code>, between each pipeline stage can be
514 worse than not having one at all.</p>
515 </li>
516 </ul>
517 <p><img alt="Complete pipeline with all the stages" src="https://people.gnome.org/~federico/blog/images/librsvg-complete-pipeline.png"></p>
518 <h2>Test suite in Rust</h2>
519 <p>Between Sven Neumann, Dunja Lalic, and myself we have finally <a href="https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/408">ported
520 the test suite to Rust</a>: all of librsvg's tests are
521 now in Rust, except for the C API tests. We had to do a few things:</p>
522 <ul>
523 <li>
524 <p>Review the old tests and remove some obsolete ones.</p>
525 </li>
526 <li>
527 <p>Port each of the test modules to Rust. They are small, but each one
528 has special little things — test for crashes in the XML loading
529 code, test for crashes during rendering, test the library's security
530 limits.</p>
531 </li>
532 <li>
533 <p>Fix the small tests that come as part of the documentation.</p>
534 </li>
535 <li>
536 <p>Untangle the reference tests and port them to Rust.</p>
537 </li>
538 <li>
539 <p>Move little chunks of code around so the unit tests and integration
540 tests can share utilities to compare images, compute file paths for
541 test fixtures, etc.</p>
542 </li>
543 </ul>
544 <p>The most complicated thing to port was the reference tests. These are
545 the most important ones; each test loads an SVG document, renders it,
546 and compares the result to a reference PNG image. There are some
547 complications in the tests; they have to create a special
548 configuration for Fontconfig and Pango, so as to have reproducible
549 font rendering. The pango-rs bindings do not cover this part of
550 Pango, so we had to do some things by hand.</p>
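<p>The comparison step of a reference test can be sketched roughly like this. It assumes both images are already decoded into raw byte buffers of the same pixel format; the function names and the tolerance parameter are hypothetical, and librsvg's real test utilities may work differently.</p>

```rust
/// Largest per-channel difference between two pixel buffers,
/// or None if the buffers have different sizes.
/// (Hypothetical helper for illustration only.)
fn max_channel_diff(rendered: &[u8], reference: &[u8]) -> Option<u8> {
    if rendered.len() != reference.len() {
        return None;
    }
    let worst = rendered
        .iter()
        .zip(reference)
        .map(|(a, b)| if a > b { a - b } else { b - a })
        .max()
        .unwrap_or(0);
    Some(worst)
}

/// A rendering passes if no channel differs from the reference
/// by more than `tolerance`.
fn images_match(rendered: &[u8], reference: &[u8], tolerance: u8) -> bool {
    max_channel_diff(rendered, reference).map_or(false, |d| d <= tolerance)
}
```

<p>A small tolerance is one way to absorb tiny rounding differences between platforms; whether a real test suite wants that, or an exact match, is a design choice.</p>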
551 <p>Anyway, <a href="https://gitlab.gnome.org/GNOME/librsvg/-/tree/master/tests/src">the tests are now in Rust</a>. One nice thing is that
552 now the tests run automatically in parallel, across all CPU cores, so
553 we save on total testing time.</p>
554 <h2>What's next: cargo-c and publish to crates.io</h2>
555 <p>We want to be able to <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/635">publish librsvg in crates.io</a> as a
556 normal crate; this implies being able to compile, test, and publish
557 entirely from Cargo. The compilation and testing part is done.</p>
558 <p>Now, we have to reorganize the code so it can be published to
559 crates.io. Librsvg comes in three parts, <code>rsvg_internals</code> with the
560 implementation of the library, <code>librsvg</code> with the traditional C API,
561 and <code>librsvg_crate</code> with the Rust API. However, to publish the Rust
562 API to crates.io, it would be more convenient to have a single crate
563 instead of one with the internals and one with the API.</p>
564 <p>The next step is thus to reorganize the code:</p>
565 <ul>
566 <li>
567 <p>Make it possible to implement the C API as a compile-time option on
568 top of the normal Rust code. We want to <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/552">use cargo-c</a> to
569 compile the traditional shared library <code>librsvg.so</code>, instead of
570 depending on C tools for compiling and linking.</p>
571 </li>
572 <li>
573 <p>Combine <code>rsvg_internals</code> and <code>librsvg_crate</code> in a single crate, to
574 publish them together. Crates.io has a 10 MB limit per crate; now
575 that the test suite lives in a separate <code>tests</code> crate, this
576 shouldn't be a problem.</p>
577 </li>
578 <li>
579 <p>I would like to <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/597">polish the public error types</a> before
580 publishing the Rust API; right now they expose some implementation
581 details that are of no interest to callers of the library.</p>
582 </li>
583 </ul>
584 <h2>What remains to be ported to Rust?</h2>
585 <p>Only two things, which amount to less than 900 lines of C code:</p>
586 <ul>
587 <li>
588 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/534">rsvg-convert</a> - the command-line program that everyone uses to
589 convert SVG to PNG and other formats. Fortunately, Sven Neumann
590 wrote some <a href="https://gitlab.gnome.org/GNOME/librsvg/-/blob/master/tests/src/cmdline/rsvg_convert.rs">fantastic tests</a> for rsvg-convert,
591 as it is like an API that we need to keep stable: if we change the
592 command-line options or the program's behavior, we would break
593 everyone's scripts.</p>
594 </li>
595 <li>
596 <p>The gdk-pixbuf module for loading SVG. Alberto Ruiz <a href="https://gitlab.gnome.org/GNOME/librsvg/-/tree/wip/aruiz/rust-pixbuf-loader">has started
597 porting it to Rust</a>. The generic part of this code
598 could later serve to wrap other Rust image codecs and plug them to
599 gdk-pixbuf.</p>
600 </li>
601 </ul></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category></entry><entry><title>Librsvg is accepting interns for Outreachy's December 2020 round</title><link href="https://people.gnome.org/~federico/blog/librsvg-accepting-outreachy-interns-2020.html" rel="alternate"></link><published>2020-10-05T09:52:27-05:00</published><updated>2020-10-05T09:52:27-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-10-05:/~federico/blog/librsvg-accepting-outreachy-interns-2020.html</id><summary type="html"><p>There are two projects in librsvg available for <a href="https://www.outreachy.org">Outreachy</a> applicants
602 in the December 2020 / March 2021 round:</p>
603 <ul>
604 <li>
605 <p><strong>Revamp the text engine</strong> - Do you know about international text
606 layout? Can you read a right-to-left language, or do you write in a
607 language that requires complex shaping? Would you like to implement …</p></li></ul></summary><content type="html"><p>There are two projects in librsvg available for <a href="https://www.outreachy.org">Outreachy</a> applicants
608 in the December 2020 / March 2021 round:</p>
609 <ul>
610 <li>
611 <p><strong>Revamp the text engine</strong> - Do you know about international text
612 layout? Can you read a right-to-left language, or do you write in a
613 language that requires complex shaping? Would you like to implement
614 the <a href="https://www.w3.org/TR/SVG2/text.html">SVG 2 text specification</a> in a <a href="https://gitlab.gnome.org/GNOME/librsvg/-/blob/master/rsvg_internals/src/text.rs">pleasant Rust code
615 base</a>? This project requires someone who can write Rust
616 comfortably; it will require reading and refactoring some
617 existing code. You don't need to be an expert in exotic lifetimes
618 and trait bounds and such; the code doesn't use them.</p>
619 </li>
620 <li>
621 <p><strong>Implement SVG2/CSS3 features</strong> - Are you excited by all the <a href="https://www.w3.org/TR/SVG2/changes.html">SVG2
622 features</a> in Inkscape, and would like to add support for them
623 in librsvg? Would you like to do small changes to many parts of the
624 code to implement small features, one at a time? Do you like
625 test-driven development? This project requires someone who can
626 write Rust code at a medium level; you'll learn a lot by
627 cutting&amp;pasting from existing code and refactoring things to
628 implement SVG2 features.</p>
629 </li>
630 </ul>
631 <p><strong>Important:</strong> Outreachy's December 2020 / March 2021 round is available <a href="https://www.outreachy.org/docs/applicant/#eligibility">only
632 for students in the Southern hemisphere</a>. People in the
633 Northern hemisphere can wait until the 2021 mid-year round.</p>
634 <p>You can see <a href="https://www.outreachy.org/apply/project-selection/#gnome">GNOME's projects in Outreachy</a> for this round.
635 <strong>The deadline for initial contributions and project applications is
636 October 31, 2020 at 16:00 UTC.</strong></p></content><category term="misc"></category><category term="mentoring"></category><category term="librsvg"></category></entry><entry><title>"Rust does not have a stable ABI"</title><link href="https://people.gnome.org/~federico/blog/rust-stable-abi.html" rel="alternate"></link><published>2020-08-12T22:01:44-05:00</published><updated>2020-08-12T22:01:44-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-08-12:/~federico/blog/rust-stable-abi.html</id><summary type="html"><p>I've seen GNOME people (often, people who have been working for a long
637 time on C libraries) express concerns along the following lines:</p>
638 <ol>
639 <li>Compiled Rust code doesn't have a stable ABI (application binary interface).</li>
640 <li>So, we can't have shared libraries in the traditional fashion of
641 Linux distributions.</li>
642 <li>Also Rust bundles …</li></ol></summary><content type="html"><p>I've seen GNOME people (often, people who have been working for a long
643 time on C libraries) express concerns along the following lines:</p>
644 <ol>
645 <li>Compiled Rust code doesn't have a stable ABI (application binary interface).</li>
646 <li>So, we can't have shared libraries in the traditional fashion of
647 Linux distributions.</li>
648 <li>Also Rust bundles its entire standard library with every binary it compiles, which makes Rust-built libraries huge.</li>
649 </ol>
650 <p>These are extremely valid concerns to be addressed by people like
651 myself who propose that chunks of infrastructural libraries
652 should be done in Rust.</p>
653 <p>So, let's begin.</p>
654 <p>The first part of this article is a super-quick introduction to shared
655 libraries and how Linux distributions use them. If you already know
656 those things, feel free to skip to the "<a href="#rust_does_not_have_a_stable_abi">Rust does not have a stable
657 ABI</a>" section.</p>
658 <h2>How do distributions use shared libraries?</h2>
659 <p>If several programs run at the same time and use the same shared library
660 (say, <code>libgtk-3.so</code>), the operating system can load a single copy of
661 the library in memory and share the read-only parts of the code/data
662 through the magic of virtual memory.</p>
663 <p><em>In theory</em>, if a library gets a bugfix but does not change its
664 interface, one can just recompile the library, stick the new <code>.so</code> in
665 <code>/usr/lib</code> or whatever, and be done with it. Programs that depend on
666 the library do not need to be recompiled.</p>
667 <p>If libraries limit their public interface to a plain C ABI
668 (application binary interface), they are relatively easy to consume
669 from other programming languages. Those languages don't have to deal
670 with name mangling of C++ symbols, exception handlers, constructors,
671 and all that complexity. Pretty much every language has some form of
672 C FFI (foreign function interface), which roughly means "call C
673 functions without too much trouble".</p>
674 <p>For the purposes of a library, what's an
675 <a href="https://en.wikipedia.org/wiki/Application_binary_interface">ABI</a>?
676 Wikipedia says, "An ABI defines how data structures or computational
677 routines are accessed in machine code [...] A common aspect of an
678 ABI is the calling convention", which means that to call a function in
679 machine code you need to frob the call and stack pointers, pass some
680 function arguments in registers or push some others to the stack, etc.
681 Really low-level stuff. Each machine architecture or operating system
682 usually defines a C standard ABI.</p>
683 <p>For libraries, we commonly understand an ABI to mean the machine-code
684 implications of their programming interface. Which functions are
685 available as public symbols in the <code>.so</code> file? To which numeric
686 values do C enum values correspond, so that they can be passed to
687 those functions? What is the exact order and type of arguments that
688 the functions take? What are the struct sizes, and the order and
689 types and padding of the fields that those functions take? Does one
690 pass arguments in CPU registers or on the stack? Does the caller or
691 the callee clean up the stack after a function call?</p>
692 <h2>Bug fixes and security fixes</h2>
693 <p>Linux distributions generally try <em>really hard</em> to have a single
694 version of each shared library installed in the system: a single
695 <code>libjpeg.so</code>, a single <code>libpng.so</code>, a single <code>libc.so</code>, etc.</p>
696 <p>This is helpful when there needs to be an update to fix a bug,
697 security-related or not: users can just download the updated package
698 for the library, which when installed will just stick in a new <code>.so</code>
699 in the right place, and the calling software won't need to be updated.</p>
700 <p>This is possible only if the bug <em>really only changes the internal
701 code</em> without changing behavior or interface. If a bug fix requires
702 part of the public API or ABI to change, then you are screwed; all
703 calling software needs to be recompiled. "Irresponsible" library
704 authors either learn really fast when distros complain loudly about
705 this sort of change, or they don't learn and get forever marked by
706 distros as "that irresponsible library" which always requires special
707 handling in order not to break other software.</p>
708 <p>Sidenote: sometimes it's more complicated. Poppler (the PDF
709 rendering library) ships at least two stable APIs, one Glib-based in
710 C, and one Qt-based in C++. However, some software like texlive uses
711 Poppler's internals library directly, which of course does not have a
712 stable API, and thus texlive breaks frequently as Poppler evolves.
713 Someone should extend the public, stable API so that texlive doesn't
714 have to use the library's internals!</p>
715 <h2>Bundled libraries</h2>
716 <p>Sometimes it is not irresponsible authors of libraries, but rather
717 that people who use the libraries find out that over time the behavior
718 of the library changes subtly, maybe without breaking the API or ABI,
719 and they are better off bundling a specific version of the library
720 with their software. That version is what they test their software
721 against, and they try to learn its quirks.</p>
722 <p>Distros inevitably complain about this, and either patch the calling
723 software by hand to force it to use the system's shared library, or
724 succeed in getting patches accepted by the software so that they have
725 a <code>--use-system-libjpeg</code> option or similar.</p>
726 <p>This doesn't work very well if the bundled version of the library has
727 extra patches that are not in a distro's usual patches. Or
728 vice-versa; it may actually work better to use the distro's version of
729 the library, if it has extra fixes that the bundled library doesn't.
730 Who knows! It's a case-by-case situation.</p>
731 <h2 id="rust_does_not_have_a_stable_abi">Rust does not have a stable ABI</h2>
732 <p>By default indeed it doesn't, because the compiler team wants to have
733 the freedom to change the data layout and Rust-to-Rust calling
734 conventions, often for performance reasons, at any time. For example,
735 it is not guaranteed that struct fields will be laid out in memory in
736 the same order as they are written in the code:</p>
737 <div class="highlight"><pre><span></span><code><span class="k">struct</span> <span class="nc">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
738 <span class="w"> </span><span class="n">bar</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
739 <span class="w"> </span><span class="n">baz</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"></span>
740 <span class="w"> </span><span class="n">beep</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
741 <span class="w"> </span><span class="n">qux</span>: <span class="kt">i32</span><span class="p">,</span><span class="w"></span>
742 <span class="p">}</span><span class="w"></span>
743 </code></pre></div>
744
745 <p>The compiler is free to rearrange the struct fields in memory as it
746 sees fit. Maybe it decides to put the two <code>bool</code> fields next to each
747 other to save on inter-field padding due to alignment requirements;
748 maybe it does static analysis or profile-guided optimizations and
749 picks an optimal ordering.</p>
750 <p>But we can override this! Let's look at data layout first, and then
751 calling conventions.</p>
752 <h3>Data layout for C versus Rust</h3>
753 <p>The following is the same struct as above, but with an extra <code>#[repr(C)]</code> attribute:</p>
754 <div class="highlight"><pre><span></span><code><span class="cp">#[repr(C)]</span><span class="w"></span>
755 <span class="k">struct</span> <span class="nc">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
756 <span class="w"> </span><span class="n">bar</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
757 <span class="w"> </span><span class="n">baz</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"></span>
758 <span class="w"> </span><span class="n">beep</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
759 <span class="w"> </span><span class="n">qux</span>: <span class="kt">i32</span><span class="p">,</span><span class="w"></span>
760 <span class="p">}</span><span class="w"></span>
761 </code></pre></div>
762
763 <p>With that attribute, the struct will be laid out just as this C struct:</p>
764 <div class="highlight"><pre><span></span><code><span class="cp">#include</span> <span class="cpf">&lt;stdbool.h&gt;</span><span class="cp"></span>
765 <span class="cp">#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp"></span>
766
767 <span class="k">struct</span> <span class="nc">Foo</span> <span class="p">{</span>
768 <span class="kt">bool</span> <span class="n">bar</span><span class="p">;</span>
769 <span class="kt">double</span> <span class="n">baz</span><span class="p">;</span>
770 <span class="kt">bool</span> <span class="n">beep</span><span class="p">;</span>
771 <span class="kt">int32_t</span> <span class="n">qux</span><span class="p">;</span>
772 <span class="p">};</span>
773 </code></pre></div>
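<p>One way to see the <code>#[repr(C)]</code> layout for yourself is to measure the field offsets at run time. The following is a minimal sketch; <code>field_offsets</code> is a made-up helper, and the exact numbers depend on the target's alignment rules.</p>

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct Foo {
    bar: bool,
    baz: f64,
    beep: bool,
    qux: i32,
}

/// Byte offset of each field from the start of a Foo, in declaration order.
/// (Hypothetical helper; it just subtracts addresses.)
fn field_offsets() -> [usize; 4] {
    let foo = Foo { bar: false, baz: 0.0, beep: false, qux: 0 };
    let base = &foo as *const Foo as usize;
    [
        &foo.bar as *const bool as usize - base,
        &foo.baz as *const f64 as usize - base,
        &foo.beep as *const bool as usize - base,
        &foo.qux as *const i32 as usize - base,
    ]
}

/// Human-readable summary of the layout on the current target.
fn describe_layout() -> String {
    format!(
        "offsets {:?}, size {}, alignment {}",
        field_offsets(),
        size_of::<Foo>(),
        align_of::<Foo>()
    )
}
```

<p>With <code>#[repr(C)]</code> the offsets are guaranteed to increase in declaration order. On a typical 64-bit Linux target one would expect offsets 0, 8, 16, 20 and a total size of 24 bytes, with padding after each <code>bool</code>.</p>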
774
775 <p>(Aside: it is unfortunate that <a href="https://people.gnome.org/~federico/news-2017-04.html#gboolean-is-not-rust-bool"><code>gboolean</code> is not <code>bool</code></a>,
776 but that's because <code>gboolean</code> predates C99, and clearly standards from
777 20 years ago are <em>too new</em> to use. (Aside aside: since I wrote that
778 other post, Rust's repr(C) for bool is actually defined as C99's bool;
779 it's no longer undefined.))</p>
780 <p>Even Rust's data-carrying enums can be laid out in a manner friendly
781 to C and C++:</p>
782 <div class="highlight"><pre><span></span><code><span class="cp">#[repr(C, u8)]</span><span class="w"></span>
783 <span class="k">enum</span> <span class="nc">MyEnum</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
784 <span class="w"> </span><span class="n">A</span><span class="p">(</span><span class="kt">u32</span><span class="p">),</span><span class="w"></span>
785 <span class="w"> </span><span class="n">B</span><span class="p">(</span><span class="kt">f32</span><span class="p">,</span><span class="w"> </span><span class="kt">bool</span><span class="p">),</span><span class="w"></span>
786 <span class="p">}</span><span class="w"></span>
787 </code></pre></div>
788
789 <p>This means, use C layout, and a <code>u8</code> for the enum's discriminant. It
790 will be laid out like this:</p>
791 <div class="highlight"><pre><span></span><code><span class="cp">#include</span> <span class="cpf">&lt;stdbool.h&gt;</span><span class="cp"></span>
792 <span class="cp">#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp"></span>
793
794 <span class="k">enum</span> <span class="n">MyEnumTag</span> <span class="p">{</span>
795 <span class="n">A</span><span class="p">,</span>
796 <span class="n">B</span>
797 <span class="p">};</span>
798
799 <span class="k">typedef</span> <span class="kt">uint32_t</span> <span class="n">MyEnumPayloadA</span><span class="p">;</span>
800
801 <span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
802 <span class="kt">float</span> <span class="n">x</span><span class="p">;</span>
803 <span class="kt">bool</span> <span class="n">y</span><span class="p">;</span>
804 <span class="p">}</span> <span class="n">MyEnumPayloadB</span><span class="p">;</span>
805
806 <span class="k">typedef</span> <span class="k">union</span> <span class="p">{</span>
807 <span class="n">MyEnumPayloadA</span> <span class="n">a</span><span class="p">;</span>
808 <span class="n">MyEnumPayloadB</span> <span class="n">b</span><span class="p">;</span>
809 <span class="p">}</span> <span class="n">MyEnumPayload</span><span class="p">;</span>
810
811 <span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
812 <span class="kt">uint8_t</span> <span class="n">tag</span><span class="p">;</span>
813 <span class="n">MyEnumPayload</span> <span class="n">payload</span><span class="p">;</span>
814 <span class="p">}</span> <span class="n">MyEnum</span><span class="p">;</span>
815 </code></pre></div>
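<p>Because the tag sits at the start of the value, code on the other side of the FFI boundary can read it as a plain byte. Here is a small sketch that does the same thing from Rust; <code>tag_of</code> is a hypothetical helper, not something librsvg provides.</p>

```rust
#[repr(C, u8)]
enum MyEnum {
    A(u32),
    B(f32, bool),
}

/// Reads the discriminant byte that repr(C, u8) places at offset 0.
fn tag_of(value: &MyEnum) -> u8 {
    // SAFETY: with repr(C, u8) the value is laid out as a struct whose
    // first member is a u8 tag, so reading one byte at its address is valid.
    unsafe { *(value as *const MyEnum as *const u8) }
}
```

<p>Variants are numbered in declaration order starting from zero, so <code>A</code> has tag 0 and <code>B</code> has tag 1, just like the C <code>MyEnumTag</code> enum above.</p>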
816
817 <p>The gory details of data layout are in the <a href="https://doc.rust-lang.org/nomicon/other-reprs.html">Alternative Representations section of the
818 Rustonomicon</a> and
819 the <a href="https://rust-lang.github.io/unsafe-code-guidelines/introduction.html">Unsafe Code
820 Guidelines</a>.</p>
821 <h3>Calling conventions</h3>
822 <p>An ABI's calling conventions detail things like how to call functions
823 in machine code, and how to lay out function arguments in registers or
824 the stack. <a href="https://en.wikipedia.org/wiki/X86_calling_conventions">The Wikipedia page on X86 calling
825 conventions</a>
826 has a good cheat-sheet, useful when you are looking at assembly code
827 and registers in a low-level debugger.</p>
828 <p>I've already written about how it is possible to write Rust code to
829 export functions callable from C; one uses the <code>extern "C"</code> in the
830 function definition and a <code>#[no_mangle]</code> attribute to keep the symbol
831 name pristine. This is how librsvg is able to have the following:</p>
832 <div class="highlight"><pre><span></span><code><span class="cp">#[no_mangle]</span><span class="w"></span>
833 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_handle_new_from_file</span><span class="p">(</span><span class="w"></span>
834 <span class="w"> </span><span class="n">filename</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">libc</span>::<span class="n">c_char</span><span class="p">,</span><span class="w"></span>
835 <span class="w"> </span><span class="n">error</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">glib_sys</span>::<span class="n">GError</span><span class="p">,</span><span class="w"></span>
836 <span class="p">)</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">RsvgHandle</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
837 <span class="w"> </span><span class="c1">// ...</span>
838 <span class="p">}</span><span class="w"></span>
839 </code></pre></div>
840
841 <p>Which compiles to what a C compiler would produce for this:</p>
842 <div class="highlight"><pre><span></span><code><span class="n">RsvgHandle</span> <span class="o">*</span><span class="nf">rsvg_handle_new_from_file</span> <span class="p">(</span><span class="k">const</span> <span class="n">gchar</span> <span class="o">*</span><span class="n">filename</span><span class="p">,</span> <span class="n">GError</span> <span class="o">**</span><span class="n">error</span><span class="p">);</span>
843 </code></pre></div>
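<p>For a self-contained taste of the same idea, here is a sketch of a function with the C calling convention that takes a C string. The function names are invented for this example. It leaves out <code>#[no_mangle]</code> because nothing links against it by symbol name here; a real exported function would keep that attribute as in the librsvg code above.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Counts the bytes in a NUL-terminated C string; returns 0 for NULL.
/// (Hypothetical example function, not part of any real library.)
pub unsafe extern "C" fn example_name_length(name: *const c_char) -> usize {
    if name.is_null() {
        return 0;
    }
    // SAFETY: the caller promises that a non-NULL `name` points to a valid
    // NUL-terminated string, as a C API contract would state.
    unsafe { CStr::from_ptr(name) }.to_bytes().len()
}

/// Calls the function through a C-ABI function pointer, which is
/// roughly what a C caller ends up doing.
pub fn call_from_rust(text: &str) -> usize {
    let c_text = CString::new(text).expect("text must not contain NUL bytes");
    let f: unsafe extern "C" fn(*const c_char) -> usize = example_name_length;
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    unsafe { f(c_text.as_ptr()) }
}
```

<p>Note how the NULL check and the string conversion happen right at the boundary, so the rest of the Rust code can work with safe types.</p>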
844
845 <p>(Aside: librsvg <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/416">still uses an intermediate C library full of
846 stubs</a> that just
847 call the Rust-exported functions, but there is now <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/552">tooling to produce a .so
848 directly from
849 Rust</a> which I
850 just haven't had time to investigate. Help is appreciated!)</p>
851 <h3>Summary of ABI so far</h3>
852 <p>It is <em>one's decision</em> to export a stable C ABI from a Rust library.
853 There is some awkwardness in how types are laid out in C, because the
854 Rust type system is richer, but things can be made to work well with a
855 little thought. Certainly no more thought than the burden of
856 designing and maintaining a stable API/ABI in plain C.</p>
857 <p>I'll fold the second concern into here — "we can't have shared
858 libraries in traditional distro fashion". Yes, we can, API/ABI-wise,
859 but read on.</p>
860 <h2>Rust bundles its entire standard library with Rust-built .so's</h2>
861 <p>I.e. it statically links all the Rust dependencies. This produces a
862 large .so:</p>
863 <ul>
864 <li>librsvg-2.so (version 2.40.21, C only) - 1408840 bytes</li>
865 <li>librsvg-2.so (version 2.49.3, Rust only) - 9899120 bytes</li>
866 </ul>
867 <p>Holy crap! What's all that?</p>
868 <p>(And I'm cheating: this is both with link-time optimization turned on,
869 and by running <code>strip(1)</code> on the .so. If you just <code>autogen.sh &amp;&amp; make</code>
870 it will be bigger.)</p>
871 <p>This has Rust's standard library statically linked (or at least the
872 bits of it that librsvg actually uses), plus all the Rust dependencies
873 (cssparser, selectors, nalgebra, glib-rs, cairo-rs, locale_config,
874 rayon, xml5ever, and an assload of crates). I could explain why each
875 one is needed:</p>
876 <ul>
877 <li>cssparser - librsvg needs to parse CSS.</li>
878 <li>selectors - librsvg needs to run the CSS selector matching
879 algorithm.</li>
880 <li>nalgebra - the code for SVG filter effects uses vectors and
881 matrices.</li>
882 <li>glib-rs, cairo-rs - draw to Cairo and export GObject types.</li>
883 <li>locale_config - so that localized SVG files can work.</li>
884 <li>rayon - so filters can use all your CPU cores instead of processing
885 one pixel at a time.</li>
886 <li>Etcetera. SVG is big and requires a lot of helper code!</li>
887 </ul>
888 <p>Is this a problem?</p>
889 <p>Or more exactly, why does this happen, and why do people perceive it
890 as a problem?</p>
891 <h3>Stable APIs/ABIs and distros</h3>
892 <p>Many Linux distributions have worked <em>really hard</em> to ensure that
893 there is a single copy of "system libraries" in an installation.
894 There is Just One Copy of <code>/usr/lib/libc.so</code>, <code>/usr/lib/libjpeg.so</code>,
895 etc., and packages are compiled with special options to tell them to
896 really use the system libraries instead of their bundled versions, or
897 patched to do so if they don't provide build-time options for that.</p>
898 <p>In a way, this works well for distros:</p>
899 <ul>
900 <li>
901 <p>A bug in a library can be fixed in a single place, and all
902 applications that use it get the fix automatically.</p>
903 </li>
904 <li>
905 <p>A security bug can be patched in a single place, and in theory
906 applications don't need to be audited further.</p>
907 </li>
908 </ul>
909 <p>If you maintain a library that is shipped in Linux distros, and you
910 break the ABI, you'll get complaints from distros very quickly.</p>
911 <p>This is good because it creates responsible maintainers for libraries
912 that can be depended on. It's how Inkscape/GIMP can have a stable
913 toolkit to be written in.</p>
914 <p>This is bad because it encourages stagnation in the long term. It's
915 how we get a horrible, unsafe, error-prone API in libjpeg that can
916 never ever be improved because it would require changes in tons of
917 software; it's why <code>gboolean</code> is still a 32-bit <code>int</code> after
918 twenty-something years, even though everything else close to C has
919 decided that booleans are 1 byte. It's how Inkscape/GIMP take many
920 years to move from GTK2 to GTK3 (okay, that's lack of paid developers
921 to do the grunt work, but it is enabled by having forever-stable APIs).</p>
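<p>To make the <code>gboolean</code> example concrete, here is a minimal Rust sketch of the size mismatch. The alias <code>GBoolean</code> and the helper names are made up for illustration; the alias mirrors glib's <code>typedef gint gboolean;</code>, and the four-byte figure assumes a platform where a C <code>int</code> is 32 bits wide.</p>

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// Hypothetical alias mirroring glib's `typedef gint gboolean;`,
/// where `gint` is a plain C `int`.
pub type GBoolean = c_int;

/// Convert a C-style boolean: zero is false, anything else is true.
pub fn from_gboolean(value: GBoolean) -> bool {
    value != 0
}

/// Convert a Rust `bool` to the wider C representation (0 or 1).
pub fn to_gboolean(value: bool) -> GBoolean {
    value as GBoolean
}

/// Returns (bytes in a Rust `bool`, bytes in a `gboolean`).
pub fn boolean_sizes() -> (usize, usize) {
    (size_of::<bool>(), size_of::<GBoolean>())
}
```

<p>On a typical Linux machine <code>boolean_sizes()</code> reports one byte against four, and every binding has to convert at the boundary because the wide type is frozen into the ABI.</p>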
922 <p>However, a long-term stable API/ABI has a <strong>lot of value</strong>. It is why
923 the Windows API is the crown jewels; it is why people can rely on glib
924 and glibc to not break their code for many years and take them for granted.</p>
925 <h3>But we only have a single stable ABI anyway</h3>
926 <p>And that is the C ABI. Even C++ libraries have trouble with this, and
927 people sometimes write the internals of a library in C++ for
928 convenience, but export a stable C API/ABI from it.</p>
929 <p>High level languages like Python have <em>real trouble</em> calling C++ code
930 precisely because of ABI issues.</p>
931 <h3>Actually, in GNOME we have gone further than that</h3>
932 <p>In GNOME we have constructed a sweet little universe where <a href="https://people.gnome.org/~federico/blog/magic-of-gobject-introspection.html">GObject
933 Introspection</a> is
934 basically a C ABI with a ton of machine-generated annotations to make
935 it friendly to language bindings.</p>
936 <p>Still, we rely on a C ABI underneath. See <a href="https://twitter.com/federicomena/status/1286447929880801280">this exploratory twitter
937 thread on advancing the C ABI from Rust</a> for
938 lots of food for thought.</p>
939 <h3>Single copies of libraries with a C ABI</h3>
940 <p>Okay, let's go back to this. What price do we pay for single copies
941 of libraries that, by necessity, must export a C ABI?</p>
942 <ul>
943 <li>
944 <p>Code that can be conveniently called from C, maybe from C++, and
945 moderately to very inconveniently from ANYTHING ELSE. With most new
946 application code being written definitely not in C, maybe we should
947 reconsider our priorities here.</p>
948 </li>
949 <li>
950 <p>No language facilities like generics or field visibility, which are
951 not even "modern language" features. Even C++ templates get
952 compiled and statically linked into the calling code, because
953 there's no way to pass information like the size of <code>T</code> in
954 <code>Array&lt;T&gt;</code> across a C ABI. You wanted to make some struct fields
955 public and some private? You are out of luck.</p>
956 </li>
957 <li>
958 <p>No knowledge of data ownership except by careful reading of the C
959 function's documentation. Does the function free its arguments?
960 How - with <code>free()</code> or <code>g_free()</code> or <code>my_thing_free()</code>? Or does the
961 caller just lend it a reference? Can the data be copied bit-by-bit
962 or must a special function be called to make a copy?
963 GObject-Introspection carries this information in its annotations,
964 while the C ABI has no idea and just ships raw pointers around.</p>
965 </li>
966 </ul>
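<p>The ownership point in the list above can be sketched in Rust. The function names here (<code>my_thing_dup_name</code>, <code>my_thing_name_len</code>, <code>my_thing_free_name</code>) are hypothetical and not part of any real library. The signatures are all a C caller would see, and the ownership rules live only in the comments.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical API, for illustration only. A real shared library would also
// mark these functions with the `no_mangle` attribute so that the symbol
// names stay unmangled.

/// Returns a newly allocated, NUL-terminated string.
/// Ownership moves to the caller, who must release it with
/// `my_thing_free_name()`, not with `free()` or `g_free()`.
pub extern "C" fn my_thing_dup_name() -> *mut c_char {
    CString::new("thing").expect("no interior NUL").into_raw()
}

/// Only borrows `name` for the duration of the call; the caller keeps it.
///
/// # Safety
/// `name` must point to a valid NUL-terminated string.
pub unsafe extern "C" fn my_thing_name_len(name: *const c_char) -> usize {
    unsafe { CStr::from_ptr(name) }.to_bytes().len()
}

/// Takes ownership back and frees the string.
///
/// # Safety
/// `name` must be null or a pointer obtained from `my_thing_dup_name()`,
/// and it must not be used after this call.
pub unsafe extern "C" fn my_thing_free_name(name: *mut c_char) {
    if !name.is_null() {
        drop(unsafe { CString::from_raw(name) });
    }
}
```

<p>Nothing in the C prototypes distinguishes the borrowed <code>const char *</code> from the owned <code>char *</code> that needs a matching free function; that knowledge has to come from documentation or from annotations.</p>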
967 <p>More food for thought: <a href="https://twitter.com/hsivonen/status/1232204147740508162">this twitter
968 thread</a> says
969 this about the C++ ABI: "Also, the ABI matters for whether the actual
970 level of practicality of complying with LGPL matches the level of
971 practicality intended years ago when some project picked LGPL as its
972 license. Of course, the standard does not talk about LGPL, either.
973 LGPL has rather different implications for Rust and Go than it does
974 for C and Java. It was obviously written with C in mind."</p>
975 <h2>Monomorphization and template bloat</h2>
976 <p>While C++ had the problem of "lots of template code in header files",
977 Rust has the problem that <a href="https://pingcap.com/blog/generics-and-compile-time-in-rust#monomorphized-generics">monomorphization of generics creates a lot
978 of compiled
979 code</a>.
980 There are tricks to avoid this and they are all the decision of the
981 library/crate author. Both share the root cause that templated or
982 generic code must be recompiled for every specific use, and thus
983 cannot live in a shared library.</p>
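<p>One commonly used trick for limiting monomorphization bloat is to keep the generic function as a thin shim over a non-generic one. The following is a minimal sketch with a made-up function, <code>count_words</code>; it is not taken from librsvg.</p>

```rust
/// Generic entry point: the compiler emits one copy of this function for
/// every distinct type `S` that callers pass in, so it is kept tiny.
pub fn count_words<S: AsRef<str>>(text: S) -> usize {
    // Non-generic inner function: compiled exactly once, no matter how
    // many types `count_words` gets instantiated with.
    fn inner(text: &str) -> usize {
        text.split_whitespace().count()
    }

    inner(text.as_ref())
}
```

<p>Calling <code>count_words("a b c")</code> and <code>count_words(String::from("hello world"))</code> creates two instances of the shim, but the real work in <code>inner</code> exists only once in the compiled code.</p>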
984 <p>Also, see this wonderful <a href="https://thume.ca/2019/07/14/a-tour-of-metaprogramming-models-for-generics/">article on how different languages implement
985 generics</a>,
986 and think that a plain C ABI means we have NOTHING of the sort.</p>
987 <p>Also, see <a href="https://gankra.github.io/blah/swift-abi/">How Swift Achieved Dynamic Linking Where Rust
988 Couldn't</a> for more food for
989 thought. This is extremely roughly equivalent to GObject's boxed
990 types; callers keep values on the heap but know the type layout via
991 annotation magic, while the library's actual implementation
992 is free to have the values on the stack or wherever for its own use.</p>
993 <h2>Should all libraries export APIs with generics and exotic types?</h2>
994 <p>No!</p>
995 <p>You probably want something like a low-level array of values,
996 <code>Vec&lt;T&gt;</code>, to be inlined everywhere and with code that knows the
997 type of the vector's elements. Element accesses can be inlined to a
998 single machine instruction in the best case.</p>
999 <p>But not everything requires this absolute raw performance with
1000 everything inlined everywhere. It is fine to pass references or
1001 pointers to things and do dynamic dispatch from a vtable if you are
1002 not in a super-tight loop, as we love to do in the GObject world.</p>
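<p>As a rough sketch of what "dynamic dispatch from a vtable" looks like on the Rust side, here is a trait object version with invented types (<code>Shape</code>, <code>Rect</code>, <code>Square</code>). It is loosely comparable to calling a virtual method through a GObject class structure.</p>

```rust
/// Hypothetical trait for illustration; a `dyn Shape` value carries a
/// pointer to a vtable with one entry for `area`.
pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Rect {
    pub w: f64,
    pub h: f64,
}

pub struct Square {
    pub side: f64,
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.w * self.h
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

/// Not generic, so it is compiled once; each `area()` call is an indirect
/// call through the vtable instead of an inlined, monomorphized copy.
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}
```

<p>Because <code>total_area</code> only sees pointers and a vtable, it could live behind a stable boundary without knowing every concrete shape type at compile time, at the cost of one indirect call per element.</p>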
1003 <h2>Library sizes</h2>
1004 <p>I don't have a good answer to librsvg's compiled size. If gnome-shell
1005 merges my branch to rustify the CSS code, it will also grow its binary
1006 size by quite a bit.</p>
1007 <p>It is my intention to have a Rust crate that both librsvg and
1008 gnome-shell share for their CSS styling needs, but right now I have no
1009 idea if this would be a shared library or just a normal Rust crate.
1010 Maybe it's possible to have a very general CSS library, and the
1011 application registers which properties it can parse and how? Is it
1012 possible to do this as a shared library without essentially
1013 reinventing libcroco? I don't know yet. We'll see.</p>
1014 <h2>A metaphor which I haven't fully explored</h2>
1015 <p>If every application or end-user package is kind of like a living
1016 organism, with its own cycles and behaviors and organs (dependent
1017 libraries) that make it possible...</p>
1018 <p>Why do distros expect all the living organisms on your machine to
1019 share The World's Single Lungs Service, and The World's Single Stomach
1020 Service, and The World's Single Liver Service?</p>
1021 <p>You know, instead of letting every organism have its own slightly
1022 different version of those organs, customized for it? We humans know
1023 how to do vaccination campaigns and everything; maybe we need better
1024 tools to apply bug fixes where they are needed?</p>
1025 <p>I know this metaphor is extremely imperfect and not how things work in
1026 software, but it makes me wonder.</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Looking for candidates for the 2020 GNOME Foundation elections</title><link href="https://people.gnome.org/~federico/blog/looking-for-candidates-2020.html" rel="alternate"></link><published>2020-05-26T17:05:31-05:00</published><updated>2020-05-26T17:05:31-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-05-26:/~federico/blog/looking-for-candidates-2020.html</id><summary type="html"><p>I forgot to write this a few days ago; I hope it is not too late.</p>
1027 <p>The GNOME Foundation's <a href="https://mail.gnome.org/archives/foundation-announce/2020-May/msg00000.html">elections for the Board</a> are coming
1028 up, and we are looking for candidates. Of the 7 directors, we are
1029 replacing 4, and the other 3 positions continue for another year.
1030 You …</p></summary><content type="html"><p>I forgot to write this a few days ago; I hope it is not too late.</p>
1031 <p>The GNOME Foundation's <a href="https://mail.gnome.org/archives/foundation-announce/2020-May/msg00000.html">elections for the Board</a> are coming
1032 up, and we are looking for candidates. Of the 7 directors, we are
1033 replacing 4, and the other 3 positions continue for another year.
1034 You could be one of those four.</p>
1035 <p>I would like it very much if there were candidates and directors that
1036 fall outside the box of "white male programmer"; it is unfortunate
1037 that for the current Board we ended up with all dudes. GNOME has a
1038 <a href="https://wiki.gnome.org/Foundation/CodeOfConduct">Code of Conduct</a> to make it a good place to be.</p>
1039 <p>Allan Day wrote a <a href="https://blogs.gnome.org/aday/2020/05/26/gnome-foundation-board-of-directors-a-year-in-review/">review of the Board's activities for the last
1040 year</a>. We are moving from a model where the Board does a
1041 little bit of everything, to one with a more strategic role — now that
1042 the Foundation has full-time employees, they take care of most of the
1043 executive work.</p>
1044 <p><strong>The call-for-candidates is open until May 29, so hurry up!</strong></p>
1045 <ul>
1046 <li><a href="https://blogs.gnome.org/aday/2020/05/26/gnome-foundation-board-of-directors-a-year-in-review/">GNOME Foundation Board of Directors: a Year in Review</a></li>
1047 <li><a href="https://mail.gnome.org/archives/foundation-announce/2020-May/msg00000.html">Call for candidates and elections</a></li>
1048 </ul></content><category term="misc"></category><category term="gnome"></category></entry><entry><title>Bringing my Emacs from the past</title><link href="https://people.gnome.org/~federico/blog/bringing-my-emacs-from-the-past.html" rel="alternate"></link><published>2020-04-28T18:03:22-05:00</published><updated>2020-04-28T18:03:22-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-04-28:/~federico/blog/bringing-my-emacs-from-the-past.html</id><summary type="html"><p>I started using Emacs in 1995, and since then I have been carrying a <code>.emacs</code>
1049 that by now has a lot of accumulated crap. It is such an old configuration that
1050 it didn't even use the modern convention of <code>~/.emacs.d/init.el</code> (and it looks
1051 like a newer Emacs …</p></summary><content type="html"><p>I started using Emacs in 1995, and since then I have been carrying a <code>.emacs</code>
1052 that by now has a lot of accumulated crap. It is such an old configuration that
1053 it didn't even use the modern convention of <code>~/.emacs.d/init.el</code> (and it looks
1054 like a newer Emacs version will allow <code>.config/emacs</code> as per the XDG
1055 standard... at last).</p>
1056 <p>I have wanted to change my Emacs configuration for some time, and give it all
1057 the pretty and modern toys.</p>
1058 <p>The things that matter the most to me:</p>
1059 <ul>
1060 <li>Not have a random dumpster in <code>~/.emacs</code> if possible.</li>
1061 <li>Pretty colors.</li>
1062 <li>Magit.</li>
1063 <li>Rust-mode or whatever the new thing is for rust-analyzer and the Language Server.</li>
1064 </ul>
1065 <p>After looking at several examples of configurations that mention <code>use-package</code>
1066 as a unified way of loading packages and configuring them, I found <a href="https://github.com/alhassy/emacs.d">this
1067 configuration</a> which is extremely well
1068 documented. The author does literate programming with org-mode and elisp —
1069 something which I'm casually interested in, but not just now — but that way
1070 everything ends up very well explained and easy to read.</p>
1071 <p>I extracted bits of that configuration and ended up with the following.</p>
1072 <h2>Everything in <code>~/.emacs.d/init.el</code> and with <code>use-package</code></h2>
1073 <div class="highlight"><pre><span></span><code><span class="c1">;; Initialize package system</span>
1074
1075 <span class="p">(</span><span class="nb">require</span> <span class="ss">&#39;package</span><span class="p">)</span>
1076
1077 <span class="p">(</span><span class="k">setq</span> <span class="nv">package-archives</span>
1078 <span class="o">&#39;</span><span class="p">((</span><span class="s">&quot;org&quot;</span> <span class="o">.</span> <span class="s">&quot;https://orgmode.org/elpa/&quot;</span><span class="p">)</span>
1079 <span class="p">(</span><span class="s">&quot;gnu&quot;</span> <span class="o">.</span> <span class="s">&quot;https://elpa.gnu.org/packages/&quot;</span><span class="p">)</span>
1080 <span class="p">(</span><span class="s">&quot;melpa&quot;</span> <span class="o">.</span> <span class="s">&quot;https://melpa.org/packages/&quot;</span><span class="p">)))</span>
1081
1082 <span class="p">(</span><span class="nv">package-initialize</span><span class="p">)</span>
1083 <span class="c1">;(package-refresh-contents)</span>
1084
1085 <span class="c1">;; Use-package for civilized configuration</span>
1086
1087 <span class="p">(</span><span class="nb">unless</span> <span class="p">(</span><span class="nv">package-installed-p</span> <span class="ss">&#39;use-package</span><span class="p">)</span>
1088 <span class="p">(</span><span class="nv">package-install</span> <span class="ss">&#39;use-package</span><span class="p">))</span>
1089 <span class="p">(</span><span class="nb">require</span> <span class="ss">&#39;use-package</span><span class="p">)</span>
1090
1091 <span class="p">(</span><span class="k">setq</span> <span class="nv">use-package-always-ensure</span> <span class="no">t</span><span class="p">)</span>
1092 </code></pre></div>
1093
1094 <h2><code>~/.emacs.d/custom.el</code> for <code>M-x customize</code> stuff</h2>
1095 <div class="highlight"><pre><span></span><code><span class="c1">;; Set customization data in a specific file, without littering</span>
1096 <span class="c1">;; my init files.</span>
1097
1098 <span class="p">(</span><span class="k">setq</span> <span class="nv">custom-file</span> <span class="s">&quot;~/.emacs.d/custom.el&quot;</span><span class="p">)</span>
1099 <span class="p">(</span><span class="nf">load</span> <span class="nv">custom-file</span><span class="p">)</span>
1100 </code></pre></div>
1101
1102 <h2>Which-key to get hints when typing command prefixes</h2>
1103 <div class="highlight"><pre><span></span><code><span class="c1">;; Make it easier to discover key shortcuts</span>
1104
1105 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">which-key</span>
1106 <span class="nb">:diminish</span>
1107 <span class="nb">:config</span>
1108 <span class="p">(</span><span class="nv">which-key-mode</span><span class="p">)</span>
1109 <span class="p">(</span><span class="nv">which-key-setup-side-window-bottom</span><span class="p">)</span>
1110 <span class="p">(</span><span class="k">setq</span> <span class="nv">which-key-idle-delay</span> <span class="mf">0.1</span><span class="p">))</span>
1111 </code></pre></div>
1112
1113 <h2>Don't pollute the modeline with common modes</h2>
1114 <div class="highlight"><pre><span></span><code><span class="c1">;; Do not show some common modes in the modeline, to save space</span>
1115
1116 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">diminish</span>
1117 <span class="nb">:defer</span> <span class="mi">5</span>
1118 <span class="nb">:config</span>
1119 <span class="p">(</span><span class="nv">diminish</span> <span class="ss">&#39;org-indent-mode</span><span class="p">))</span>
1120 </code></pre></div>
1121
1122 <h2>Magit to use git in a civilized fashion</h2>
1123 <div class="highlight"><pre><span></span><code><span class="c1">;; Magit</span>
1124
1125 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">magit</span>
1126 <span class="nb">:config</span>
1127 <span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">&quot;C-x g&quot;</span><span class="p">)</span> <span class="ss">&#39;magit-status</span><span class="p">))</span>
1128 </code></pre></div>
1129
1130 <h2>Move between windows with Shift-arrows</h2>
1131 <div class="highlight"><pre><span></span><code><span class="c1">;; Let me switch windows with shift-arrows instead of &quot;C-x o&quot; all the time</span>
1132 <span class="p">(</span><span class="nv">windmove-default-keybindings</span><span class="p">)</span>
1133 </code></pre></div>
1134
1135 <h2>Pretty colors</h2>
1136 <p>I was using <code>solarized-dark</code> but I like this one even better:</p>
1137 <div class="highlight"><pre><span></span><code><span class="c1">;; Pretty colors</span>
1138
1139 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">flatland-theme</span>
1140 <span class="nb">:config</span>
1141 <span class="p">(</span><span class="nv">custom-theme-set-faces</span> <span class="ss">&#39;flatland</span>
1142 <span class="o">&#39;</span><span class="p">(</span><span class="nv">show-paren-match</span> <span class="p">((</span><span class="no">t</span> <span class="p">(</span><span class="nb">:background</span> <span class="s">&quot;dark gray&quot;</span> <span class="nb">:foreground</span> <span class="s">&quot;black&quot;</span> <span class="nb">:weight</span> <span class="nv">bold</span><span class="p">))))</span>
1143 <span class="o">&#39;</span><span class="p">(</span><span class="nv">show-paren-mismatch</span> <span class="p">((</span><span class="no">t</span> <span class="p">(</span><span class="nb">:background</span> <span class="s">&quot;firebrick&quot;</span> <span class="nb">:foreground</span> <span class="s">&quot;orange&quot;</span> <span class="nb">:weight</span> <span class="nv">bold</span><span class="p">))))))</span>
1144 </code></pre></div>
1145
1146 <h2>Nyan cat instead of scrollbars</h2>
1147 <div class="highlight"><pre><span></span><code><span class="c1">;; Nyan cat instead of scrollbar</span>
1148 <span class="c1">;; scroll-bar-mode is turned off in custom.el</span>
1149
1150 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">nyan-mode</span>
1151 <span class="nb">:config</span>
1152 <span class="p">(</span><span class="nv">nyan-mode</span> <span class="mi">1</span><span class="p">))</span>
1153 </code></pre></div>
1154
1155 <h2>Move buffers to adjacent windows</h2>
1156 <div class="highlight"><pre><span></span><code><span class="c1">;; Move buffers between windows</span>
1157
1158 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">buffer-move</span>
1159 <span class="nb">:config</span>
1160 <span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">&quot;&lt;C-S-up&gt;&quot;</span><span class="p">)</span> <span class="ss">&#39;buf-move-up</span><span class="p">)</span>
1161 <span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">&quot;&lt;C-S-down&gt;&quot;</span><span class="p">)</span> <span class="ss">&#39;buf-move-down</span><span class="p">)</span>
1162 <span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">&quot;&lt;C-S-left&gt;&quot;</span><span class="p">)</span> <span class="ss">&#39;buf-move-left</span><span class="p">)</span>
1163 <span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">&quot;&lt;C-S-right&gt;&quot;</span><span class="p">)</span> <span class="ss">&#39;buf-move-right</span><span class="p">))</span>
1164 </code></pre></div>
1165
1166 <h2>Change buffer names for files with the same name</h2>
1167 <div class="highlight"><pre><span></span><code><span class="c1">;; Note that ‘uniquify’ is builtin.</span>
1168 <span class="p">(</span><span class="nb">require</span> <span class="ss">&#39;uniquify</span><span class="p">)</span>
1169 <span class="p">(</span><span class="k">setq</span> <span class="nv">uniquify-separator</span> <span class="s">&quot;/&quot;</span> <span class="c1">;; The separator in buffer names.</span>
1170 <span class="nv">uniquify-buffer-name-style</span> <span class="ss">&#39;forward</span><span class="p">)</span> <span class="c1">;; names/in/this/style</span>
1171 </code></pre></div>
1172
1173 <h2>Helm to auto-complete in grand style</h2>
1174 <div class="highlight"><pre><span></span><code><span class="p">(</span><span class="nb">use-package</span> <span class="nv">helm</span>
1175 <span class="nb">:diminish</span>
1176 <span class="nb">:init</span> <span class="p">(</span><span class="nv">helm-mode</span> <span class="no">t</span><span class="p">)</span>
1177 <span class="nb">:bind</span> <span class="p">((</span><span class="s">&quot;M-x&quot;</span> <span class="o">.</span> <span class="nv">helm-M-x</span><span class="p">)</span>
1178 <span class="p">(</span><span class="s">&quot;C-x C-f&quot;</span> <span class="o">.</span> <span class="nv">helm-find-files</span><span class="p">)</span>
1179 <span class="p">(</span><span class="s">&quot;C-x b&quot;</span> <span class="o">.</span> <span class="nv">helm-mini</span><span class="p">)</span> <span class="c1">;; See buffers &amp; recent files; more useful.</span>
1180 <span class="p">(</span><span class="s">&quot;C-x r b&quot;</span> <span class="o">.</span> <span class="nv">helm-filtered-bookmarks</span><span class="p">)</span>
1181 <span class="p">(</span><span class="s">&quot;C-x C-r&quot;</span> <span class="o">.</span> <span class="nv">helm-recentf</span><span class="p">)</span> <span class="c1">;; Search for recently edited files</span>
1182 <span class="p">(</span><span class="s">&quot;C-c i&quot;</span> <span class="o">.</span> <span class="nv">helm-imenu</span><span class="p">)</span>
1183 <span class="p">(</span><span class="s">&quot;C-h a&quot;</span> <span class="o">.</span> <span class="nv">helm-apropos</span><span class="p">)</span>
1184 <span class="c1">;; Look at what was cut recently &amp; paste it in.</span>
1185 <span class="p">(</span><span class="s">&quot;M-y&quot;</span> <span class="o">.</span> <span class="nv">helm-show-kill-ring</span><span class="p">)</span>
1186
1187 <span class="nb">:map</span> <span class="nv">helm-map</span>
1188 <span class="c1">;; We can list ‘actions’ on the currently selected item by C-z.</span>
1189 <span class="p">(</span><span class="s">&quot;C-z&quot;</span> <span class="o">.</span> <span class="nv">helm-select-action</span><span class="p">)</span>
1190 <span class="c1">;; Let&#39;s keep tab-completion anyhow.</span>
1191 <span class="p">(</span><span class="s">&quot;TAB&quot;</span> <span class="o">.</span> <span class="nv">helm-execute-persistent-action</span><span class="p">)</span>
1192 <span class="p">(</span><span class="s">&quot;&lt;tab&gt;&quot;</span> <span class="o">.</span> <span class="nv">helm-execute-persistent-action</span><span class="p">)))</span>
1193 </code></pre></div>
1194
1195 <h2>Ripgrep to search in grand style</h2>
1196 <div class="highlight"><pre><span></span><code><span class="c1">;; Ripgrep</span>
1197
1198 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">rg</span>
1199 <span class="nb">:config</span>
1200 <span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">&quot;M-s g&quot;</span><span class="p">)</span> <span class="ss">&#39;rg</span><span class="p">)</span>
1201 <span class="p">(</span><span class="nv">global-set-key</span> <span class="p">(</span><span class="nv">kbd</span> <span class="s">&quot;M-s d&quot;</span><span class="p">)</span> <span class="ss">&#39;rg-dwim</span><span class="p">))</span>
1202
1203 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">helm-rg</span><span class="p">)</span>
1204 </code></pre></div>
1205
1206 <h2>Rust mode and Language Server</h2>
1207 <p>Now that RLS is in the process of being deprecated, it is being replaced
1208 by rust-analyzer. Also, rust-mode goes away in favor of rustic.</p>
1209 <div class="highlight"><pre><span></span><code><span class="c1">;; Rustic, LSP</span>
1210
1211 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">flycheck</span><span class="p">)</span>
1212
1213 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">rustic</span><span class="p">)</span>
1214
1215 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">lsp-ui</span><span class="p">)</span>
1216
1217 <span class="p">(</span><span class="nb">use-package</span> <span class="nv">helm-lsp</span>
1218 <span class="nb">:config</span>
1219 <span class="p">(</span><span class="nf">define-key</span> <span class="nv">lsp-mode-map</span> <span class="p">[</span><span class="nv">remap</span> <span class="nv">xref-find-apropos</span><span class="p">]</span> <span class="nf">#&#39;</span><span class="nv">helm-lsp-workspace-symbol</span><span class="p">))</span>
1220 </code></pre></div>
1221
1222 <h2>Performatively not get distracted</h2>
1223 <div class="highlight"><pre><span></span><code><span class="p">;;;</span><span class="w"> </span><span class="n">Show</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">notification</span><span class="w"> </span><span class="n">when</span><span class="w"> </span><span class="n">compilation</span><span class="w"> </span><span class="n">finishes</span><span class="w"></span>
1224
1225 <span class="p">(</span><span class="n">setq</span><span class="w"> </span><span class="n">compilation</span><span class="o">-</span><span class="n">finish</span><span class="o">-</span><span class="n">functions</span><span class="w"></span>
1226 <span class="w"> </span><span class="p">(</span><span class="n">append</span><span class="w"> </span><span class="n">compilation</span><span class="o">-</span><span class="n">finish</span><span class="o">-</span><span class="n">functions</span><span class="w"></span>
1227 <span class="w"> </span><span class="o">&#39;</span><span class="p">(</span><span class="n">fmq</span><span class="o">-</span><span class="n">compilation</span><span class="o">-</span><span class="n">finish</span><span class="p">)))</span><span class="w"></span>
1228
1229 <span class="p">(</span><span class="n">defun</span><span class="w"> </span><span class="n">fmq</span><span class="o">-</span><span class="n">compilation</span><span class="o">-</span><span class="n">finish</span><span class="w"> </span><span class="p">(</span><span class="n">buffer</span><span class="w"> </span><span class="n">status</span><span class="p">)</span><span class="w"></span>
1230 <span class="w"> </span><span class="p">(</span><span class="n">when</span><span class="w"> </span><span class="p">(</span><span class="n">not</span><span class="w"> </span><span class="p">(</span><span class="n">member</span><span class="w"> </span><span class="n">mode</span><span class="o">-</span><span class="n">name</span><span class="w"> </span><span class="o">&#39;</span><span class="p">(</span><span class="s">&quot;Grep&quot;</span><span class="w"> </span><span class="s">&quot;rg&quot;</span><span class="p">)))</span><span class="w"></span>
1231 <span class="w"> </span><span class="p">(</span><span class="n">call</span><span class="o">-</span><span class="n">process</span><span class="w"> </span><span class="s">&quot;notify-send&quot;</span><span class="w"> </span><span class="n">nil</span><span class="w"> </span><span class="n">nil</span><span class="w"> </span><span class="n">nil</span><span class="w"></span>
1232 <span class="w"> </span><span class="s">&quot;-t&quot;</span><span class="w"> </span><span class="s">&quot;0&quot;</span><span class="w"></span>
1233 <span class="w"> </span><span class="s">&quot;-i&quot;</span><span class="w"> </span><span class="s">&quot;emacs&quot;</span><span class="w"></span>
1234 <span class="w"> </span><span class="s">&quot;Compilation finished in Emacs&quot;</span><span class="w"></span>
1235 <span class="w"> </span><span class="n">status</span><span class="p">)))</span><span class="w"></span>
1236 </code></pre></div>
1237
1238 <h2>Stuff from custom.el</h2>
1239 <p>The interesting bits here are making LSP work; everything else is preferences.</p>
1240 <div class="highlight"><pre><span></span><code><span class="p">(</span><span class="nv">custom-set-variables</span>
1241 <span class="c1">;; custom-set-variables was added by Custom.</span>
1242 <span class="c1">;; If you edit it by hand, you could mess it up, so be careful.</span>
1243 <span class="c1">;; Your init file should contain only one such instance.</span>
1244 <span class="c1">;; If there is more than one, they won&#39;t work right.</span>
1245 <span class="o">&#39;</span><span class="p">(</span><span class="nv">column-number-mode</span> <span class="no">t</span><span class="p">)</span>
1246 <span class="o">&#39;</span><span class="p">(</span><span class="nv">custom-safe-themes</span>
1247 <span class="p">(</span><span class="k">quote</span>
1248 <span class="p">(</span><span class="s">&quot;2540689fd0bc5d74c4682764ff6c94057ba8061a98be5dd21116bf7bf301acfb&quot;</span> <span class="s">&quot;bffa9739ce0752a37d9b1eee78fc00ba159748f50dc328af4be661484848e476&quot;</span> <span class="s">&quot;0fffa9669425ff140ff2ae8568c7719705ef33b7a927a0ba7c5e2ffcfac09b75&quot;</span> <span class="s">&quot;2809bcb77ad21312897b541134981282dc455ccd7c14d74cc333b6e549b824f3&quot;</span> <span class="nv">default</span><span class="p">)))</span>
1249 <span class="o">&#39;</span><span class="p">(</span><span class="nv">delete-selection-mode</span> <span class="no">nil</span><span class="p">)</span>
1250 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-rust-analyzer-display-chaining-hints</span> <span class="no">t</span><span class="p">)</span>
1251 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-rust-analyzer-display-parameter-hints</span> <span class="no">nil</span><span class="p">)</span>
1252 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-rust-analyzer-macro-expansion-method</span> <span class="p">(</span><span class="k">quote</span> <span class="nv">rustic-analyzer-macro-expand</span><span class="p">))</span>
1253 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-rust-analyzer-server-command</span> <span class="p">(</span><span class="k">quote</span> <span class="p">(</span><span class="s">&quot;/home/federico/.cargo/bin/rust-analyzer&quot;</span><span class="p">)))</span>
1254 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-rust-analyzer-server-display-inlay-hints</span> <span class="no">nil</span><span class="p">)</span>
1255 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-rust-full-docs</span> <span class="no">t</span><span class="p">)</span>
1256 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-rust-server</span> <span class="p">(</span><span class="k">quote</span> <span class="nv">rust-analyzer</span><span class="p">))</span>
1257 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-ui-doc-alignment</span> <span class="p">(</span><span class="k">quote</span> <span class="nv">window</span><span class="p">))</span>
1258 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-ui-doc-position</span> <span class="p">(</span><span class="k">quote</span> <span class="nv">top</span><span class="p">))</span>
1259 <span class="o">&#39;</span><span class="p">(</span><span class="nv">lsp-ui-sideline-enable</span> <span class="no">nil</span><span class="p">)</span>
1260 <span class="o">&#39;</span><span class="p">(</span><span class="nv">menu-bar-mode</span> <span class="no">nil</span><span class="p">)</span>
1261 <span class="o">&#39;</span><span class="p">(</span><span class="nv">package-selected-packages</span>
1262 <span class="p">(</span><span class="k">quote</span>
1263 <span class="p">(</span><span class="nv">helm-lsp</span> <span class="nv">lsp-ui</span> <span class="nv">lsp-mode</span> <span class="nv">flycheck</span> <span class="nv">rustic</span> <span class="nv">rg</span> <span class="nv">helm-rg</span> <span class="nv">ripgrep</span> <span class="nv">helm-projectile</span> <span class="nv">helm</span> <span class="nv">buffer-move</span> <span class="nv">nyan-mode</span> <span class="nv">flatland-black-theme</span> <span class="nv">flatland-theme</span> <span class="nv">afternoon-theme</span> <span class="nv">spacemacs-theme</span> <span class="nv">solarized-theme</span> <span class="nv">magit</span> <span class="nv">diminish</span> <span class="nv">which-key</span> <span class="nb">use-package</span><span class="p">)))</span>
1264 <span class="o">&#39;</span><span class="p">(</span><span class="nv">rustic-lsp-server</span> <span class="p">(</span><span class="k">quote</span> <span class="nv">rust-analyzer</span><span class="p">))</span>
1265 <span class="o">&#39;</span><span class="p">(</span><span class="nv">scroll-bar-mode</span> <span class="no">nil</span><span class="p">)</span>
1266 <span class="o">&#39;</span><span class="p">(</span><span class="nv">scroll-step</span> <span class="mi">0</span><span class="p">)</span>
1267 <span class="o">&#39;</span><span class="p">(</span><span class="nv">tool-bar-mode</span> <span class="no">nil</span><span class="p">))</span>
1268 <span class="p">(</span><span class="nv">custom-set-faces</span>
1269 <span class="c1">;; custom-set-faces was added by Custom.</span>
1270 <span class="c1">;; If you edit it by hand, you could mess it up, so be careful.</span>
1271 <span class="c1">;; Your init file should contain only one such instance.</span>
1272 <span class="c1">;; If there is more than one, they won&#39;t work right.</span>
1273 <span class="p">)</span>
1274 </code></pre></div>
1275
1276 <h2>Results</h2>
1277 <p>I am very happy with rustic / rust-analyzer and the Language Server. Having
1278 documentation on each thing when one moves the cursor around code is something
1279 that I never thought would work well in Emacs. I haven't decided if I love <code>M-x
1280 lsp-rust-analyzer-inlay-hints-mode</code> or if it drives me nuts; it shows the
1281 names of function arguments and inferred types inline in the code. I suppose I'll
1282 turn it off and on as needed.</p>
1283 <p>Some days ago, before using helm, I used projectile-mode to work with git
1284 checkouts and I quite liked it. I haven't figured out how to configure
1285 helm-projectile to work; I'll have to keep experimenting.</p></content><category term="misc"></category><category term="emacs"></category></entry><entry><title>Reducing memory consumption in librsvg, part 4: compact representation for Bézier paths</title><link href="https://people.gnome.org/~federico/blog/reducing-memory-consumption-in-librsvg-4.html" rel="alternate"></link><published>2020-03-26T18:58:36-06:00</published><updated>2020-03-26T18:58:36-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-03-26:/~federico/blog/reducing-memory-consumption-in-librsvg-4.html</id><summary type="html"><p>Let's continue with the <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/574">enormous SVG</a> from the last time, a
1286 map extracted from OpenStreetMap.</p>
1287 <p>According to <a href="https://valgrind.org/docs/manual/ms-manual.html">Massif</a>, peak memory consumption for that file occurs at
1288 the following point during the execution of rsvg-convert. I pasted
1289 only the part that refers to Bézier paths:</p>
1290 <div class="highlight"><pre><span></span><code> <span class="nt">--------------------------------------------------------------------------------</span>
1291 <span class="nt">n</span> <span class="nt">time</span><span class="o">(</span><span class="nt">i</span><span class="o">)</span> <span class="nt">total</span><span class="o">(</span><span class="nt">B</span><span class="o">)</span> <span class="nt">useful-heap …</span></code></pre></div></summary><content type="html"><p>Let's continue with the <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/574">enormous SVG</a> from the last time, a
1292 map extracted from OpenStreetMap.</p>
1293 <p>According to <a href="https://valgrind.org/docs/manual/ms-manual.html">Massif</a>, peak memory consumption for that file occurs at
1294 the following point during the execution of rsvg-convert. I pasted
1295 only the part that refers to Bézier paths:</p>
1296 <div class="highlight"><pre><span></span><code> <span class="nt">--------------------------------------------------------------------------------</span>
1297 <span class="nt">n</span> <span class="nt">time</span><span class="o">(</span><span class="nt">i</span><span class="o">)</span> <span class="nt">total</span><span class="o">(</span><span class="nt">B</span><span class="o">)</span> <span class="nt">useful-heap</span><span class="o">(</span><span class="nt">B</span><span class="o">)</span> <span class="nt">extra-heap</span><span class="o">(</span><span class="nt">B</span><span class="o">)</span> <span class="nt">stacks</span><span class="o">(</span><span class="nt">B</span><span class="o">)</span>
1298 <span class="nt">--------------------------------------------------------------------------------</span>
1299 <span class="nt">1</span> <span class="nt">33</span> <span class="nt">24</span><span class="o">,</span><span class="nt">139</span><span class="o">,</span><span class="nt">598</span><span class="o">,</span><span class="nt">653</span> <span class="nt">1</span><span class="o">,</span><span class="nt">416</span><span class="o">,</span><span class="nt">831</span><span class="o">,</span><span class="nt">176</span> <span class="nt">1</span><span class="o">,</span><span class="nt">329</span><span class="o">,</span><span class="nt">943</span><span class="o">,</span><span class="nt">212</span> <span class="nt">86</span><span class="o">,</span><span class="nt">887</span><span class="o">,</span><span class="nt">964</span> <span class="nt">0</span>
1300 <span class="nt">2</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x4A2727E</span><span class="o">:</span> <span class="nt">alloc</span> <span class="o">(</span><span class="nt">alloc</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">84</span><span class="o">)</span>
1301 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x4A2727E</span><span class="o">:</span> <span class="nt">alloc</span> <span class="o">(</span><span class="nt">alloc</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">172</span><span class="o">)</span>
1302 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x4A2727E</span><span class="o">:</span> <span class="nt">allocate_in</span><span class="o">&lt;</span><span class="nt">rsvg_internals</span><span class="p">::</span><span class="nd">path_builder</span><span class="p">::</span><span class="nd">PathCommand</span><span class="o">,</span><span class="nt">alloc</span><span class="p">::</span><span class="nd">alloc</span><span class="p">::</span><span class="nd">Global</span><span class="o">&gt;</span> <span class="o">(</span><span class="nt">raw_vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">98</span><span class="o">)</span>
1303 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x4A2727E</span><span class="o">:</span> <span class="nt">with_capacity</span><span class="o">&lt;</span><span class="nt">rsvg_internals</span><span class="p">::</span><span class="nd">path_builder</span><span class="p">::</span><span class="nd">PathCommand</span><span class="o">&gt;</span> <span class="o">(</span><span class="nt">raw_vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">167</span><span class="o">)</span>
1304 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x4A2727E</span><span class="o">:</span> <span class="nt">with_capacity</span><span class="o">&lt;</span><span class="nt">rsvg_internals</span><span class="p">::</span><span class="nd">path_builder</span><span class="p">::</span><span class="nd">PathCommand</span><span class="o">&gt;</span> <span class="o">(</span><span class="nt">vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">358</span><span class="o">)</span>
1305 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x4A2727E</span><span class="o">:</span> <span class="o">&lt;</span><span class="nt">alloc</span><span class="p">::</span><span class="nd">vec</span><span class="p">::</span><span class="nd">Vec</span><span class="o">&lt;</span><span class="nt">T</span><span class="o">&gt;</span> <span class="nt">as</span> <span class="nt">alloc</span><span class="p">::</span><span class="nd">vec</span><span class="p">::</span><span class="nd">SpecExtend</span><span class="o">&lt;</span><span class="nt">T</span><span class="o">,</span><span class="nt">I</span><span class="o">&gt;&gt;</span><span class="p">::</span><span class="nd">from_iter</span> <span class="o">(</span><span class="nt">vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">1992</span><span class="o">)</span>
1306 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x49D212C</span><span class="o">:</span> <span class="nt">from_iter</span><span class="o">&lt;</span><span class="nt">rsvg_internals</span><span class="p">::</span><span class="nd">path_builder</span><span class="p">::</span><span class="nd">PathCommand</span><span class="o">,</span><span class="nt">smallvec</span><span class="p">::</span><span class="nd">IntoIter</span><span class="o">&lt;</span><span class="cp">[</span><span class="nx">rsvg_internals</span><span class="nl">::path_builder::PathCommand</span><span class="p">;</span> <span class="mi">32</span><span class="cp">]</span><span class="o">&gt;&gt;</span> <span class="o">(</span><span class="nt">vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">1901</span><span class="o">)</span>
1307 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x49D212C</span><span class="o">:</span> <span class="nt">collect</span><span class="o">&lt;</span><span class="nt">smallvec</span><span class="p">::</span><span class="nd">IntoIter</span><span class="o">&lt;</span><span class="cp">[</span><span class="nx">rsvg_internals</span><span class="nl">::path_builder::PathCommand</span><span class="p">;</span> <span class="mi">32</span><span class="cp">]</span><span class="o">&gt;,</span><span class="nt">alloc</span><span class="p">::</span><span class="nd">vec</span><span class="p">::</span><span class="nd">Vec</span><span class="o">&lt;</span><span class="nt">rsvg_internals</span><span class="p">::</span><span class="nd">path_builder</span><span class="p">::</span><span class="nd">PathCommand</span><span class="o">&gt;&gt;</span> <span class="o">(</span><span class="nt">iterator</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">1493</span><span class="o">)</span>
1308 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x49D212C</span><span class="o">:</span> <span class="nt">into_vec</span><span class="o">&lt;</span><span class="cp">[</span><span class="nx">rsvg_internals</span><span class="nl">::path_builder::PathCommand</span><span class="p">;</span> <span class="mi">32</span><span class="cp">]</span><span class="o">&gt;</span> <span class="o">(</span><span class="nt">lib</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">893</span><span class="o">)</span>
1309 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">448B</span><span class="o">)</span> <span class="nt">0x49D212C</span><span class="o">:</span> <span class="nt">smallvec</span><span class="p">::</span><span class="nd">SmallVec</span><span class="o">&lt;</span><span class="nt">A</span><span class="o">&gt;</span><span class="p">::</span><span class="nd">into_boxed_slice</span> <span class="o">(</span><span class="nt">lib</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">902</span><span class="o">)</span>
1310 <span class="nt">3</span> <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">24</span><span class="p">.</span><span class="nc">88</span><span class="o">%</span> <span class="o">(</span><span class="nt">352</span><span class="o">,</span><span class="nt">523</span><span class="o">,</span><span class="nt">016B</span><span class="o">)</span> <span class="nt">0x4A0394C</span><span class="o">:</span> <span class="nt">into_path</span> <span class="o">(</span><span class="nt">path_builder</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">320</span><span class="o">)</span>
1311 <span class="o">|</span>
1312 <span class="nt">4</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">03</span><span class="p">.</span><span class="nc">60</span><span class="o">%</span> <span class="o">(</span><span class="nt">50</span><span class="o">,</span><span class="nt">990</span><span class="o">,</span><span class="nt">328B</span><span class="o">)</span> <span class="nt">0x4A242F0</span><span class="o">:</span> <span class="nt">realloc</span> <span class="o">(</span><span class="nt">alloc</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">128</span><span class="o">)</span>
1313 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">03</span><span class="p">.</span><span class="nc">60</span><span class="o">%</span> <span class="o">(</span><span class="nt">50</span><span class="o">,</span><span class="nt">990</span><span class="o">,</span><span class="nt">328B</span><span class="o">)</span> <span class="nt">0x4A242F0</span><span class="o">:</span> <span class="nt">realloc</span> <span class="o">(</span><span class="nt">alloc</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">187</span><span class="o">)</span>
1314 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">03</span><span class="p">.</span><span class="nc">60</span><span class="o">%</span> <span class="o">(</span><span class="nt">50</span><span class="o">,</span><span class="nt">990</span><span class="o">,</span><span class="nt">328B</span><span class="o">)</span> <span class="nt">0x4A242F0</span><span class="o">:</span> <span class="nt">shrink_to_fit</span><span class="o">&lt;</span><span class="nt">rsvg_internals</span><span class="p">::</span><span class="nd">path_builder</span><span class="p">::</span><span class="nd">PathCommand</span><span class="o">,</span><span class="nt">alloc</span><span class="p">::</span><span class="nd">alloc</span><span class="p">::</span><span class="nd">Global</span><span class="o">&gt;</span> <span class="o">(</span><span class="nt">raw_vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">633</span><span class="o">)</span>
1315 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">03</span><span class="p">.</span><span class="nc">60</span><span class="o">%</span> <span class="o">(</span><span class="nt">50</span><span class="o">,</span><span class="nt">990</span><span class="o">,</span><span class="nt">328B</span><span class="o">)</span> <span class="nt">0x4A242F0</span><span class="o">:</span> <span class="nt">shrink_to_fit</span><span class="o">&lt;</span><span class="nt">rsvg_internals</span><span class="p">::</span><span class="nd">path_builder</span><span class="p">::</span><span class="nd">PathCommand</span><span class="o">&gt;</span> <span class="o">(</span><span class="nt">vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">623</span><span class="o">)</span>
1316 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">03</span><span class="p">.</span><span class="nc">60</span><span class="o">%</span> <span class="o">(</span><span class="nt">50</span><span class="o">,</span><span class="nt">990</span><span class="o">,</span><span class="nt">328B</span><span class="o">)</span> <span class="nt">0x4A242F0</span><span class="o">:</span> <span class="nt">alloc</span><span class="p">::</span><span class="nd">vec</span><span class="p">::</span><span class="nd">Vec</span><span class="o">&lt;</span><span class="nt">T</span><span class="o">&gt;</span><span class="p">::</span><span class="nd">into_boxed_slice</span> <span class="o">(</span><span class="nt">vec</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">679</span><span class="o">)</span>
1317 <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">03</span><span class="p">.</span><span class="nc">60</span><span class="o">%</span> <span class="o">(</span><span class="nt">50</span><span class="o">,</span><span class="nt">990</span><span class="o">,</span><span class="nt">328B</span><span class="o">)</span> <span class="nt">0x49D2136</span><span class="o">:</span> <span class="nt">smallvec</span><span class="p">::</span><span class="nd">SmallVec</span><span class="o">&lt;</span><span class="nt">A</span><span class="o">&gt;</span><span class="p">::</span><span class="nd">into_boxed_slice</span> <span class="o">(</span><span class="nt">lib</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">902</span><span class="o">)</span>
1318 <span class="nt">5</span> <span class="o">|</span> <span class="nt">-</span><span class="o">&gt;</span><span class="nt">03</span><span class="p">.</span><span class="nc">60</span><span class="o">%</span> <span class="o">(</span><span class="nt">50</span><span class="o">,</span><span class="nt">990</span><span class="o">,</span><span class="nt">328B</span><span class="o">)</span> <span class="nt">0x4A0394C</span><span class="o">:</span> <span class="nt">into_path</span> <span class="o">(</span><span class="nt">path_builder</span><span class="p">.</span><span class="nc">rs</span><span class="p">:</span><span class="nd">320</span><span class="o">)</span>
1319 </code></pre></div>
1320
1321 <p>Line 1 has the totals: at that point the program uses 1,329,943,212 bytes
1322 of useful heap, or 1,416,831,176 bytes once allocator overhead is counted.</p>
1323 <p>Lines 3 and 5 give us a hint that <code>into_path</code> is being called; this is
1324 the function that converts a temporary/mutable <code>PathBuilder</code> into a
1325 permanent/immutable <code>Path</code>.</p>
1326 <p>Lines 2 and 4 indicate that the arrays of <code>PathCommand</code>, which are
1327 inside those immutable <code>Path</code>s, use 24.88% + 3.60% = 28.48% of the
1328 program's memory; together they occupy
1329 352,523,448 + 50,990,328 = 403,513,776 bytes.</p>
1330 <p>That is about 400 MB of <code>PathCommand</code>. Let's see what's going on.</p>
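<p>As a quick check of the arithmetic above, the following Rust sketch recomputes
the combined size and its share of the total from the figures in the Massif
output. The constant and function names are made up for illustration.</p>

```rust
// Figures copied from the Massif output above; the names are illustrative.
const TOTAL_BYTES: u64 = 1_416_831_176; // total(B) on line 1
const ALLOC_BYTES: u64 = 352_523_448; // line 2: alloc via with_capacity
const REALLOC_BYTES: u64 = 50_990_328; // line 4: realloc via shrink_to_fit

/// Combined size of the PathCommand arrays, in bytes.
fn path_command_bytes() -> u64 {
    ALLOC_BYTES + REALLOC_BYTES
}

/// Share of the total that those arrays take, as a percentage.
fn path_command_percent() -> f64 {
    path_command_bytes() as f64 * 100.0 / TOTAL_BYTES as f64
}

fn main() {
    println!(
        "{} bytes, {:.2}% of the total",
        path_command_bytes(),
        path_command_percent()
    );
}
```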
1331 <h2>What is in a PathCommand?</h2>
1332 <p>A <code>Path</code> is a list of commands similar to PostScript, which get used
1333 in SVG to draw Bézier paths. It is a flat array of <code>PathCommand</code>:</p>
1334 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Path</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1335 <span class="w"> </span><span class="n">path_commands</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="p">[</span><span class="n">PathCommand</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1336 <span class="p">}</span><span class="w"></span>
1337
1338 <span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">PathCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1339 <span class="w"> </span><span class="n">MoveTo</span><span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
1340 <span class="w"> </span><span class="n">LineTo</span><span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
1341 <span class="w"> </span><span class="n">CurveTo</span><span class="p">(</span><span class="n">CubicBezierCurve</span><span class="p">),</span><span class="w"></span>
1342 <span class="w"> </span><span class="n">Arc</span><span class="p">(</span><span class="n">EllipticalArc</span><span class="p">),</span><span class="w"></span>
1343 <span class="w"> </span><span class="n">ClosePath</span><span class="p">,</span><span class="w"></span>
1344 <span class="p">}</span><span class="w"></span>
1345 </code></pre></div>
1346
1347 <p>Let's see the variants of <code>PathCommand</code>:</p>
1348 <ul>
1349 <li><code>MoveTo</code>: 2 double-precision floating-point numbers.</li>
1350 <li><code>LineTo</code>: same.</li>
1351 <li><code>CurveTo</code>: 6 double-precision floating-point numbers.</li>
1352 <li><code>Arc</code>: 7 double-precision floating-point numbers, plus 2
1353 flags (see below).</li>
1354 <li><code>ClosePath</code>: no extra data.</li>
1355 </ul>
1356 <p>These variants vary a lot in size, and each element of the
1357 <code>Path.path_commands</code> array is at least as big as the largest of them
1358 (i.e. <code>size_of::&lt;EllipticalArc&gt;()</code>, plus any space the enum needs for its tag).</p>
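<p>To see this on a concrete compiler, here is a small sketch that prints the
sizes with <code>std::mem::size_of</code>. The type definitions are simplified
stand-ins: the fields of <code>CubicBezierCurve</code> and the two-variant
<code>LargeArc</code> and <code>Sweep</code> enums are assumptions for illustration,
not librsvg's exact definitions. The exact numbers depend on the Rust version
and the target.</p>

```rust
use std::mem::size_of;

// Simplified stand-ins for the types in the text; field names are assumptions.
#[allow(dead_code)]
enum LargeArc {
    Small,
    Large,
}

#[allow(dead_code)]
enum Sweep {
    Negative,
    Positive,
}

#[allow(dead_code)]
struct CubicBezierCurve {
    pt1: (f64, f64),
    pt2: (f64, f64),
    to: (f64, f64),
}

#[allow(dead_code)]
struct EllipticalArc {
    r: (f64, f64),
    x_axis_rotation: f64,
    large_arc: LargeArc,
    sweep: Sweep,
    from: (f64, f64),
    to: (f64, f64),
}

#[allow(dead_code)]
enum PathCommand {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    CurveTo(CubicBezierCurve),
    Arc(EllipticalArc),
    ClosePath,
}

fn main() {
    // Payload sizes of the individual variants.
    println!("MoveTo/LineTo payload: {} bytes", size_of::<(f64, f64)>());
    println!("CurveTo payload:       {} bytes", size_of::<CubicBezierCurve>());
    println!("Arc payload:           {} bytes", size_of::<EllipticalArc>());
    // Every array element is as big as the whole enum, whatever its variant.
    println!("PathCommand:           {} bytes", size_of::<PathCommand>());
}
```

<p>A <code>ClosePath</code> or <code>MoveTo</code> element pays for the space of an
<code>Arc</code> even though it never uses it.</p>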
1359 <h2>A more compact representation</h2>
1360 <p>Ideally, each command in the array would only occupy as much space as
1361 it needs.</p>
1362 <p>We can represent a <code>Path</code> in a different way, as two separate arrays:</p>
1363 <ul>
1364 <li>A very compact array of commands without coordinates.</li>
1365 <li>An array with coordinates only.</li>
1366 </ul>
1367 <p>That is, the following:</p>
1368 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Path</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1369 <span class="w"> </span><span class="n">commands</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="p">[</span><span class="n">PackedCommand</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1370 <span class="w"> </span><span class="n">coords</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="p">[</span><span class="kt">f64</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1371 <span class="p">}</span><span class="w"></span>
1372 </code></pre></div>
1373
1374 <p>The <code>coords</code> array is obvious; it is just a flat array with all the
1375 coordinates in the <code>Path</code> in the order in which they appear.</p>
1376 <p>And the <code>commands</code> array?</p>
1377 <h3>PackedCommand</h3>
1378 <p>We saw above that the biggest variant in <code>PathCommand</code> is
1379 <code>Arc(EllipticalArc)</code>. Let's look inside it:</p>
1380 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">EllipticalArc</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1381 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">r</span>: <span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
1382 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">x_axis_rotation</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"></span>
1383 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">large_arc</span>: <span class="nc">LargeArc</span><span class="p">,</span><span class="w"></span>
1384 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">sweep</span>: <span class="nc">Sweep</span><span class="p">,</span><span class="w"></span>
1385 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">from</span>: <span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
1386 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">to</span>: <span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
1387 <span class="p">}</span><span class="w"></span>
1388 </code></pre></div>
1389
1390 <p>There are 7 <code>f64</code> floating-point numbers there. The other two fields,
1391 <code>large_arc</code> and <code>sweep</code>, are effectively booleans (they are just enums
1392 with two variants, with pretty names instead of just <code>true</code> and
1393 <code>false</code>).</p>
1394 <p>Thus, we have 7 doubles and two flags. Between the two flags there
1395 are 4 possibilities.</p>
1396 <p>Since no other <code>PathCommand</code> variant has flags, we can have the
1397 following enum, which fits in a single byte:</p>
1398 <div class="highlight"><pre><span></span><code><span class="cp">#[repr(u8)]</span><span class="w"></span>
1399 <span class="k">enum</span> <span class="nc">PackedCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1400 <span class="w"> </span><span class="n">MoveTo</span><span class="p">,</span><span class="w"></span>
1401 <span class="w"> </span><span class="n">LineTo</span><span class="p">,</span><span class="w"></span>
1402 <span class="w"> </span><span class="n">CurveTo</span><span class="p">,</span><span class="w"></span>
1403 <span class="w"> </span><span class="n">ArcSmallNegative</span><span class="p">,</span><span class="w"></span>
1404 <span class="w"> </span><span class="n">ArcSmallPositive</span><span class="p">,</span><span class="w"></span>
1405 <span class="w"> </span><span class="n">ArcLargeNegative</span><span class="p">,</span><span class="w"></span>
1406 <span class="w"> </span><span class="n">ArcLargePositive</span><span class="p">,</span><span class="w"></span>
1407 <span class="w"> </span><span class="n">ClosePath</span><span class="p">,</span><span class="w"></span>
1408 <span class="p">}</span><span class="w"></span>
1409 </code></pre></div>
1410
1411 <p>That is, simple values for <code>MoveTo</code>/etc. and four special values for
1412 the different types of <code>Arc</code>.</p>
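<p>A quick way to check the "single byte" claim is to ask the compiler. This is
only a sketch: the <code>derive</code> and <code>allow</code> attributes are my
additions for the example, not part of the code above.</p>

```rust
use std::mem::size_of;

// Same variants as above; the derive and allow attributes are only here so
// the example compiles cleanly on its own.
#[repr(u8)]
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum PackedCommand {
    MoveTo,
    LineTo,
    CurveTo,
    ArcSmallNegative,
    ArcSmallPositive,
    ArcLargeNegative,
    ArcLargePositive,
    ClosePath,
}

fn main() {
    // #[repr(u8)] stores the discriminant in a u8, so a value takes one byte.
    assert_eq!(size_of::<PackedCommand>(), 1);

    // Fieldless variants are numbered from zero in declaration order.
    assert_eq!(PackedCommand::MoveTo as u8, 0);
    assert_eq!(PackedCommand::ClosePath as u8, 7);

    println!("PackedCommand takes {} byte", size_of::<PackedCommand>());
}
```

<p>Eight variants leave plenty of room in a <code>u8</code>, so the four
<code>Arc</code> combinations cost nothing extra.</p>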
1413 <h2>Packing a PathCommand into a PackedCommand</h2>
1414 <p>In order to pack the array of <code>PathCommand</code>, we must first know how
1415 many coordinates each of its variants will produce:</p>
1416 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">PathCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1417 <span class="w"> </span><span class="k">fn</span> <span class="nf">num_coordinates</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"></span>
1418 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1419 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">MoveTo</span><span class="p">(</span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w"></span>
1420 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">LineTo</span><span class="p">(</span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w"></span>
1421 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">CurveTo</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">6</span><span class="p">,</span><span class="w"></span>
1422 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">Arc</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">7</span><span class="p">,</span><span class="w"></span>
1423 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">ClosePath</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
1424 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1425 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1426 <span class="p">}</span><span class="w"></span>
1427 </code></pre></div>
1428
1429 <p>Then, we need to convert each <code>PathCommand</code> into a <code>PackedCommand</code> and
1430 write its coordinates into an array:</p>
1431 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">PathCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1432 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_packed</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">coords</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">f64</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nc">PackedCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1433 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1434 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">MoveTo</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1435 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w"></span>
1436 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">y</span><span class="p">;</span><span class="w"></span>
1437 <span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">MoveTo</span><span class="w"></span>
1438 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1439
1440 <span class="w"> </span><span class="c1">// etc. for the other simple commands</span>
1441
1442 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">Arc</span><span class="p">(</span><span class="k">ref</span><span class="w"> </span><span class="n">a</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="n">to_packed_and_coords</span><span class="p">(</span><span class="n">coords</span><span class="p">),</span><span class="w"></span>
1443 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1444 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1445 <span class="p">}</span><span class="w"></span>
1446 </code></pre></div>
1447
1448 <p>Let's look at that <code>to_packed_and_coords</code> more closely:</p>
1449 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">EllipticalArc</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1450 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_packed_and_coords</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">coords</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">f64</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nc">PackedCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1451 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">r</span><span class="p">.</span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
1452 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">r</span><span class="p">.</span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
1453 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">x_axis_rotation</span><span class="p">;</span><span class="w"></span>
1454 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">from</span><span class="p">.</span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
1455 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">from</span><span class="p">.</span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
1456 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">to</span><span class="p">.</span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
1457 <span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">to</span><span class="p">.</span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
1458
1459 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">large_arc</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">sweep</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1460 <span class="w"> </span><span class="p">(</span><span class="n">LargeArc</span><span class="p">(</span><span class="kc">false</span><span class="p">),</span><span class="w"> </span><span class="n">Sweep</span>::<span class="n">Negative</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcSmallNegative</span><span class="p">,</span><span class="w"></span>
1461 <span class="w"> </span><span class="p">(</span><span class="n">LargeArc</span><span class="p">(</span><span class="kc">false</span><span class="p">),</span><span class="w"> </span><span class="n">Sweep</span>::<span class="n">Positive</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcSmallPositive</span><span class="p">,</span><span class="w"></span>
1462 <span class="w"> </span><span class="p">(</span><span class="n">LargeArc</span><span class="p">(</span><span class="kc">true</span><span class="p">),</span><span class="w"> </span><span class="n">Sweep</span>::<span class="n">Negative</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcLargeNegative</span><span class="p">,</span><span class="w"></span>
1463 <span class="w"> </span><span class="p">(</span><span class="n">LargeArc</span><span class="p">(</span><span class="kc">true</span><span class="p">),</span><span class="w"> </span><span class="n">Sweep</span>::<span class="n">Positive</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcLargePositive</span><span class="p">,</span><span class="w"></span>
1464 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1465 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1466 <span class="p">}</span><span class="w"></span>
1467 </code></pre></div>
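<p>To convince ourselves that no information is lost when the two flags are
folded into the command tag, here is a small round-trip sketch. The
<code>LargeArc</code> and <code>Sweep</code> definitions are assumed from how the
code above pattern-matches on them, and <code>pack_flags</code> and
<code>unpack_flags</code> are hypothetical helpers written just for this
example.</p>

```rust
// Assumed definitions, mirroring how the snippets above match on these types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct LargeArc(bool);

#[derive(Clone, Copy, Debug, PartialEq)]
enum Sweep {
    Negative,
    Positive,
}

#[repr(u8)]
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum PackedCommand {
    MoveTo,
    LineTo,
    CurveTo,
    ArcSmallNegative,
    ArcSmallPositive,
    ArcLargeNegative,
    ArcLargePositive,
    ClosePath,
}

// Hypothetical helper: fold the two flags into one of the four Arc tags.
fn pack_flags(large_arc: LargeArc, sweep: Sweep) -> PackedCommand {
    match (large_arc, sweep) {
        (LargeArc(false), Sweep::Negative) => PackedCommand::ArcSmallNegative,
        (LargeArc(false), Sweep::Positive) => PackedCommand::ArcSmallPositive,
        (LargeArc(true), Sweep::Negative) => PackedCommand::ArcLargeNegative,
        (LargeArc(true), Sweep::Positive) => PackedCommand::ArcLargePositive,
    }
}

// Hypothetical helper: recover the flags, or None for tags that are not arcs.
fn unpack_flags(packed: PackedCommand) -> Option<(LargeArc, Sweep)> {
    match packed {
        PackedCommand::ArcSmallNegative => Some((LargeArc(false), Sweep::Negative)),
        PackedCommand::ArcSmallPositive => Some((LargeArc(false), Sweep::Positive)),
        PackedCommand::ArcLargeNegative => Some((LargeArc(true), Sweep::Negative)),
        PackedCommand::ArcLargePositive => Some((LargeArc(true), Sweep::Positive)),
        _ => None,
    }
}

fn main() {
    // Every one of the four flag combinations survives the round trip.
    for large in [false, true] {
        for sweep in [Sweep::Negative, Sweep::Positive] {
            let packed = pack_flags(LargeArc(large), sweep);
            assert_eq!(unpack_flags(packed), Some((LargeArc(large), sweep)));
        }
    }
    assert_eq!(unpack_flags(PackedCommand::MoveTo), None);
}
```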
1468
1469 <h2>Creating the compact Path</h2>
1470 <p>Let's look at <code>PathBuilder::into_path</code> line by line:</p>
1471 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">PathBuilder</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1472 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">into_path</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Path</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1473 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">num_commands</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">path_commands</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w"></span>
1474 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">num_coords</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"></span>
1475 <span class="w"> </span><span class="p">.</span><span class="n">path_commands</span><span class="w"></span>
1476 <span class="w"> </span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w"></span>
1477 <span class="w"> </span><span class="p">.</span><span class="n">fold</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="o">|</span><span class="n">acc</span><span class="p">,</span><span class="w"> </span><span class="n">cmd</span><span class="o">|</span><span class="w"> </span><span class="n">acc</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">cmd</span><span class="p">.</span><span class="n">num_coordinates</span><span class="p">());</span><span class="w"></span>
1478 </code></pre></div>
1479
1480 <p>First we compute the total number of coordinates using <code>fold</code>; we ask
1481 each command <code>cmd</code> for its <code>num_coordinates()</code> and add it to the <code>acc</code>
1482 accumulator.</p>
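<p>If <code>fold</code> looks unfamiliar, the same total can be written with
<code>map</code> and <code>sum</code>. The sketch below uses a cut-down
<code>Cmd</code> enum, a stand-in I made up for the example, to show that both
spellings agree.</p>

```rust
// A cut-down stand-in for PathCommand, only for this example.
#[allow(dead_code)]
enum Cmd {
    MoveTo(f64, f64),
    CurveTo([f64; 6]),
    ClosePath,
}

impl Cmd {
    fn num_coordinates(&self) -> usize {
        match *self {
            Cmd::MoveTo(..) => 2,
            Cmd::CurveTo(_) => 6,
            Cmd::ClosePath => 0,
        }
    }
}

// The accumulator starts at 0 and grows by each command's coordinate count.
fn total_with_fold(cmds: &[Cmd]) -> usize {
    cmds.iter().fold(0, |acc, cmd| acc + cmd.num_coordinates())
}

// The same computation, spelled with map and sum.
fn total_with_sum(cmds: &[Cmd]) -> usize {
    cmds.iter().map(Cmd::num_coordinates).sum()
}

fn main() {
    let cmds = [Cmd::MoveTo(0.0, 0.0), Cmd::CurveTo([0.0; 6]), Cmd::ClosePath];
    assert_eq!(total_with_fold(&cmds), 8);
    assert_eq!(total_with_sum(&cmds), 8);
}
```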
1483 <p>Now we know how much memory to allocate:</p>
1484 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">packed_commands</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">with_capacity</span><span class="p">(</span><span class="n">num_commands</span><span class="p">);</span><span class="w"></span>
1485 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">coords</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="mf">0.0</span><span class="p">;</span><span class="w"> </span><span class="n">num_coords</span><span class="p">];</span><span class="w"></span>
1486 </code></pre></div>
1487
1488 <p>We use <code>Vec::with_capacity</code> to allocate at least as much memory as we will
1489 need for the <code>packed_commands</code>; adding elements will not need a
1490 <code>realloc()</code>, since we already know how many elements we will have.</p>
1491 <p>We use the <code>vec!</code> macro to create an array of <code>0.0</code> repeated
1492 <code>num_coords</code> times; that macro uses <code>with_capacity</code> internally. That is the
1493 array we will use to store the coordinates for all the commands.</p>
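<p>Here is a small sketch of what preallocation buys us. The function name
<code>capacity_before_and_after</code> is made up for the example; it records
the capacity right after <code>with_capacity</code> and again after pushing all
the elements, and the two should match because no reallocation was needed.</p>

```rust
// Returns the capacity right after allocation and again after `n` pushes.
fn capacity_before_and_after(n: usize) -> (usize, usize) {
    let mut packed: Vec<u8> = Vec::with_capacity(n);
    let before = packed.capacity();

    for i in 0..n {
        packed.push((i % 256) as u8);
    }

    (before, packed.capacity())
}

fn main() {
    let (before, after) = capacity_before_and_after(1000);

    // with_capacity guarantees room for at least the requested elements...
    assert!(before >= 1000);
    // ...so pushing that many never has to grow the buffer.
    assert_eq!(before, after);

    // vec![value; n] gives n copies of the value, ready to be overwritten.
    let coords = vec![0.0_f64; 7];
    assert_eq!(coords.len(), 7);
    assert!(coords.iter().all(|&c| c == 0.0));
}
```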
1494 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">coords_slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">coords</span><span class="p">.</span><span class="n">as_mut_slice</span><span class="p">();</span><span class="w"></span>
1495 </code></pre></div>
1496
1497 <p>We get a mutable slice out of the whole array of coordinates.</p>
1498 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">path_commands</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1499 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">.</span><span class="n">num_coordinates</span><span class="p">();</span><span class="w"></span>
1500 <span class="w"> </span><span class="n">packed_commands</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">c</span><span class="p">.</span><span class="n">to_packed</span><span class="p">(</span><span class="n">coords_slice</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="n">n</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()));</span><span class="w"></span>
1501 <span class="w"> </span><span class="n">coords_slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">coords_slice</span><span class="p">[</span><span class="n">n</span><span class="o">..</span><span class="p">];</span><span class="w"></span>
1502 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1503 </code></pre></div>
1504
1505 <p>For each command, we see how many coordinates it will generate and we
1506 put that number in <code>n</code>. We get a mutable sub-slice from
1507 <code>coords_slice</code> with only that number of elements, and pass it to
1508 <code>to_packed</code> for each command.</p>
1509 <p>At the end of each iteration we move the mutable slice to where the
1510 next command's coordinates will go.</p>
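<p>The "hand out a sub-slice, then advance" pattern can be tried in isolation.
The sketch below is not the code from above: it swaps in
<code>split_at_mut</code> together with <code>std::mem::take</code> to split off
the head of the slice, and <code>fill_chunks</code> is a made-up function that
writes a marker value into each chunk so we can see where every chunk landed.</p>

```rust
// For each requested chunk size, take that many slots from the front of the
// remaining slice and fill them with the chunk's 1-based index.
fn fill_chunks(sizes: &[usize]) -> Vec<f64> {
    let total: usize = sizes.iter().sum();
    let mut coords = vec![0.0; total];

    let mut rest: &mut [f64] = coords.as_mut_slice();
    for (i, &n) in sizes.iter().enumerate() {
        // Temporarily move the slice out of `rest` so it can be split in two.
        let (head, tail) = std::mem::take(&mut rest).split_at_mut(n);

        for slot in head.iter_mut() {
            *slot = (i + 1) as f64;
        }

        // Advance to where the next chunk's values will go.
        rest = tail;
    }

    coords
}

fn main() {
    // Chunks of 2, 0 and 3 elements, like MoveTo, ClosePath and a longer command.
    let coords = fill_chunks(&[2, 0, 3]);
    assert_eq!(coords, vec![1.0, 1.0, 3.0, 3.0, 3.0]);
}
```

<p>Either spelling ends with the slice empty once the last chunk has been
written, since the total was computed from the same sizes.</p>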
1511 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="n">Path</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1512 <span class="w"> </span><span class="n">commands</span>: <span class="nc">packed_commands</span><span class="p">.</span><span class="n">into_boxed_slice</span><span class="p">(),</span><span class="w"></span>
1513 <span class="w"> </span><span class="n">coords</span>: <span class="nc">coords</span><span class="p">.</span><span class="n">into_boxed_slice</span><span class="p">(),</span><span class="w"></span>
1514 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1515 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1516 </code></pre></div>
1517
1518 <p>At the end, we create the final and immutable <code>Path</code> by converting
1519 each array <code>into_boxed_slice</code> like the last time. That way each of
1520 the two arrays, the one with <code>PackedCommand</code>s and the one with
1521 coordinates, occupies the minimum space it needs.</p>
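<p>As a reminder of why the boxed slice is the smaller handle, here is a
minimal sketch; <code>freeze</code> is a made-up name for the example. A
<code>Vec</code> carries a pointer, a capacity and a length, while a
<code>Box&lt;[T]&gt;</code> carries only a pointer and a length.</p>

```rust
use std::mem::size_of;

// Drop any excess capacity and give back a fixed-size boxed slice.
fn freeze(v: Vec<f64>) -> Box<[f64]> {
    v.into_boxed_slice()
}

fn main() {
    // Deliberately over-allocate, then keep only two elements.
    let mut v = Vec::with_capacity(100);
    v.push(1.0);
    v.push(2.0);

    let b = freeze(v);
    assert_eq!(b.len(), 2);
    assert_eq!(&*b, &[1.0, 2.0]);

    // The handle itself shrinks from three machine words to two.
    assert_eq!(size_of::<Vec<f64>>(), 3 * size_of::<usize>());
    assert_eq!(size_of::<Box<[f64]>>(), 2 * size_of::<usize>());
}
```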
1522 <h2>An iterator for Path</h2>
1523 <p>This is all very well, but we also want it to be easy to iterate on
1524 that compact representation; the <code>PathCommand</code> enums from the
1525 beginning are very convenient to use and that's what the rest of the
1526 code already uses. Let's make an iterator that unpacks what is inside
1527 a <code>Path</code> and produces a <code>PathCommand</code> for each element.</p>
1528 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">PathIter</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1529 <span class="w"> </span><span class="n">commands</span>: <span class="nc">slice</span>::<span class="n">Iter</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">PackedCommand</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1530 <span class="w"> </span><span class="n">coords</span>: <span class="kp">&amp;</span><span class="o">&#39;</span><span class="na">a</span> <span class="p">[</span><span class="kt">f64</span><span class="p">],</span><span class="w"></span>
1531 <span class="p">}</span><span class="w"></span>
1532 </code></pre></div>
1533
1534 <p>We need an iterator over the array of <code>PackedCommand</code> so we can visit
1535 each command. However, to get elements of <code>coords</code>, I am going to
1536 use a slice of <code>f64</code> instead of an iterator.</p>
1537 <p>Let's look at the implementation of the iterator:</p>
1538 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">PathIter</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1539 <span class="w"> </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">PathCommand</span><span class="p">;</span><span class="w"></span>
1540
1541 <span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1542 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">commands</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1543 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">cmd</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">PathCommand</span>::<span class="n">from_packed</span><span class="p">(</span><span class="n">cmd</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">coords</span><span class="p">);</span><span class="w"></span>
1544 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">num_coords</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cmd</span><span class="p">.</span><span class="n">num_coordinates</span><span class="p">();</span><span class="w"></span>
1545 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">coords</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">coords</span><span class="p">[</span><span class="n">num_coords</span><span class="o">..</span><span class="p">];</span><span class="w"></span>
1546 <span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span><span class="w"></span>
1547 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1548 <span class="w"> </span><span class="nb">None</span><span class="w"></span>
1549 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1550 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1551 <span class="p">}</span><span class="w"></span>
1552 </code></pre></div>
1553
1554 <p>Since we want each iteration to produce a <code>PathCommand</code>, we declare it
1555 as having the associated <code>type Item = PathCommand</code>.</p>
1556 <p>If the <code>self.commands</code> iterator has another element, it means there is
1557 another <code>PackedCommand</code> available.</p>
1558 <p>We call <code>PathCommand::from_packed</code> with the <code>self.coords</code> slice to
1559 unpack a command and its coordinates. We see how many coordinates the
1560 command consumed and re-slice <code>self.coords</code> according to that number of
1561 coordinates, so that it now points to the coordinates for the next
1562 command.</p>
1563 <p>We return <code>Some(cmd)</code> if there was an element, or <code>None</code> once the
1564 iterator is exhausted.</p>
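<p>To see the whole thing run end to end, here is a self-contained miniature.
It is a sketch with only three command types, and the <code>iter</code> method
and <code>sample_path</code> function are my own additions for the example; the
unpacking is done inline instead of going through <code>from_packed</code>.</p>

```rust
use std::slice;

#[derive(Clone, Copy, Debug, PartialEq)]
enum PathCommand {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    ClosePath,
}

#[repr(u8)]
#[derive(Clone, Copy)]
enum PackedCommand {
    MoveTo,
    LineTo,
    ClosePath,
}

struct Path {
    commands: Box<[PackedCommand]>,
    coords: Box<[f64]>,
}

struct PathIter<'a> {
    commands: slice::Iter<'a, PackedCommand>,
    coords: &'a [f64],
}

impl Path {
    // Hypothetical constructor for the iterator.
    fn iter(&self) -> PathIter<'_> {
        PathIter {
            commands: self.commands.iter(),
            coords: &self.coords,
        }
    }
}

impl<'a> Iterator for PathIter<'a> {
    type Item = PathCommand;

    fn next(&mut self) -> Option<PathCommand> {
        let packed = self.commands.next()?;

        // Unpack the command and note how many coordinates it consumed.
        let (cmd, used) = match *packed {
            PackedCommand::MoveTo => (PathCommand::MoveTo(self.coords[0], self.coords[1]), 2),
            PackedCommand::LineTo => (PathCommand::LineTo(self.coords[0], self.coords[1]), 2),
            PackedCommand::ClosePath => (PathCommand::ClosePath, 0),
        };

        // Point at the coordinates of the next command.
        self.coords = &self.coords[used..];
        Some(cmd)
    }
}

// A tiny path: move to (1, 2), line to (3, 4), close.
fn sample_path() -> Path {
    Path {
        commands: vec![
            PackedCommand::MoveTo,
            PackedCommand::LineTo,
            PackedCommand::ClosePath,
        ]
        .into_boxed_slice(),
        coords: vec![1.0, 2.0, 3.0, 4.0].into_boxed_slice(),
    }
}

fn main() {
    let path = sample_path();
    for cmd in path.iter() {
        println!("{:?}", cmd);
    }
}
```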
1565 <p>The implementation of <code>from_packed</code> is obvious and I'll just paste a
1566 bit from it:</p>
1567 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">PathCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1568 <span class="w"> </span><span class="k">fn</span> <span class="nf">from_packed</span><span class="p">(</span><span class="n">packed</span>: <span class="kp">&amp;</span><span class="nc">PackedCommand</span><span class="p">,</span><span class="w"> </span><span class="n">coords</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="kt">f64</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nc">PathCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1569 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="o">*</span><span class="n">packed</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1570 <span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">MoveTo</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1571 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w"></span>
1572 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">coords</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span><span class="w"></span>
1573 <span class="w"> </span><span class="n">PathCommand</span>::<span class="n">MoveTo</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">)</span><span class="w"></span>
1574 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1575
1576 <span class="w"> </span><span class="c1">// etc. for the other variants in PackedCommand</span>
1577
1578 <span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcSmallNegative</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PathCommand</span>::<span class="n">Arc</span><span class="p">(</span><span class="n">EllipticalArc</span>::<span class="n">from_coords</span><span class="p">(</span><span class="w"></span>
1579 <span class="w"> </span><span class="n">LargeArc</span><span class="p">(</span><span class="kc">false</span><span class="p">),</span><span class="w"></span>
1580 <span class="w"> </span><span class="n">Sweep</span>::<span class="n">Negative</span><span class="p">,</span><span class="w"></span>
1581 <span class="w"> </span><span class="n">coords</span><span class="p">,</span><span class="w"></span>
1582 <span class="w"> </span><span class="p">)),</span><span class="w"></span>
1583
1584 <span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcSmallPositive</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="c1">// etc.</span>
1585
1586 <span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcLargeNegative</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="c1">// etc.</span>
1587
1588 <span class="w"> </span><span class="n">PackedCommand</span>::<span class="n">ArcLargePositive</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="c1">// etc.</span>
1589 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1590 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1591 <span class="p">}</span><span class="w"></span>
1592 </code></pre></div>
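<p>I did not paste <code>EllipticalArc::from_coords</code>, but a plausible
version is just the mirror image of <code>to_packed_and_coords</code>. Take the
following as a guess at its shape, not the real function: the struct fields
come from the definition above, the <code>LargeArc</code> and
<code>Sweep</code> types are assumed from how they are used, and the
coordinate order is copied from the packing code.</p>

```rust
// Assumed definitions, mirroring how the snippets above use these types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct LargeArc(bool);

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Sweep {
    Negative,
    Positive,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct EllipticalArc {
    r: (f64, f64),
    x_axis_rotation: f64,
    large_arc: LargeArc,
    sweep: Sweep,
    from: (f64, f64),
    to: (f64, f64),
}

impl EllipticalArc {
    // A guess at the unshown function: read the seven coordinates back in the
    // same order in which to_packed_and_coords wrote them.
    fn from_coords(large_arc: LargeArc, sweep: Sweep, coords: &[f64]) -> EllipticalArc {
        EllipticalArc {
            r: (coords[0], coords[1]),
            x_axis_rotation: coords[2],
            large_arc,
            sweep,
            from: (coords[3], coords[4]),
            to: (coords[5], coords[6]),
        }
    }
}

fn main() {
    let coords = [10.0, 20.0, 45.0, 0.0, 0.0, 30.0, 40.0];
    let arc = EllipticalArc::from_coords(LargeArc(true), Sweep::Positive, &coords);

    assert_eq!(arc.r, (10.0, 20.0));
    assert_eq!(arc.x_axis_rotation, 45.0);
    assert_eq!(arc.from, (0.0, 0.0));
    assert_eq!(arc.to, (30.0, 40.0));
    println!("{:?}", arc);
}
```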
1593
1594 <h2>Results</h2>
1595 <p>Before the changes (this is the same Massif heading as above):</p>
1596 <div class="highlight"><pre><span></span><code><span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1597 <span class="c"> n time(i) total(B) useful</span><span class="nb">-</span><span class="c">heap(B) extra</span><span class="nb">-</span><span class="c">heap(B) stacks(B)</span>
1598 <span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1599 <span class="c"> 33 24</span><span class="nt">,</span><span class="c">139</span><span class="nt">,</span><span class="c">598</span><span class="nt">,</span><span class="c">653 1</span><span class="nt">,</span><span class="c">416</span><span class="nt">,</span><span class="c">831</span><span class="nt">,</span><span class="c">176 1</span><span class="nt">,</span><span class="c">329</span><span class="nt">,</span><span class="c">943</span><span class="nt">,</span><span class="c">212 86</span><span class="nt">,</span><span class="c">887</span><span class="nt">,</span><span class="c">964 0</span>
1600 <span class="c"> ^^^^^^^^^^^^^</span>
1601 <span class="c"> boo</span>
1602 </code></pre></div>
1603
1604 <p>After:</p>
1605 <div class="highlight"><pre><span></span><code><span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1606 <span class="c"> n time(i) total(B) useful</span><span class="nb">-</span><span class="c">heap(B) extra</span><span class="nb">-</span><span class="c">heap(B) stacks(B)</span>
1607 <span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1608 <span class="c"> 28 26</span><span class="nt">,</span><span class="c">611</span><span class="nt">,</span><span class="c">886</span><span class="nt">,</span><span class="c">993 1</span><span class="nt">,</span><span class="c">093</span><span class="nt">,</span><span class="c">747</span><span class="nt">,</span><span class="c">888 1</span><span class="nt">,</span><span class="c">023</span><span class="nt">,</span><span class="c">147</span><span class="nt">,</span><span class="c">907 70</span><span class="nt">,</span><span class="c">599</span><span class="nt">,</span><span class="c">981 0</span>
1609 <span class="c"> ^^^^^^^^^^^^^</span>
1610 <span class="c"> oh yeah</span>
1611 </code></pre></div>
1612
1613 <p>We went from using 1,329,943,212 bytes down to 1,023,147,907 bytes,
1614 that is, we knocked it down by about 307 MB.</p>
1615 <p>However, that is for the whole program. Above we saw that <code>Path</code> data
1616 occupies 403,513,776 bytes; how about now?</p>
1617 <div class="highlight"><pre><span></span><code><span class="o">-&gt;</span><span class="mf">07.45</span><span class="o">%</span> <span class="p">(</span><span class="mi">81</span><span class="p">,</span><span class="mi">525</span><span class="p">,</span><span class="mi">328</span><span class="n">B</span><span class="p">)</span> <span class="mh">0x4A34C6F</span><span class="o">:</span> <span class="n">alloc</span> <span class="p">(</span><span class="n">alloc</span><span class="p">.</span><span class="n">rs</span><span class="o">:</span><span class="mi">84</span><span class="p">)</span>
1618 <span class="o">|</span> <span class="o">-&gt;</span><span class="mf">07.45</span><span class="o">%</span> <span class="p">(</span><span class="mi">81</span><span class="p">,</span><span class="mi">525</span><span class="p">,</span><span class="mi">328</span><span class="n">B</span><span class="p">)</span> <span class="mh">0x4A34C6F</span><span class="o">:</span> <span class="n">alloc</span> <span class="p">(</span><span class="n">alloc</span><span class="p">.</span><span class="n">rs</span><span class="o">:</span><span class="mi">172</span><span class="p">)</span>
1619 <span class="o">|</span> <span class="o">-&gt;</span><span class="mf">07.45</span><span class="o">%</span> <span class="p">(</span><span class="mi">81</span><span class="p">,</span><span class="mi">525</span><span class="p">,</span><span class="mi">328</span><span class="n">B</span><span class="p">)</span> <span class="mh">0x4A34C6F</span><span class="o">:</span> <span class="n">allocate_in</span><span class="o">&lt;</span><span class="n">f64</span><span class="p">,</span><span class="n">alloc</span><span class="o">::</span><span class="n">alloc</span><span class="o">::</span><span class="kr">Global</span><span class="o">&gt;</span> <span class="p">(</span><span class="n">raw_vec</span><span class="p">.</span><span class="n">rs</span><span class="o">:</span><span class="mi">98</span><span class="p">)</span>
1620 <span class="o">|</span> <span class="o">-&gt;</span><span class="mf">07.45</span><span class="o">%</span> <span class="p">(</span><span class="mi">81</span><span class="p">,</span><span class="mi">525</span><span class="p">,</span><span class="mi">328</span><span class="n">B</span><span class="p">)</span> <span class="mh">0x4A34C6F</span><span class="o">:</span> <span class="n">with_capacity</span><span class="o">&lt;</span><span class="n">f64</span><span class="o">&gt;</span> <span class="p">(</span><span class="n">raw_vec</span><span class="p">.</span><span class="n">rs</span><span class="o">:</span><span class="mi">167</span><span class="p">)</span>
1621 <span class="o">|</span> <span class="o">-&gt;</span><span class="mf">07.45</span><span class="o">%</span> <span class="p">(</span><span class="mi">81</span><span class="p">,</span><span class="mi">525</span><span class="p">,</span><span class="mi">328</span><span class="n">B</span><span class="p">)</span> <span class="mh">0x4A34C6F</span><span class="o">:</span> <span class="n">with_capacity</span><span class="o">&lt;</span><span class="n">f64</span><span class="o">&gt;</span> <span class="p">(</span><span class="n">vec</span><span class="p">.</span><span class="n">rs</span><span class="o">:</span><span class="mi">358</span><span class="p">)</span>
1622 <span class="o">|</span> <span class="o">-&gt;</span><span class="mf">07.45</span><span class="o">%</span> <span class="p">(</span><span class="mi">81</span><span class="p">,</span><span class="mi">525</span><span class="p">,</span><span class="mi">328</span><span class="n">B</span><span class="p">)</span> <span class="mh">0x4A34C6F</span><span class="o">:</span> <span class="n">rsvg_internals</span><span class="o">::</span><span class="n">path_builder</span><span class="o">::</span><span class="n">PathBuilder</span><span class="o">::</span><span class="n">into_path</span> <span class="p">(</span><span class="n">path_builder</span><span class="p">.</span><span class="n">rs</span><span class="o">:</span><span class="mi">486</span><span class="p">)</span>
1623 </code></pre></div>
1624
1625 <p>Perfect. We went from occupying 403,513,776 bytes to just
1626 81,525,328 bytes. Instead of <code>Path</code> data amounting to 28.48% of the
1627 heap, it is just 7.45%.</p>
1628 <p>I think we can stop worrying about <code>Path</code> data for now. I like how
1629 this turned out without having to use <code>unsafe</code>.</p>
1630 <h2>References</h2>
1631 <ul>
1632 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/e9db621cdc79d31b8694d0f42cee4e02628ee145">Refactoring to use an iterator</a></li>
1633 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/cb4cde7140cd6ecfd8a78483278dcb1ab8217612">Adding tests for Path/PathBuilder</a></li>
1634 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/b183ac1e3207bd8110d60ded8878a22daed3f891">Using a more compact representation for Path</a></li>
1635 </ul></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category><category term="performance"></category></entry><entry><title>Reducing memory consumption in librsvg, part 3: slack space in Bézier paths</title><link href="https://people.gnome.org/~federico/blog/reducing-memory-consumption-in-librsvg-3.html" rel="alternate"></link><published>2020-03-24T17:14:55-06:00</published><updated>2020-03-24T17:14:55-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-03-24:/~federico/blog/reducing-memory-consumption-in-librsvg-3.html</id><summary type="html"><p>We got a <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/574">bug with a gigantic SVG</a> of a map extracted from
1636 OpenStreetMap, and it has about 600,000 elements. Most of them are
1637 <code>&lt;path&gt;</code>, that is, specifications for Bézier paths.</p>
1638 <p>A <code>&lt;path&gt;</code> can look like this:</p>
1639 <div class="highlight"><pre><span></span><code><span class="nt">&lt;path</span> <span class="na">d=</span><span class="s">&quot;m 2239.05,1890.28 5.3,-1.81&quot;</span><span class="nt">/&gt;</span>
1640 </code></pre></div>
1641
1642 <p>The …</p></summary><content type="html"><p>We got a <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/574">bug with a gigantic SVG</a> of a map extracted from
1643 OpenStreetMap, and it has about 600,000 elements. Most of them are
1644 <code>&lt;path&gt;</code>, that is, specifications for Bézier paths.</p>
1645 <p>A <code>&lt;path&gt;</code> can look like this:</p>
1646 <div class="highlight"><pre><span></span><code><span class="nt">&lt;path</span> <span class="na">d=</span><span class="s">&quot;m 2239.05,1890.28 5.3,-1.81&quot;</span><span class="nt">/&gt;</span>
1647 </code></pre></div>
1648
1649 <p>The <code>d</code> attribute contains a <a href="https://www.w3.org/TR/SVG2/paths.html#TheDProperty">list of commands</a> to
1650 create a Bézier path, very similar to PostScript's operators. Librsvg
1651 has the following to represent those commands:</p>
1652 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">PathCommand</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1653 <span class="w"> </span><span class="n">MoveTo</span><span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
1654 <span class="w"> </span><span class="n">LineTo</span><span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
1655 <span class="w"> </span><span class="n">CurveTo</span><span class="p">(</span><span class="n">CubicBezierCurve</span><span class="p">),</span><span class="w"></span>
1656 <span class="w"> </span><span class="n">Arc</span><span class="p">(</span><span class="n">EllipticalArc</span><span class="p">),</span><span class="w"></span>
1657 <span class="w"> </span><span class="n">ClosePath</span><span class="p">,</span><span class="w"></span>
1658 <span class="p">}</span><span class="w"></span>
1659 </code></pre></div>
1660
1661 <p>Those commands get stored in an array, a <code>Vec</code> inside a <code>PathBuilder</code>:</p>
1662 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">PathBuilder</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1663 <span class="w"> </span><span class="n">path_commands</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">PathCommand</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1664 <span class="p">}</span><span class="w"></span>
1665 </code></pre></div>
1666
1667 <p>Librsvg translates each of the commands inside a <code>&lt;path d="..."/&gt;</code>
1668 into a <code>PathCommand</code> and pushes it into the <code>Vec</code> in the
1669 <code>PathBuilder</code>. When it is done parsing the attribute, the
1670 <code>PathBuilder</code> remains as the final version of the path.</p>
1671 <p>To let a <code>Vec</code> grow efficiently as items are pushed into
1672 it, Rust makes the <code>Vec</code> grow by powers of 2. When we add an item, if
1673 the <em>capacity</em> of the <code>Vec</code> is full, its buffer gets <code>realloc()</code>ed to
1674 twice its capacity. That way there are only O(log₂n) calls to
1675 <code>realloc()</code>, where <code>n</code> is the total number of items in the array.</p>
1676 <p>However, this means that once we are done adding items to the <code>Vec</code>,
1677 there may still be some free space in it: <em>the capacity exceeds the
1678 length of the array</em>. The invariant is that
1679 <code>vec.capacity() &gt;= vec.len()</code>.</p>
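<p>To make the capacity-versus-length distinction concrete, here is a minimal sketch that uses only the standard library. The function names <code>push_and_count_growths</code> and <code>slack</code> are made up for this illustration and are not part of librsvg. The exact growth factor is an implementation detail of <code>Vec</code>, so the sketch only counts how often the capacity changes instead of assuming specific numbers.</p>

```rust
/// Pushes `n` items one at a time and counts how often the capacity changes,
/// that is, how often the buffer had to be reallocated.
/// (Hypothetical helper for illustration; not librsvg code.)
fn push_and_count_growths(n: usize) -> (Vec<u64>, usize) {
    let mut v: Vec<u64> = Vec::new();
    let mut growths = 0;
    let mut last_capacity = v.capacity();
    for i in 0..n {
        v.push(i as u64);
        if v.capacity() != last_capacity {
            growths += 1;
            last_capacity = v.capacity();
        }
    }
    (v, growths)
}

/// Number of allocated but unused slots at the end of the buffer.
fn slack(v: &Vec<u64>) -> usize {
    v.capacity() - v.len()
}
```

<p>Calling <code>push_and_count_growths(1000)</code> should report only a handful of reallocations, and <code>slack()</code> on the result shows how many slots are paid for but never used.</p>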
1680 <p>First I wanted to shrink the <code>PathBuilder</code>s so that they have no extra
1681 capacity in the end.</p>
1682 <h2>First step: convert to Box&lt;[T]&gt;</h2>
1683 <p>A "boxed slice" is a contiguous array in the heap that cannot grow or
1684 shrink. That is, it has no extra capacity, only a length.</p>
1685 <p><code>Vec</code> has a method <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.into_boxed_slice"><code>into_boxed_slice</code></a> which does
1686 exactly that: it consumes the vector and converts it into a boxed
1687 slice without extra capacity. In its innards, it does a <code>realloc()</code>
1688 on the <code>Vec</code>'s buffer to match its length.</p>
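<p>As a small illustration of that conversion, the following sketch uses a made-up <code>Point</code> type as a stand-in for a path command; <code>freeze</code> and <code>handle_sizes</code> are likewise hypothetical names. Only <code>Vec::into_boxed_slice</code> is the real standard-library call.</p>

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a path command: just an (x, y) pair.
type Point = (f64, f64);

/// Consumes a growable Vec and returns a boxed slice that has a length
/// but no spare capacity.
fn freeze(points: Vec<Point>) -> Box<[Point]> {
    points.into_boxed_slice()
}

/// Sizes in bytes of the two handles themselves (not of their heap buffers).
/// A boxed slice stores a pointer and a length; a Vec also tracks capacity.
fn handle_sizes() -> (usize, usize) {
    (size_of::<Vec<Point>>(), size_of::<Box<[Point]>>())
}
```

<p>Besides dropping the unused tail of the buffer, the boxed slice does not need a capacity field, so the handle itself is a bit smaller as well.</p>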
1689 <p>Let's see the numbers that Massif reports:</p>
1690 <div class="highlight"><pre><span></span><code><span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1691 <span class="c"> n time(i) total(B) useful</span><span class="nb">-</span><span class="c">heap(B) extra</span><span class="nb">-</span><span class="c">heap(B) stacks(B)</span>
1692 <span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1693 <span class="c"> 23 22</span><span class="nt">,</span><span class="c">751</span><span class="nt">,</span><span class="c">613</span><span class="nt">,</span><span class="c">855 1</span><span class="nt">,</span><span class="c">560</span><span class="nt">,</span><span class="c">916</span><span class="nt">,</span><span class="c">408 1</span><span class="nt">,</span><span class="c">493</span><span class="nt">,</span><span class="c">746</span><span class="nt">,</span><span class="c">540 67</span><span class="nt">,</span><span class="c">169</span><span class="nt">,</span><span class="c">868 0</span>
1694 <span class="c"> ^^^^^^^^^^^^^</span>
1695 <span class="c"> before</span>
1696
1697 <span class="c"> 30 22</span><span class="nt">,</span><span class="c">796</span><span class="nt">,</span><span class="c">106</span><span class="nt">,</span><span class="c">012 1</span><span class="nt">,</span><span class="c">553</span><span class="nt">,</span><span class="c">581</span><span class="nt">,</span><span class="c">072 1</span><span class="nt">,</span><span class="c">329</span><span class="nt">,</span><span class="c">943</span><span class="nt">,</span><span class="c">324 223</span><span class="nt">,</span><span class="c">637</span><span class="nt">,</span><span class="c">748 0</span>
1698 <span class="c"> ^^^^^^^^^^^^^</span>
1699 <span class="c"> after</span>
1700 </code></pre></div>
1701
1702 <p>That is, we went from using 1,493,746,540 bytes on the heap to using
1703 1,329,943,324 bytes. Simply removing extra capacity from the path
1704 commands saves about 159 MB for this particular file.</p>
1705 <h2>Second step: make the allocator do less work</h2>
1706 <p>However, the <code>extra-heap</code> column in that table has a number I don't
1707 like: there are 223,637,748 bytes in <code>malloc()</code> metadata and unused
1708 space in the heap.</p>
1709 <p>I suppose that so many calls to <code>realloc()</code> make the heap a bit
1710 fragmented.</p>
1711 <p>It would be good to be able to read most of the <code>&lt;path d="..."/&gt;</code> to
1712 temporary buffers that don't need so many calls to <code>realloc()</code>, and
1713 that in the end get copied to exact-sized buffers, without extra
1714 capacity.</p>
1715 <p>We can do just that with the <a href="https://docs.rs/smallvec/1.2.0/smallvec/">smallvec</a> crate. A <code>SmallVec</code> has the
1716 same API as <code>Vec</code>, but it can store small arrays directly in the
1717 stack, without an extra heap allocation. Once the capacity is full,
1718 the stack buffer "spills" into a heap buffer automatically.</p>
1719 <p>Most of the <code>d</code> attributes in the huge file in the <a href="https://gitlab.gnome.org/GNOME/librsvg/-/issues/574">bug</a> have
1720 fewer than 32 commands. That is, if we use the following:</p>
1721 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">PathBuilder</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1722 <span class="w"> </span><span class="n">path_commands</span>: <span class="nc">SmallVec</span><span class="o">&lt;</span><span class="p">[</span><span class="n">PathCommand</span><span class="p">;</span><span class="w"> </span><span class="mi">32</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1723 <span class="p">}</span><span class="w"></span>
1724 </code></pre></div>
1725
1726 <p>We are saying that there can be up to 32 items in the <code>SmallVec</code>
1727 without causing a heap allocation; once that is exceeded, it will work
1728 like a normal <code>Vec</code>.</p>
1729 <p>At the end we still do <code>into_boxed_slice</code> to turn it into an
1730 independent heap allocation with an exact size.</p>
1731 <p>This reduces the <code>extra-heap</code> quite a bit:</p>
1732 <div class="highlight"><pre><span></span><code><span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1733 <span class="c"> n time(i) total(B) useful</span><span class="nb">-</span><span class="c">heap(B) extra</span><span class="nb">-</span><span class="c">heap(B) stacks(B)</span>
1734 <span class="nb">--------------------------------------------------------------------------------</span><span class="c"></span>
1735 <span class="c"> 33 24</span><span class="nt">,</span><span class="c">139</span><span class="nt">,</span><span class="c">598</span><span class="nt">,</span><span class="c">653 1</span><span class="nt">,</span><span class="c">416</span><span class="nt">,</span><span class="c">831</span><span class="nt">,</span><span class="c">176 1</span><span class="nt">,</span><span class="c">329</span><span class="nt">,</span><span class="c">943</span><span class="nt">,</span><span class="c">212 86</span><span class="nt">,</span><span class="c">887</span><span class="nt">,</span><span class="c">964 0</span>
1736 <span class="c"> ^^^^^^^^^^</span>
1737 </code></pre></div>
1738
1739 <p>Also, the total bytes shrink from 1,553,581,072 to
1740 1,416,831,176 — we have a smaller heap because there is not so much
1741 work for the allocator, and there are a lot fewer temporary blocks
1742 when parsing the <code>d</code> attributes.</p>
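<p>The idea behind <code>SmallVec</code> can be sketched with the standard library alone. The <code>InlineVec</code> type below is a hypothetical, much simplified stand-in written for this illustration; it is not how the smallvec crate is implemented, and it requires <code>T: Copy + Default</code> only to keep the code short. It keeps up to <code>N</code> items in an inline array, spills into a <code>Vec</code> when that fills up, and finally produces an exact-sized boxed slice.</p>

```rust
/// Hypothetical stand-in for SmallVec: holds up to N items inline and
/// spills into a heap-allocated Vec once the inline buffer is full.
enum InlineVec<T: Copy + Default, const N: usize> {
    Inline { buf: [T; N], len: usize },
    Heap(Vec<T>),
}

impl<T: Copy + Default, const N: usize> InlineVec<T, N> {
    fn new() -> Self {
        InlineVec::Inline { buf: [T::default(); N], len: 0 }
    }

    /// True while no heap allocation has been needed.
    fn is_inline(&self) -> bool {
        matches!(self, InlineVec::Inline { .. })
    }

    fn push(&mut self, item: T) {
        if let InlineVec::Inline { buf, len } = self {
            if *len < N {
                buf[*len] = item;
                *len += 1;
                return;
            }
            // The inline buffer is full: copy its contents to the heap.
            let mut spilled = Vec::with_capacity(2 * N + 1);
            spilled.extend_from_slice(&buf[..]);
            *self = InlineVec::Heap(spilled);
        }
        if let InlineVec::Heap(items) = self {
            items.push(item);
        }
    }

    /// Copies the items into an exact-sized heap allocation.
    fn into_boxed_slice(self) -> Box<[T]> {
        match self {
            InlineVec::Inline { buf, len } => buf[..len].to_vec().into_boxed_slice(),
            InlineVec::Heap(items) => items.into_boxed_slice(),
        }
    }
}
```

<p>With a short path, the only heap allocation is the final exact-sized one, which is the effect the numbers above reflect.</p>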
1743 <h2>Making the code prettier</h2>
1744 <p>I put in the following:</p>
1745 <div class="highlight"><pre><span></span><code><span class="sd">/// This one is mutable</span>
1746 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">PathBuilder</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1747 <span class="w"> </span><span class="n">path_commands</span>: <span class="nc">SmallVec</span><span class="o">&lt;</span><span class="p">[</span><span class="n">PathCommand</span><span class="p">;</span><span class="w"> </span><span class="mi">32</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1748 <span class="p">}</span><span class="w"></span>
1749
1750 <span class="sd">/// This one is immutable</span>
1751 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Path</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1752 <span class="w"> </span><span class="n">path_commands</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="p">[</span><span class="n">PathCommand</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1753 <span class="p">}</span><span class="w"></span>
1754
1755 <span class="k">impl</span><span class="w"> </span><span class="n">PathBuilder</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1756 <span class="w"> </span><span class="sd">/// Consumes the PathBuilder and converts it into an immutable Path</span>
1757 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">into_path</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Path</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1758 <span class="w"> </span><span class="n">Path</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1759 <span class="w"> </span><span class="n">path_commands</span>: <span class="nc">self</span><span class="p">.</span><span class="n">path_commands</span><span class="p">.</span><span class="n">into_boxed_slice</span><span class="p">(),</span><span class="w"></span>
1760 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1761 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1762 <span class="p">}</span><span class="w"></span>
1763 </code></pre></div>
1764
1765 <p>With that, <code>PathBuilder</code> is just a temporary struct that turns into an
1766 immutable <code>Path</code> once we are done feeding it. <code>Path</code> contains a boxed
1767 slice of the exact size, without any extra capacity.</p>
1768 <h2>Next steps</h2>
1769 <p>All the coordinates in librsvg are stored as <code>f64</code>, double-precision
1770 floating point numbers. The SVG/CSS spec says that single-precision
1771 floats are enough, and that 64-bit floats should be used only for
1772 geometric transformations.</p>
1773 <p>I'm a bit scared to make that change; I'll have to look closely at the
1774 results of the test suite to see if rendered files change very much.
1775 I suppose even big maps require only as much precision as <code>f32</code> —
1776 after all, that is what OpenStreetMap uses.</p>
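<p>A rough way to gauge that trade-off is to compare the storage cost and the rounding error for a single coordinate. This is only a sketch with made-up helper names (<code>coordinate_bytes</code>, <code>f32_round_trip_error</code>); it says nothing about how rendering results would change.</p>

```rust
use std::mem::size_of;

/// Bytes needed for `n` (x, y) coordinate pairs in f32 and in f64.
fn coordinate_bytes(n: usize) -> (usize, usize) {
    (n * size_of::<[f32; 2]>(), n * size_of::<[f64; 2]>())
}

/// Absolute error introduced by storing `value` as an f32
/// and reading it back as an f64.
fn f32_round_trip_error(value: f64) -> f64 {
    (value - (value as f32) as f64).abs()
}
```

<p>For a coordinate such as 2239.05 from the example path, the round-trip error stays well below a thousandth of a unit, while the storage for the coordinates is halved.</p>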
1777 <h2>References</h2>
1778 <ul>
1779 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/fc44abd1a8a85b7d8c474260e315cf4c73e4ac01">Convert the Vec into a
1780 Box&lt;[T]&gt;</a></li>
1781 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/ee63041d012406a9b05204ef604eb5411a8cf7ae">Convert to SmallVec</a></li>
1782 </ul></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category><category term="performance"></category></entry><entry><title>Reducing memory consumption in librsvg, part 2: SpecifiedValues</title><link href="https://people.gnome.org/~federico/blog/reducing-memory-consumption-in-librsvg-2.html" rel="alternate"></link><published>2020-03-20T14:29:16-06:00</published><updated>2020-03-20T14:29:16-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-03-20:/~federico/blog/reducing-memory-consumption-in-librsvg-2.html</id><summary type="html"><p>To continue with <a href="https://people.gnome.org/~federico/blog/reducing-memory-consumption-in-librsvg-1-es.html">last time's topic</a>, let's see how to make
1783 librsvg's DOM nodes smaller in memory. Since that time, there have
1784 been some changes to the code; that is why in this post some of the
1785 type names are different from last time's.</p>
1786 <p>Every SVG element is represented with …</p></summary><content type="html"><p>To continue with <a href="https://people.gnome.org/~federico/blog/reducing-memory-consumption-in-librsvg-1-es.html">last time's topic</a>, let's see how to make
1787 librsvg's DOM nodes smaller in memory. Since that time, there have
1788 been some changes to the code; that is why in this post some of the
1789 type names are different from last time's.</p>
1790 <p>Every SVG element is represented with this struct:</p>
1791 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Element</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1792 <span class="w"> </span><span class="n">element_type</span>: <span class="nc">ElementType</span><span class="p">,</span><span class="w"></span>
1793 <span class="w"> </span><span class="n">element_name</span>: <span class="nc">QualName</span><span class="p">,</span><span class="w"></span>
1794 <span class="w"> </span><span class="n">id</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1795 <span class="w"> </span><span class="n">class</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1796 <span class="w"> </span><span class="n">specified_values</span>: <span class="nc">SpecifiedValues</span><span class="p">,</span><span class="w"></span>
1797 <span class="w"> </span><span class="n">important_styles</span>: <span class="nc">HashSet</span><span class="o">&lt;</span><span class="n">QualName</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1798 <span class="w"> </span><span class="n">result</span>: <span class="nc">ElementResult</span><span class="p">,</span><span class="w"></span>
1799 <span class="w"> </span><span class="n">transform</span>: <span class="nc">Transform</span><span class="p">,</span><span class="w"></span>
1800 <span class="w"> </span><span class="n">values</span>: <span class="nc">ComputedValues</span><span class="p">,</span><span class="w"></span>
1801 <span class="w"> </span><span class="n">cond</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
1802 <span class="w"> </span><span class="n">style_attr</span>: <span class="nb">String</span><span class="p">,</span><span class="w"></span>
1803 <span class="w"> </span><span class="n">element_impl</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">ElementTrait</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1804 <span class="p">}</span><span class="w"></span>
1805 </code></pre></div>
1806
1807 <p>The two biggest fields are the ones with types <code>SpecifiedValues</code> and
1808 <code>ComputedValues</code>. These are the sizes of the whole <code>Element</code> struct
1809 and those two types:</p>
1810 <div class="highlight"><pre><span></span><code>sizeof Element: 1808
1811 sizeof SpecifiedValues: 824
1812 sizeof ComputedValues: 704
1813 </code></pre></div>
1814
1815 <p>In this post, we'll reduce the size of <code>SpecifiedValues</code>.</p>
1816 <h2>What is SpecifiedValues?</h2>
1817 <p>If we have an element like this:</p>
1818 <div class="highlight"><pre><span></span><code><span class="nt">&lt;circle</span> <span class="na">cx=</span><span class="s">&quot;10&quot;</span> <span class="na">cy=</span><span class="s">&quot;10&quot;</span> <span class="na">r=</span><span class="s">&quot;10&quot;</span> <span class="na">stroke-width=</span><span class="s">&quot;4&quot;</span> <span class="na">stroke=</span><span class="s">&quot;blue&quot;</span><span class="nt">/&gt;</span>
1819 </code></pre></div>
1820
1821 <p>The values of the style properties <code>stroke-width</code> and <code>stroke</code> get
1822 stored in a <code>SpecifiedValues</code> struct. This struct has a bunch of
1823 fields, one for each possible style property:</p>
1824 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">SpecifiedValues</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1825 <span class="w"> </span><span class="n">baseline_shift</span>: <span class="nc">SpecifiedValue</span><span class="o">&lt;</span><span class="n">BaselineShift</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1826 <span class="w"> </span><span class="n">clip_path</span>: <span class="nc">SpecifiedValue</span><span class="o">&lt;</span><span class="n">ClipPath</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1827 <span class="w"> </span><span class="n">clip_rule</span>: <span class="nc">SpecifiedValue</span><span class="o">&lt;</span><span class="n">ClipRule</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1828 <span class="w"> </span><span class="sd">/// ...</span>
1829 <span class="w"> </span><span class="n">stroke</span>: <span class="nc">SpecifiedValue</span><span class="o">&lt;</span><span class="n">Stroke</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1830 <span class="w"> </span><span class="n">stroke_width</span>: <span class="nc">SpecifiedValue</span><span class="o">&lt;</span><span class="n">StrokeWidth</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1831 <span class="w"> </span><span class="sd">/// ...</span>
1832 <span class="p">}</span><span class="w"></span>
1833 </code></pre></div>
1834
1835 <p>Each field is a <code>SpecifiedValue&lt;T&gt;</code> for the following reason. In
1836 CSS/SVG, a style property can be unspecified, or it can have an
1837 <code>inherit</code> value to force the property to be copied from the element's
1838 parent, or it can actually have a specified value. Librsvg represents
1839 these as follows:</p>
1840 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">SpecifiedValue</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"></span>
1841 <span class="k">where</span><span class="w"></span>
1842 <span class="w"> </span><span class="n">T</span>: <span class="c1">// some trait bounds here</span>
1843 <span class="p">{</span><span class="w"></span>
1844 <span class="w"> </span><span class="n">Unspecified</span><span class="p">,</span><span class="w"></span>
1845 <span class="w"> </span><span class="n">Inherit</span><span class="p">,</span><span class="w"></span>
1846 <span class="w"> </span><span class="n">Specified</span><span class="p">(</span><span class="n">T</span><span class="p">),</span><span class="w"></span>
1847 <span class="p">}</span><span class="w"></span>
1848 </code></pre></div>
1849
1850 <p>Now, <code>SpecifiedValues</code> has a bunch of fields, 47 of them to be exact —
1851 one for each of the style properties that librsvg supports. That is
1852 why <code>SpecifiedValues</code> has a size of 824 bytes; it is the largest
1853 sub-structure within <code>Element</code>, and it would be good to reduce its
1854 size.</p>
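<p>To see how the size adds up, here is a small sketch that measures a simplified <code>SpecifiedValue&lt;T&gt;</code> without the trait bounds. Using <code>f64</code> as the payload and an array of 47 identical fields is an assumption for illustration only; the real struct has a different type per field, so its total size differs.</p>

```rust
use std::mem::size_of;

/// Simplified copy of the enum from the text, without trait bounds.
#[allow(dead_code)]
enum SpecifiedValue<T> {
    Unspecified,
    Inherit,
    Specified(T),
}

/// Returns (size of one field, size of 47 such fields), assuming every
/// property had an f64 payload. This is a hypothetical simplification.
fn sizes() -> (usize, usize) {
    (
        size_of::<SpecifiedValue<f64>>(),
        size_of::<[SpecifiedValue<f64>; 47]>(),
    )
}
```

<p>Every field reserves room for a full value even when it is <code>Unspecified</code>, so the total grows with the number of properties regardless of how many are actually set.</p>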
1855 <h2>Not all properties are specified</h2>
1856 <p>Let's go back to the chunk of SVG from above:</p>
1857 <div class="highlight"><pre><span></span><code><span class="nt">&lt;circle</span> <span class="na">cx=</span><span class="s">&quot;10&quot;</span> <span class="na">cy=</span><span class="s">&quot;10&quot;</span> <span class="na">r=</span><span class="s">&quot;10&quot;</span> <span class="na">stroke-width=</span><span class="s">&quot;4&quot;</span> <span class="na">stroke=</span><span class="s">&quot;blue&quot;</span><span class="nt">/&gt;</span>
1858 </code></pre></div>
1859
1860 <p>Here we only have two specified properties, so the <code>stroke_width</code> and
1861 <code>stroke</code> fields of <code>SpecifiedValues</code> will be set as
1862 <code>SpecifiedValue::Specified(something)</code> and all the other fields will
1863 be left as <code>SpecifiedValue::Unspecified</code>.</p>
1864 <p>It would be good to store only complete values for the properties that
1865 are specified, and just a small flag for unset properties.</p>
1866 <h2>Another way to represent the set of properties</h2>
1867 <p>Since there is a maximum of 47 properties per element (or more if
1868 librsvg adds support for extra ones), we can have a small array of
1869 47 bytes. Each byte contains the index within another array that
1870 contains only the values of specified properties, or a sentinel value
1871 for properties that are unset.</p>
1872 <p>First, I made an enum that fits in a <code>u8</code> for all the properties, plus
1873 the sentinel value, which also gives us the total number of
1874 properties. The <code>#[repr(u8)]</code> guarantees that this enum fits in a
1875 byte.</p>
1876 <div class="highlight"><pre><span></span><code><span class="cp">#[repr(u8)]</span><span class="w"></span>
1877 <span class="k">enum</span> <span class="nc">PropertyId</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1878 <span class="w"> </span><span class="n">BaselineShift</span><span class="p">,</span><span class="w"></span>
1879 <span class="w"> </span><span class="n">ClipPath</span><span class="p">,</span><span class="w"></span>
1880 <span class="w"> </span><span class="n">ClipRule</span><span class="p">,</span><span class="w"></span>
1881 <span class="w"> </span><span class="n">Color</span><span class="p">,</span><span class="w"></span>
1882 <span class="w"> </span><span class="c1">// ...</span>
1883 <span class="w"> </span><span class="n">WritingMode</span><span class="p">,</span><span class="w"></span>
1884 <span class="w"> </span><span class="n">XmlLang</span><span class="p">,</span><span class="w"></span>
1885 <span class="w"> </span><span class="n">XmlSpace</span><span class="p">,</span><span class="w"></span>
1886 <span class="w"> </span><span class="n">UnsetProperty</span><span class="p">,</span><span class="w"> </span><span class="c1">// the number of properties and also the sentinel value</span>
1887 <span class="p">}</span><span class="w"></span>
1888 </code></pre></div>
1889
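<p>As a minimal sketch of how such an enum can be used, here is a shortened
version with the two conversion helpers that the later snippets call. The
bodies of <code>as_u8</code> and <code>as_usize</code> are assumptions (plain casts),
and the property list is cut down to four entries for illustration.</p>

```rust
// A shortened, hypothetical property list; the real enum has 47 entries.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[allow(dead_code)]
enum PropertyId {
    BaselineShift,
    ClipPath,
    ClipRule,
    Color,
    UnsetProperty, // the number of properties and also the sentinel value
}

#[allow(dead_code)]
impl PropertyId {
    // Assumed helper: the discriminant as a byte, for storing in `indices`.
    fn as_u8(self) -> u8 {
        self as u8
    }

    // Assumed helper: the discriminant as an array index.
    fn as_usize(self) -> usize {
        self as usize
    }
}

// The number of real properties falls out of the last variant.
#[allow(dead_code)]
const NUM_PROPERTIES: usize = PropertyId::UnsetProperty as usize;

// Compile-time check that the enum really occupies a single byte.
const _: () = assert!(std::mem::size_of::<PropertyId>() == 1);
```

<p>Because <code>UnsetProperty</code> is the last variant, adding a new property
before it automatically bumps both the sentinel and the array length.</p>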
1890 <p>Before these changes, there was already the following monster to
1891 represent "which property is this" plus the property's value:</p>
1892 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">ParsedProperty</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1893 <span class="w"> </span><span class="n">BaselineShift</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">BaselineShift</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
1894 <span class="w"> </span><span class="n">ClipPath</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">ClipPath</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
1895 <span class="w"> </span><span class="n">ClipRule</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">ClipRule</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
1896 <span class="w"> </span><span class="n">Color</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">Color</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
1897 <span class="w"> </span><span class="c1">// ...</span>
1898 <span class="p">}</span><span class="w"></span>
1899 </code></pre></div>
1900
1901 <p>I changed the definition of <code>SpecifiedValues</code> to have two arrays, one
1902 to store which properties are specified, and another only with the
1903 values for the properties that are actually specified:</p>
1904 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">SpecifiedValues</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1905 <span class="w"> </span><span class="n">indices</span>: <span class="p">[</span><span class="kt">u8</span><span class="p">;</span><span class="w"> </span><span class="n">PropertyId</span>::<span class="n">UnsetProperty</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="p">],</span><span class="w"></span>
1906 <span class="w"> </span><span class="n">props</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">ParsedProperty</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
1907 <span class="p">}</span><span class="w"></span>
1908 </code></pre></div>
1909
1910 <p>There is a thing that is awkward in Rust, or which I haven't found how
1911 to solve in a nicer way: given a <code>ParsedProperty</code>, find the
1912 corresponding <code>PropertyId</code> for its discriminant. I did the obvious
1913 thing:</p>
1914 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">ParsedProperty</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1915 <span class="w"> </span><span class="k">fn</span> <span class="nf">get_property_id</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">PropertyId</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1916 <span class="w"> </span><span class="k">use</span><span class="w"> </span><span class="n">ParsedProperty</span>::<span class="o">*</span><span class="p">;</span><span class="w"></span>
1917
1918 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1919 <span class="w"> </span><span class="n">BaselineShift</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PropertyId</span>::<span class="n">BaselineShift</span><span class="p">,</span><span class="w"></span>
1920 <span class="w"> </span><span class="n">ClipPath</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PropertyId</span>::<span class="n">ClipPath</span><span class="p">,</span><span class="w"></span>
1921 <span class="w"> </span><span class="n">ClipRule</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PropertyId</span>::<span class="n">ClipRule</span><span class="p">,</span><span class="w"></span>
1922 <span class="w"> </span><span class="n">Color</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">PropertyId</span>::<span class="n">Color</span><span class="p">,</span><span class="w"></span>
1923 <span class="w"> </span><span class="c1">// ...</span>
1924 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1925 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1926 <span class="p">}</span><span class="w"></span>
1927 </code></pre></div>
1928
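<p>One possible way to avoid writing that mapping by hand is a declarative
macro that generates both enums and the <code>match</code> from a single list. This
is only a sketch, not what librsvg does: the macro name
<code>make_properties</code> is made up, the value types are dummy stand-ins, and
the <code>SpecifiedValue&lt;T&gt;</code> wrapper is left out to keep it short.</p>

```rust
// Hypothetical stand-ins for the real property value types.
#[derive(Clone, Debug, PartialEq)]
#[allow(dead_code)]
struct ClipRule(bool);

#[derive(Clone, Debug, PartialEq)]
#[allow(dead_code)]
struct Color(u32);

// Generates PropertyId, ParsedProperty and the mapping between them
// from one list of property names.
macro_rules! make_properties {
    ($($name:ident),+ $(,)?) => {
        #[repr(u8)]
        #[derive(Clone, Copy, Debug, PartialEq)]
        #[allow(dead_code)]
        enum PropertyId {
            $($name,)+
            UnsetProperty, // the number of properties and also the sentinel value
        }

        #[derive(Clone, Debug, PartialEq)]
        #[allow(dead_code)]
        enum ParsedProperty {
            // Variant name and payload type share the same identifier.
            $($name($name),)+
        }

        #[allow(dead_code)]
        impl ParsedProperty {
            fn get_property_id(&self) -> PropertyId {
                match self {
                    $(ParsedProperty::$name(_) => PropertyId::$name,)+
                }
            }
        }
    };
}

make_properties!(ClipRule, Color);
```

<p>The trade-off is the usual one with macros: less repetition, but the enum
definitions are harder to find by grepping.</p>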
1929 <h2>Initialization</h2>
1930 <p>First, we want to initialize an empty <code>SpecifiedValues</code>, where every
1931 element of the <code>indices</code> array is set to the sentinel value that
1932 means that the corresponding property is not set:</p>
1933 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="nb">Default</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SpecifiedValues</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1934 <span class="w"> </span><span class="k">fn</span> <span class="nf">default</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1935 <span class="w"> </span><span class="n">SpecifiedValues</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1936 <span class="w"> </span><span class="n">indices</span>: <span class="p">[</span><span class="n">PropertyId</span>::<span class="n">UnsetProperty</span><span class="p">.</span><span class="n">as_u8</span><span class="p">();</span><span class="w"> </span><span class="n">PropertyId</span>::<span class="n">UnsetProperty</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="p">],</span><span class="w"></span>
1937 <span class="w"> </span><span class="n">props</span>: <span class="nb">Vec</span>::<span class="n">new</span><span class="p">(),</span><span class="w"></span>
1938 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1939 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1940 <span class="p">}</span><span class="w"></span>
1941 </code></pre></div>
1942
1943 <p>That sets the <code>indices</code> field to an array full of the same
1944 <code>PropertyId::UnsetProperty</code> sentinel value. Also, the <code>props</code> array
1945 is empty; it hasn't even had a block of memory allocated for it yet.
1946 That way, SVG elements without style properties don't use any extra
1947 memory.</p>
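<p>The claim about the empty <code>props</code> array can be checked with
<code>Vec::capacity()</code>. In this small sketch, <code>ParsedProperty</code> is a
64-byte dummy and <code>capacities</code> is a helper made up for the example.</p>

```rust
// Stand-in for the real `ParsedProperty`; only its size matters here.
#[allow(dead_code)]
struct ParsedProperty([u8; 64]);

// Returns the vector's capacity before and after storing the first property.
fn capacities() -> (usize, usize) {
    let mut props: Vec<ParsedProperty> = Vec::new();
    let before = props.capacity(); // nothing has been allocated yet
    props.push(ParsedProperty([0; 64]));
    (before, props.capacity())
}
```

<p>The first element is 0: <code>Vec::new()</code> does not touch the heap until
something is pushed.</p>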
1948 <h2>Which properties are specified and what are their indices?</h2>
1949 <p>Second, we want a function that will give us the index in <code>props</code> for
1950 some property, or that will tell us if the property has not been set
1951 yet:</p>
1952 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">SpecifiedValues</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1953 <span class="w"> </span><span class="k">fn</span> <span class="nf">property_index</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">id</span>: <span class="nc">PropertyId</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1954 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">indices</span><span class="p">[</span><span class="n">id</span><span class="p">.</span><span class="n">as_usize</span><span class="p">()];</span><span class="w"></span>
1955
1956 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">PropertyId</span>::<span class="n">UnsetProperty</span><span class="p">.</span><span class="n">as_u8</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1957 <span class="w"> </span><span class="nb">None</span><span class="w"></span>
1958 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1959 <span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">v</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="w"></span>
1960 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1961 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1962 <span class="p">}</span><span class="w"></span>
1963 </code></pre></div>
1964
1965 <p>(If someone passes <code>id = PropertyId::UnsetProperty</code>, the array access
1966 to <code>indices</code> will panic, which is what we want, since <em>that</em> is not a
1967 valid property id.)</p>
1968 <h2>Change a property's value</h2>
1969 <p>Third, we want to set the value of a property that has not been set,
1970 or change the value of one that was already specified:</p>
1971 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">SpecifiedValues</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1972 <span class="w"> </span><span class="k">fn</span> <span class="nf">replace_property</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">prop</span>: <span class="kp">&amp;</span><span class="nc">ParsedProperty</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1973 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">prop</span><span class="p">.</span><span class="n">get_property_id</span><span class="p">();</span><span class="w"></span>
1974
1975 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">index</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">property_index</span><span class="p">(</span><span class="n">id</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1976 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">props</span><span class="p">[</span><span class="n">index</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">prop</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w"></span>
1977 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
1978 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">props</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w"></span>
1979 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">pos</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">props</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
1980 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">indices</span><span class="p">[</span><span class="n">id</span><span class="p">.</span><span class="n">as_usize</span><span class="p">()]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pos</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u8</span><span class="p">;</span><span class="w"></span>
1981 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1982 <span class="w"> </span><span class="p">}</span><span class="w"></span>
1983 <span class="p">}</span><span class="w"></span>
1984 </code></pre></div>
1985
1986 <p>In the first case in the <code>if</code>, the property was already set and we
1987 just replace its value. In the second case, the property was not set;
1988 we add it to the <code>props</code> array and store its resulting index in
1989 <code>indices</code>.</p>
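<p>Putting the pieces together, here is a self-contained miniature of the
scheme that can be compiled on its own. The three properties and their value
types are simplified stand-ins, and the <code>get</code> accessor is hypothetical,
added only so the example can read values back.</p>

```rust
// Hypothetical, heavily shortened stand-ins for librsvg's types.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum PropertyId {
    Stroke,
    StrokeWidth,
    Opacity,
    UnsetProperty, // the number of properties and also the sentinel value
}

#[derive(Clone, Debug, PartialEq)]
#[allow(dead_code)]
enum ParsedProperty {
    Stroke(String),
    StrokeWidth(f64),
    Opacity(f64),
}

impl ParsedProperty {
    fn get_property_id(&self) -> PropertyId {
        match self {
            ParsedProperty::Stroke(_) => PropertyId::Stroke,
            ParsedProperty::StrokeWidth(_) => PropertyId::StrokeWidth,
            ParsedProperty::Opacity(_) => PropertyId::Opacity,
        }
    }
}

const UNSET: u8 = PropertyId::UnsetProperty as u8;
const NUM_PROPERTIES: usize = PropertyId::UnsetProperty as usize;

struct SpecifiedValues {
    indices: [u8; NUM_PROPERTIES], // position in `props`, or UNSET
    props: Vec<ParsedProperty>,    // only the properties that were specified
}

impl Default for SpecifiedValues {
    fn default() -> Self {
        SpecifiedValues {
            indices: [UNSET; NUM_PROPERTIES],
            props: Vec::new(),
        }
    }
}

#[allow(dead_code)]
impl SpecifiedValues {
    fn property_index(&self, id: PropertyId) -> Option<usize> {
        let v = self.indices[id as usize];
        if v == UNSET { None } else { Some(v as usize) }
    }

    fn replace_property(&mut self, prop: &ParsedProperty) {
        let id = prop.get_property_id();
        if let Some(index) = self.property_index(id) {
            // Already specified: overwrite in place.
            self.props[index] = prop.clone();
        } else {
            // Not specified yet: append and remember where it went.
            self.props.push(prop.clone());
            let pos = self.props.len() - 1;
            self.indices[id as usize] = pos as u8;
        }
    }

    // Hypothetical accessor, added only for this example.
    fn get(&self, id: PropertyId) -> Option<&ParsedProperty> {
        self.property_index(id).map(|i| &self.props[i])
    }
}
```

<p>Setting <code>stroke-width</code> and then <code>stroke</code> puts them at positions 0
and 1 of <code>props</code>; setting <code>stroke-width</code> again overwrites position 0
and the vector does not grow.</p>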
1990 <h2>Results</h2>
1991 <p>Before:</p>
1992 <div class="highlight"><pre><span></span><code>sizeof Element: 1808
1993 sizeof SpecifiedValues: 824
1994 </code></pre></div>
1995
1996 <p>After:</p>
1997 <div class="highlight"><pre><span></span><code>sizeof Element: 1056
1998 sizeof SpecifiedValues: 72
1999 </code></pre></div>
2000
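<p>The 72 bytes are consistent with simple arithmetic on a 64-bit machine: 47
index bytes, 1 byte of padding, and 24 bytes for the <code>Vec</code> (pointer,
capacity, length). Both <code>Element</code> and <code>SpecifiedValues</code> shrank by
the same 752 bytes. The sketch below reproduces the calculation; it uses
<code>#[repr(C)]</code> only to make the layout deterministic for the example
(the default Rust layout is not specified), and <code>ParsedProperty</code> is a
64-byte dummy.</p>

```rust
use std::mem::{align_of, size_of};

// Stand-in with the 64-byte size quoted for `ParsedProperty`.
#[allow(dead_code)]
struct ParsedProperty([u8; 64]);

// repr(C) is an assumption of this example, to pin down the field layout.
#[repr(C)]
#[allow(dead_code)]
struct SpecifiedValues {
    indices: [u8; 47],
    props: Vec<ParsedProperty>,
}

// 47 index bytes plus the Vec header, rounded up to the struct's alignment.
fn expected_size() -> usize {
    let raw = 47 + size_of::<Vec<ParsedProperty>>();
    let align = align_of::<SpecifiedValues>();
    (raw + align - 1) / align * align
}
```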
2001 <p>The pathological file <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/42">from the last time</a> used
2002 463,412,720 bytes in memory before these changes. After the changes,
2003 it uses 314,526,136 bytes.</p>
2004 <p>I also measured memory consumption for a normal file, in this case
2005 <a href="https://gitlab.gnome.org/Teams/Design/icon-development-kit/-/blob/master/src/icons.svg">one with a bunch of GNOME's symbolic icons</a>. The old version
2006 uses 17 MB; the new version only 13 MB.</p>
2007 <h2>How to keep fine-tuning this</h2>
2008 <p>For now, I am satisfied with <code>SpecifiedValues</code>, although it could
2009 still be made smaller:</p>
2010 <ul>
2011 <li>
2012 <p>The crate <a href="https://lib.rs/crates/tagged-box">tagged-box</a> converts an enum like <code>ParsedProperty</code> into
2013 an enum-of-boxes, and codifies the enum's discriminant into the
2014 box's pointer. This way each variant occupies the minimum possible
2015 memory, although in a separately-allocated block, and the container
2016 itself uses only a pointer. I am not sure if this is worth it; each
2017 <code>ParsedProperty</code> is 64 bytes, but the flat array <code>props:
2018 Vec&lt;ParsedProperty&gt;</code> is very appealing in a single block of memory.
2019 I have not checked the sizes of each individual property to see if
2020 they vary a lot among them.</p>
2021 </li>
2022 <li>
2023 <p>Look for a crate that lets us have the properties in a single memory
2024 block, a kind of arena with variable types. This can be implemented
2025 with a bit of <code>unsafe</code>, but one has to be careful with the alignment
2026 of different types.</p>
2027 </li>
2028 <li>
2029 <p>The crate <a href="https://lib.rs/crates/enum_set2">enum_set2</a> represents an array of field-less enums as a
2030 compact bit array. If we changed the representation of
2031 <code>SpecifiedValue</code>, this would reduce the <code>indices</code> array to a
2032 minimum.</p>
2033 </li>
2034 </ul>
2035 <p>If someone wants to dedicate some time to implement and measure this,
2036 I would be very grateful.</p>
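<p>To give an idea of what the bit-array option could look like, here is a
hand-rolled sketch that does <em>not</em> use the enum_set2 crate. It assumes
that <code>props</code> would be kept sorted by property id, so that the position
of a property is the number of set bits below it. <code>PropertyMask</code> and its
methods are made up for illustration.</p>

```rust
// Hypothetical sketch: one presence bit per property instead of one index byte.
const NUM_PROPERTIES: u32 = 47;

#[derive(Default)]
struct PropertyMask(u64); // 47 properties fit in 64 bits

#[allow(dead_code)]
impl PropertyMask {
    fn is_set(&self, id: u32) -> bool {
        assert!(id < NUM_PROPERTIES);
        self.0 & (1u64 << id) != 0
    }

    fn set(&mut self, id: u32) {
        assert!(id < NUM_PROPERTIES);
        self.0 |= 1u64 << id;
    }

    // Position in a `props` vector that is kept sorted by property id:
    // count how many lower-numbered properties are present.
    fn index_of(&self, id: u32) -> Option<usize> {
        if self.is_set(id) {
            let below = self.0 & ((1u64 << id) - 1);
            Some(below.count_ones() as usize)
        } else {
            None
        }
    }
}
```

<p>This would bring the index table down from 47 bytes to 8, at the cost of
inserting into the middle of <code>props</code> when a new property is set. Whether
that is a net win would have to be measured.</p>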
2037 <h2>Next steps</h2>
2038 <p>According to Massif, the next thing is to keep making <code>Element</code>
2039 smaller. The next thing to shrink is <code>ComputedValues</code>. The obvious
2040 route is to do exactly the same as I did for <code>SpecifiedValues</code>. I am
2041 not sure if it would be better to try to <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/570">share the style
2042 structs</a> between elements.</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category><category term="performance"></category></entry><entry><title>Librsvg accepting interns for Summer of Code 2020</title><link href="https://people.gnome.org/~federico/blog/librsvg-soc-2020.html" rel="alternate"></link><published>2020-03-16T17:53:17-06:00</published><updated>2020-03-16T17:53:17-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-03-16:/~federico/blog/librsvg-soc-2020.html</id><summary type="html"><p>Are you a student qualified to run for Summer of Code 2020? I'm
2043 willing to mentor the following project for librsvg.</p>
2044 <h2>Project: Revamp the text engine in librsvg</h2>
2045 <p>Librsvg supports only a few features of the <a href="https://www.w3.org/TR/SVG2/text.html">SVG Text
2046 specification</a>. It requires
2047 extra features to be really useful:</p>
2048 <ul>
2049 <li>
2050 <p><strong>Proper bidirectional support …</strong></p></li></ul></summary><content type="html"><p>Are you a student qualified to run for Summer of Code 2020? I'm
2051 willing to mentor the following project for librsvg.</p>
2052 <h2>Project: Revamp the text engine in librsvg</h2>
2053 <p>Librsvg supports only a few features of the <a href="https://www.w3.org/TR/SVG2/text.html">SVG Text
2054 specification</a>. It requires
2055 extra features to be really useful:</p>
2056 <ul>
2057 <li>
2058 <p><strong>Proper bidirectional support.</strong> Librsvg supports the <code>direction</code> and
2059 <code>unicode-bidi</code> properties for text elements, among others, but in a
2060 very rudimentary fashion. It just translates those properties to
2061 Pango terminology and asks <code>PangoLayout</code> to lay out the text. SVG
2062 really wants finer control of that, for which...</p>
2063 </li>
2064 <li>
2065 <p>... ideally you would make librsvg <strong>use Harfbuzz directly</strong>, or a
2066 wrapper that is close to its level of operation. Pango is a bit too high
2067 level for the needs of SVG.</p>
2068 </li>
2069 <li>
2070 <p>Manual layout of text glyphs. After a text engine like Harfbuzz
2071 does the shaping, librsvg would need to lay out the produced glyphs in
2072 the way of the SVG attributes <code>dx, dy, x, y</code>, etc. The SVG Text
2073 specification has the algorithms for this.</p>
2074 </li>
2075 <li>
2076 <p>The cherry on top: text-on-a-path. Again, the spec has the details.
2077 You would make Wikimedia content creators very happy with this!</p>
2078 </li>
2079 </ul>
2080 <p><strong>Requirements:</strong> Rust as the programming language; some familiarity with
2081 Unicode concepts and text layout. Familiarity with Cairo and Harfbuzz
2082 would help a lot. Preference will be given to people who can write a
2083 right-to-left human language, <strong>or</strong> a language that requires complex
2084 shaping.</p>
2085 <p><a href="https://wiki.gnome.org/Outreach/SummerOfCode/Students">Details for students</a></p></content><category term="misc"></category><category term="librsvg"></category><category term="mentoring"></category><category term="gnome"></category></entry><entry><title>Reducing memory consumption in librsvg, part 1: text nodes</title><link href="https://people.gnome.org/~federico/blog/reducing-memory-consumption-in-librsvg-1.html" rel="alternate"></link><published>2020-03-12T18:57:35-06:00</published><updated>2020-03-12T18:57:35-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-03-12:/~federico/blog/reducing-memory-consumption-in-librsvg-1.html</id><summary type="html"><p>Librsvg's memory consumption has not been a problem so far for GNOME's
2086 use cases, which basically amount to rendering icons. But for SVG files with
2087 thousands of elements, it could do a lot better.</p>
2088 <h2>Memory consumption in the DOM</h2>
2089 <p>Librsvg shares some common problems with web browsers: it must
2090 construct a …</p></summary><content type="html"><p>Librsvg's memory consumption has not been a problem so far for GNOME's
2091 use cases, which basically amount to rendering icons. But for SVG files with
2092 thousands of elements, it could do a lot better.</p>
2093 <h2>Memory consumption in the DOM</h2>
2094 <p>Librsvg shares some common problems with web browsers: it must
2095 construct a DOM tree in memory with SVG elements, and keep a bunch of
2096 information for each of the tree's nodes. For example, each SVG
2097 element may have an <code>id</code> attribute, or a <code>class</code>; each one has a
2098 transformation matrix; etc.</p>
2099 <p>Apart from the tree node metadata (pointers to sibling and parent
2100 nodes), each node has this:</p>
2101 <div class="highlight"><pre><span></span><code><span class="sd">/// Contents of a tree node</span>
2102 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">NodeData</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2103 <span class="w"> </span><span class="n">node_type</span>: <span class="nc">NodeType</span><span class="p">,</span><span class="w"></span>
2104 <span class="w"> </span><span class="n">element_name</span>: <span class="nc">QualName</span><span class="p">,</span><span class="w"></span>
2105 <span class="w"> </span><span class="n">id</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="c1">// id attribute from XML element</span>
2106 <span class="w"> </span><span class="n">class</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="c1">// class attribute from XML element</span>
2107 <span class="w"> </span><span class="n">specified_values</span>: <span class="nc">SpecifiedValues</span><span class="p">,</span><span class="w"></span>
2108 <span class="w"> </span><span class="n">important_styles</span>: <span class="nc">HashSet</span><span class="o">&lt;</span><span class="n">QualName</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2109 <span class="w"> </span><span class="n">result</span>: <span class="nc">NodeResult</span><span class="p">,</span><span class="w"></span>
2110 <span class="w"> </span><span class="n">transform</span>: <span class="nc">Transform</span><span class="p">,</span><span class="w"></span>
2111 <span class="w"> </span><span class="n">values</span>: <span class="nc">ComputedValues</span><span class="p">,</span><span class="w"></span>
2112 <span class="w"> </span><span class="n">cond</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
2113 <span class="w"> </span><span class="n">style_attr</span>: <span class="nb">String</span><span class="p">,</span><span class="w"></span>
2114
2115 <span class="w"> </span><span class="n">node_impl</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">NodeTrait</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="c1">// concrete struct for node types</span>
2116 <span class="p">}</span><span class="w"></span>
2117 </code></pre></div>
2118
2119 <p>On a 64-bit box, that <code>NodeData</code> struct is 1808 bytes. And the biggest fields
2120 are the <code>SpecifiedValues</code> (824 bytes) and <code>ComputedValues</code> (704 bytes).</p>
2121 <p>Librsvg represents <em>all</em> tree nodes with that struct. Consider an SVG
2122 like this:</p>
2123 <div class="highlight"><pre><span></span><code><span class="nt">&lt;svg</span> <span class="na">xmlns=</span><span class="s">&quot;http://www.w3.org/2000/svg&quot;</span> <span class="na">width=</span><span class="s">&quot;100&quot;</span> <span class="na">height=</span><span class="s">&quot;100&quot;</span><span class="nt">&gt;</span>
2124 <span class="nt">&lt;rect</span> <span class="na">x=</span><span class="s">&quot;10&quot;</span> <span class="na">y=</span><span class="s">&quot;20&quot;</span><span class="nt">/&gt;</span>
2125 <span class="nt">&lt;path</span> <span class="na">d=</span><span class="s">&quot;...&quot;</span><span class="nt">/&gt;</span>
2126 <span class="nt">&lt;text</span> <span class="na">x=</span><span class="s">&quot;10&quot;</span> <span class="na">y=</span><span class="s">&quot;20&quot;</span><span class="nt">&gt;</span>Hello<span class="nt">&lt;/text&gt;</span>
2127 <span class="c">&lt;!-- etc --&gt;</span>
2128 <span class="nt">&lt;/svg&gt;</span>
2129 </code></pre></div>
2130
2131 <p>There are 4 elements in that file. However, there are also tree nodes
2132 for the XML text nodes, that is, the whitespace between tags and the
2133 "<code>Hello</code>" inside the <code>&lt;text&gt;</code> element.</p>
2134 <p>The contents of each of those text nodes are tiny (a newline and maybe
2135 a couple of spaces), but each node still takes up at least 1808 bytes
2136 from the <code>NodeData</code> struct, plus the size of the text string.</p>
2137 <p>Let's refactor this to make it easier to remove that overhead.</p>
2138 <h2>First step: separate text nodes from element nodes</h2>
2139 <p>Internally, librsvg represents XML text nodes with a <code>NodeChars</code> struct
2140 which is basically a string with some extra stuff. All the concrete
2141 structs for tree node types must implement a trait called <code>NodeTrait</code>,
2142 and <code>NodeChars</code> is no exception:</p>
2143 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">NodeChars</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2144 <span class="w"> </span><span class="c1">// a string with the text node&#39;s contents</span>
2145 <span class="p">}</span><span class="w"></span>
2146
2147 <span class="k">impl</span><span class="w"> </span><span class="n">NodeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">NodeChars</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2148 <span class="w"> </span><span class="c1">// a mostly empty impl with methods that do nothing</span>
2149 <span class="p">}</span><span class="w"></span>
2150 </code></pre></div>
2151
2152 <p>You don't see it in the definition of <code>NodeData</code> in the previous
2153 section, but for a text node, the <code>NodeData.node_impl</code> field would
2154 point to a heap-allocated <code>NodeChars</code> (it can do that, since
2155 <code>NodeChars</code> implements <code>NodeTrait</code>, so it can go into <code>node_impl:
2156 Box&lt;dyn NodeTrait&gt;</code>).</p>
2157 <p>First, I turned the <code>NodeData</code> struct into an enum with two variants,
2158 and moved all of its previous fields to an <code>Element</code> struct:</p>
2159 <div class="highlight"><pre><span></span><code><span class="c1">// This one is new</span>
2160 <span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">NodeData</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2161 <span class="w"> </span><span class="n">Element</span><span class="p">(</span><span class="n">Element</span><span class="p">),</span><span class="w"></span>
2162 <span class="w"> </span><span class="n">Text</span><span class="p">(</span><span class="n">NodeChars</span><span class="p">),</span><span class="w"></span>
2163 <span class="p">}</span><span class="w"></span>
2164
2165 <span class="c1">// This is the old struct with a different name</span>
2166 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Element</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2167 <span class="w"> </span><span class="n">node_type</span>: <span class="nc">NodeType</span><span class="p">,</span><span class="w"></span>
2168 <span class="w"> </span><span class="n">element_name</span>: <span class="nc">QualName</span><span class="p">,</span><span class="w"></span>
2169 <span class="w"> </span><span class="n">id</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2170 <span class="w"> </span><span class="n">class</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2171 <span class="w"> </span><span class="n">specified_values</span>: <span class="nc">SpecifiedValues</span><span class="p">,</span><span class="w"></span>
2172 <span class="w"> </span><span class="n">important_styles</span>: <span class="nc">HashSet</span><span class="o">&lt;</span><span class="n">QualName</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2173 <span class="w"> </span><span class="n">result</span>: <span class="nc">NodeResult</span><span class="p">,</span><span class="w"></span>
2174 <span class="w"> </span><span class="n">transform</span>: <span class="nc">Transform</span><span class="p">,</span><span class="w"></span>
2175 <span class="w"> </span><span class="n">values</span>: <span class="nc">ComputedValues</span><span class="p">,</span><span class="w"></span>
2176 <span class="w"> </span><span class="n">cond</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
2177 <span class="w"> </span><span class="n">style_attr</span>: <span class="nb">String</span><span class="p">,</span><span class="w"></span>
2178 <span class="w"> </span><span class="n">node_impl</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">NodeTrait</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2179 <span class="p">}</span><span class="w"></span>
2180 </code></pre></div>
2181
2182 <p>The size of a Rust enum is the maximum of the sizes of its variants,
2183 plus a little extra for the discriminant (you can think of a C struct
2184 with an int for the discriminant, and a union of variants).</p>
2185 <p>The code <a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/741a0c0bb4c9bc5dc4b246a92d9ba2e26275d45d">needed a few
2186 changes</a>
2187 to split <code>NodeData</code> in this way, by adding accessor
2188 functions for each of the <code>Element</code> and <code>Text</code> cases. This
2189 is one of those refactors where you can just change the declaration,
2190 and walk down the compiler's errors to make each case use the accessors
2191 instead of whatever was done before.</p>
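<p>As a rough sketch of what such accessors might look like: the method
names <code>as_element</code> and <code>as_text</code> and the simplified stand-in types
below are assumptions for illustration, not librsvg's actual code.</p>

```rust
// Simplified stand-ins for the real types; the fields are hypothetical.
pub struct Element {
    pub element_name: String,
}

pub struct NodeChars {
    pub text: String,
}

pub enum NodeData {
    Element(Element),
    Text(NodeChars),
}

impl NodeData {
    /// Returns the element payload, or `None` for a text node.
    pub fn as_element(&self) -> Option<&Element> {
        match self {
            NodeData::Element(e) => Some(e),
            NodeData::Text(_) => None,
        }
    }

    /// Returns the text payload, or `None` for an element node.
    pub fn as_text(&self) -> Option<&NodeChars> {
        match self {
            NodeData::Text(t) => Some(t),
            NodeData::Element(_) => None,
        }
    }
}
```

<p>Code that used to read struct fields directly would then go through one
of these accessors and handle the <code>None</code> case explicitly.</p>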
2192 <h2>Second step: move the Element variant to a separate allocation</h2>
2193 <p>Now, <a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/8312bc00b6a4abbac82fef0596fb25cad3a56eaf">we turn <code>NodeData</code> into
2194 this</a>:</p>
2195 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">NodeData</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2196 <span class="w"> </span><span class="n">Element</span><span class="p">(</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">Element</span><span class="o">&gt;</span><span class="p">),</span><span class="w"> </span><span class="c1">// This goes inside a Box</span>
2197 <span class="w"> </span><span class="n">Text</span><span class="p">(</span><span class="n">NodeChars</span><span class="p">),</span><span class="w"></span>
2198 <span class="p">}</span><span class="w"></span>
2199 </code></pre></div>
2200
2201 <p>That way, the <code>Element</code> variant is the size of a pointer (i.e. a
2202 pointer to the heap-allocated <code>Box</code>), and the <code>Text</code> variant is as big
2203 as <code>NodeChars</code> as usual.</p>
2204 <p>This means that <code>Element</code> nodes are just as big as before, plus an
2205 extra pointer, plus an extra heap allocation.</p>
2206 <p>However, the <code>Text</code> nodes get a lot smaller!</p>
2207 <ul>
2208 <li>Before: <code>size_of::&lt;NodeData&gt;() = 1808</code></li>
2209 <li>After: <code>size_of::&lt;NodeData&gt;() = 72</code></li>
2210 </ul>
2211 <p>Because the <code>Element</code> variant is now a lot smaller (the size of a <code>Box</code>,
2212 which is just a pointer), it no longer imposes extra overhead on the <code>Text</code>
2213 variant.</p>
2214 <p>This means that in the SVG file, all the whitespace between XML
2215 elements now takes a lot less memory.</p>
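<p>The effect is easy to reproduce in isolation. The following sketch uses
two made-up payload types (<code>BigElement</code> and <code>SmallText</code> are hypothetical,
not librsvg's types) and compares the size of an enum with and without a
<code>Box</code> around the large variant.</p>

```rust
use std::mem::size_of;

// Hypothetical stand-ins: a large element payload and a small text payload.
pub struct BigElement {
    pub payload: [u8; 1024],
}

pub struct SmallText {
    pub chars: String,
}

// Every value of this enum is at least as large as its biggest variant.
pub enum Unboxed {
    Element(BigElement),
    Text(SmallText),
}

// Here the Element variant only stores a pointer to a heap allocation.
pub enum Boxed {
    Element(Box<BigElement>),
    Text(SmallText),
}

/// Returns (size of the unboxed enum, size of the boxed enum) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<Unboxed>(), size_of::<Boxed>())
}
```

<p>The exact numbers returned by <code>sizes()</code> depend on the target and the
compiler version, but the unboxed enum can never be smaller than its
1024-byte variant, while the boxed one stays within a few machine words.</p>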
2216 <h2>Some numbers from a pathological file</h2>
2217 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/issues/42">Issue 42</a> is about
2218 an SVG file that is just a <code>&lt;use&gt;</code> element repeated many times, once
2219 per line:</p>
2220 <div class="highlight"><pre><span></span><code><span class="nt">&lt;svg</span> <span class="na">xmlns=</span><span class="s">&quot;http://www.w3.org/2000/svg&quot;</span><span class="nt">&gt;</span>
2221 <span class="nt">&lt;defs&gt;</span>
2222 <span class="nt">&lt;symbol</span> <span class="na">id=</span><span class="s">&quot;glyph0-0&quot;</span><span class="nt">&gt;</span>
2223 <span class="c">&lt;!-- a few elements here --&gt;</span>
2224 <span class="nt">&lt;/symbol&gt;</span>
2225 <span class="nt">&lt;/defs&gt;</span>
2226
2227 <span class="nt">&lt;use</span> <span class="na">xlink:href=</span><span class="s">&quot;#glyph0-0&quot;</span> <span class="na">x=</span><span class="s">&quot;1&quot;</span> <span class="na">y=</span><span class="s">&quot;10&quot;</span><span class="nt">/&gt;</span>
2228 <span class="nt">&lt;use</span> <span class="na">xlink:href=</span><span class="s">&quot;#glyph0-0&quot;</span> <span class="na">x=</span><span class="s">&quot;1&quot;</span> <span class="na">y=</span><span class="s">&quot;10&quot;</span><span class="nt">/&gt;</span>
2229 <span class="nt">&lt;use</span> <span class="na">xlink:href=</span><span class="s">&quot;#glyph0-0&quot;</span> <span class="na">x=</span><span class="s">&quot;1&quot;</span> <span class="na">y=</span><span class="s">&quot;10&quot;</span><span class="nt">/&gt;</span>
2230 <span class="c">&lt;!-- about 196,000 similar lines --&gt;</span>
2231 <span class="nt">&lt;/svg&gt;</span>
2232 </code></pre></div>
2233
2234 <p>So we have around 196,000 elements. According to <a href="https://valgrind.org/docs/manual/ms-manual.html">Valgrind's Massif
2235 tool</a>, this makes <code>rsvg-convert</code> allocate 800,501,568 bytes in the
2236 old version, versus 463,412,720 bytes in the new version, or about 60%
2237 of the space.</p>
2238 <h2>Next steps</h2>
2239 <p>There is a lot of repetition in the text nodes of a typical SVG file.
2240 For example, in that pathological file above, most of the whitespace is
2241 identical: between each element there is a newline and two spaces.
2242 Instead of having thousands of little allocations, all with the same
2243 string, there could be a pool of shared strings. Files with "real"
2244 indentation could get benefits from sharing the whitespace-only text
2245 nodes.</p>
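<p>One possible shape for such a pool, as a minimal sketch: <code>StringPool</code>
and <code>intern</code> are hypothetical names, and this is not something librsvg
implements at this point. It hands out reference-counted <code>Rc&lt;str&gt;</code>
handles so that identical strings share a single allocation.</p>

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// A minimal pool of shared, immutable strings.
#[derive(Default)]
pub struct StringPool {
    strings: HashSet<Rc<str>>,
}

impl StringPool {
    /// Returns a shared handle for `s`, allocating only the first time
    /// a given string is seen.
    pub fn intern(&mut self, s: &str) -> Rc<str> {
        if let Some(existing) = self.strings.get(s) {
            return Rc::clone(existing);
        }
        let shared: Rc<str> = Rc::from(s);
        self.strings.insert(Rc::clone(&shared));
        shared
    }

    /// Number of distinct strings stored in the pool.
    pub fn len(&self) -> usize {
        self.strings.len()
    }
}
```

<p>With something like this, every whitespace-only text node made of a
newline and two spaces would hold a handle to the same string instead of its
own copy.</p>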
2246 <p>Real browser engines are very careful to share the style structs
2247 across elements if possible. Look for "style struct sharing" in
2248 <a href="https://hacks.mozilla.org/2017/08/inside-a-super-fast-css-engine-quantum-css-aka-stylo/">"Inside a super fast CSS engine: Quantum CSS"</a>. This is
2249 going to take some good work in librsvg, but we can get there
2250 gradually.</p>
2251 <h2>References</h2>
2252 <ul>
2253 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/302/commits">Commits for the whole
2254 refactoring</a></li>
2255 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/741a0c0bb4c9bc5dc4b246a92d9ba2e26275d45d">Turn the NodeData struct into an enum with variants for Element and
2256 Text</a></li>
2257 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/-/commit/8312bc00b6a4abbac82fef0596fb25cad3a56eaf">Box the Element variant to make Text nodes
2258 smaller</a>.
2259 The commit message has parts of the massif log with all the
2260 interesting numbers.</li>
2261 <li><a href="https://gitlab.gnome.org/GNOME/librsvg/issues/528">Tracker bug for memory
2262 consumption</a></li>
2263 </ul></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category><category term="performance"></category></entry><entry><title>Exposing C and Rust APIs: some thoughts from librsvg</title><link href="https://people.gnome.org/~federico/blog/exposing-c-and-rust-apis.html" rel="alternate"></link><published>2020-01-15T11:15:06-06:00</published><updated>2020-01-15T11:15:06-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2020-01-15:/~federico/blog/exposing-c-and-rust-apis.html</id><summary type="html"><p>Librsvg exports two public APIs: the <a href="https://developer.gnome.org/rsvg/stable/">C API</a> that is in turn available
2264 to other languages through <a href="https://people.gnome.org/~federico/blog/magic-of-gobject-introspection.html">GObject Introspection</a>, and the <a href="https://gnome.pages.gitlab.gnome.org/librsvg/doc/librsvg/">Rust API</a>.</p>
2265 <p>You could call this a use of the <a href="https://en.wikipedia.org/wiki/Facade_pattern">facade pattern</a> on top of the
2266 <a href="https://gnome.pages.gitlab.gnome.org/librsvg/doc/rsvg_internals/index.html">rsvg_internals crate</a>. That crate <em>is</em> the actual
2267 implementation of librsvg, and exports an …</p></summary><content type="html"><p>Librsvg exports two public APIs: the <a href="https://developer.gnome.org/rsvg/stable/">C API</a> that is in turn available
2268 to other languages through <a href="https://people.gnome.org/~federico/blog/magic-of-gobject-introspection.html">GObject Introspection</a>, and the <a href="https://gnome.pages.gitlab.gnome.org/librsvg/doc/librsvg/">Rust API</a>.</p>
2269 <p>You could call this a use of the <a href="https://en.wikipedia.org/wiki/Facade_pattern">facade pattern</a> on top of the
2270 <a href="https://gnome.pages.gitlab.gnome.org/librsvg/doc/rsvg_internals/index.html">rsvg_internals crate</a>. That crate <em>is</em> the actual
2271 implementation of librsvg, and exports an interface with many knobs
2272 that are not exposed from the public APIs. The knobs are to allow for
2273 the variations in each of those APIs.</p>
2274 <p>This post is about some interesting things that have come up during
2275 the creation/separation of those public APIs, and the implications of
2276 having an internals library that implements both.</p>
2277 <h2>Initial code organization</h2>
2278 <p>When librsvg was being ported to Rust, it just had an <code>rsvg_internals</code>
2279 crate that compiled as a <code>staticlib</code> to a <code>.a</code> library, which was
2280 later linked into the final <code>librsvg.so</code>.</p>
2281 <p>Eventually the code got to the point where it was feasible to port the
2282 toplevel C API to Rust. This was relatively easy to do, since
2283 everything else underneath was already in Rust. At that point I
2284 became interested <a href="https://people.gnome.org/~federico/blog/a-rust-api-for-librsvg.html">in also having a Rust API</a> for librsvg —
2285 first to port the test suite to Rust and be able to run tests in
2286 parallel, and then to actually have a public API in Rust with more
2287 modern idioms than the historical, GObject-based API in C.</p>
2288 <p>Version <a href="https://gitlab.gnome.org/GNOME/librsvg/-/tags/2.45.5">2.45.5</a>, from February 2019, is the last release that only had
2289 a C API.</p>
2290 <p>Most of the C API of librsvg is in the <code>RsvgHandle</code> class. An
2291 <code>RsvgHandle</code> gets loaded with SVG data from a file or a stream, and
2292 then gets rendered to a Cairo context. The naming of Rust source
2293 files more or less matched the C source files, so where there was
2294 <code>rsvg-handle.c</code> initially, later we had <code>handle.rs</code> with the Rustified
2295 part of that code.</p>
2296 <p>So, <code>handle.rs</code> had the Rust internals of the <code>RsvgHandle</code> class, and
2297 a bunch of <code>extern "C"</code> functions callable from C. For example, for
2298 this function in the public C API:</p>
2299 <div class="highlight"><pre><span></span><code><span class="kt">void</span> <span class="nf">rsvg_handle_set_base_gfile</span> <span class="p">(</span><span class="n">RsvgHandle</span> <span class="o">*</span><span class="n">handle</span><span class="p">,</span>
2300 <span class="n">GFile</span> <span class="o">*</span><span class="n">base_file</span><span class="p">);</span>
2301 </code></pre></div>
2302
2303 <p>The corresponding Rust implementation <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/3d6d42fe1387588354291595585fbf498a89704e/rsvg_internals/src/handle.rs#L614-626">was this</a>:</p>
2304 <div class="highlight"><pre><span></span><code><span class="cp">#[no_mangle]</span><span class="w"></span>
2305 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_handle_rust_set_base_gfile</span><span class="p">(</span><span class="w"></span>
2306 <span class="w"> </span><span class="n">raw_handle</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RsvgHandle</span><span class="p">,</span><span class="w"></span>
2307 <span class="w"> </span><span class="n">raw_gfile</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">gio_sys</span>::<span class="n">GFile</span><span class="p">,</span><span class="w"></span>
2308 <span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2309 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">rhandle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_rust_handle</span><span class="p">(</span><span class="n">raw_handle</span><span class="p">);</span><span class="w"> </span><span class="c1">// 1</span>
2310
2311 <span class="w"> </span><span class="fm">assert!</span><span class="p">(</span><span class="o">!</span><span class="n">raw_gfile</span><span class="p">.</span><span class="n">is_null</span><span class="p">());</span><span class="w"> </span><span class="c1">// 2</span>
2312 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">file</span>: <span class="nc">gio</span>::<span class="n">File</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_glib_none</span><span class="p">(</span><span class="n">raw_gfile</span><span class="p">);</span><span class="w"> </span><span class="c1">// 3</span>
2313
2314 <span class="w"> </span><span class="n">rhandle</span><span class="p">.</span><span class="n">set_base_gfile</span><span class="p">(</span><span class="o">&amp;</span><span class="n">file</span><span class="p">);</span><span class="w"> </span><span class="c1">// 4</span>
2315 <span class="p">}</span><span class="w"></span>
2316 </code></pre></div>
2317
2318 <ol>
2319 <li>Get the Rust struct corresponding to the C GObject.</li>
2320 <li>Check the arguments.</li>
2321 <li>Convert from C GObject reference to Rust reference.</li>
2322 <li>Call the actual implementation of <code>set_base_gfile</code> in the Rust
2323 struct.</li>
2324 </ol>
2325 <p>You can see that this function takes in arguments with C types, and
2326 converts them to Rust types. It's basically just glue between the C
2327 code and the actual implementation.</p>
2328 <p>Then, the actual implementation of <code>set_base_gfile</code> looked <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/3d6d42fe1387588354291595585fbf498a89704e/rsvg_internals/src/handle.rs#L202-208">like
2329 this</a>:</p>
2330 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">Handle</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2331 <span class="w"> </span><span class="k">fn</span> <span class="nf">set_base_gfile</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">file</span>: <span class="kp">&amp;</span><span class="nc">gio</span>::<span class="n">File</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2332 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">uri</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">file</span><span class="p">.</span><span class="n">get_uri</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2333 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">set_base_url</span><span class="p">(</span><span class="o">&amp;</span><span class="n">uri</span><span class="p">);</span><span class="w"></span>
2334 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2335 <span class="w"> </span><span class="n">rsvg_g_warning</span><span class="p">(</span><span class="s">&quot;file has no URI; will not set the base URI&quot;</span><span class="p">);</span><span class="w"></span>
2336 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2337 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2338 <span class="p">}</span><span class="w"></span>
2339 </code></pre></div>
2340
2341 <p>This is an actual method for a Rust <code>Handle</code> struct, and takes Rust
2342 types as arguments — no conversions are necessary here. However,
2343 there is a pesky call to <code>rsvg_g_warning</code>, about which I'll talk later.</p>
2344 <p>I found it cleanest, although not the shortest code, to structure
2345 things like this:</p>
2346 <ul>
2347 <li>
2348 <p>C code: bunch of stub functions where <code>rsvg_blah</code> just calls a
2349 corresponding <code>rsvg_rust_blah</code>.</p>
2350 </li>
2351 <li>
2352 <p>Toplevel Rust code: bunch of <code>#[no_mangle] unsafe extern "C" fn rsvg_rust_blah()</code> that
2353 convert from C argument types to Rust types, and call safe Rust
2354 functions — for librsvg, these happened to be methods for a struct.
2355 Before returning, the toplevel functions convert Rust return values
2356 to C return values, and do things like converting the <code>Err(E)</code> of a
2357 <code>Result&lt;&gt;</code> into a <code>GError</code> or a boolean or whatever the traditional
2358 C API required.</p>
2359 </li>
2360 </ul>
2361 <p>In the very first versions of the code where the public API was
2362 implemented in Rust, the <code>extern "C"</code> functions actually contained
2363 their implementation. However, after some refactoring, it turned out
2364 to be cleaner to leave those functions just with the task of
2365 converting C to Rust types and vice-versa, and put the actual
2366 implementation in very Rust-y code. This made it easier to keep the
2367 <code>unsafe</code> conversion code (unsafe because it deals with raw pointers
2368 coming from C) only in the toplevel functions.</p>
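<p>The split between unsafe glue and safe implementation can be sketched
with the standard library alone. Everything here is hypothetical
(<code>parse_dpi</code> and <code>example_rust_parse_dpi</code> are not librsvg functions), and
the error is reduced to a C-style boolean instead of a <code>GError</code>.</p>

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Safe implementation: only Rust types in, a `Result` out.
/// (Hypothetical function, not part of librsvg.)
pub fn parse_dpi(s: &str) -> Result<f64, String> {
    let dpi: f64 = s
        .trim()
        .parse()
        .map_err(|e| format!("invalid DPI: {}", e))?;
    if dpi > 0.0 {
        Ok(dpi)
    } else {
        Err(String::from("DPI must be positive"))
    }
}

/// Unsafe glue: convert C types to Rust types, call the safe function,
/// and turn the `Result` into a C-style boolean plus an out-parameter.
/// A real exported entry point would also be marked `#[no_mangle]`.
///
/// # Safety
/// `raw_str` must point to a NUL-terminated string and `out_dpi` to a
/// writable `f64`.
pub unsafe extern "C" fn example_rust_parse_dpi(
    raw_str: *const c_char,
    out_dpi: *mut f64,
) -> c_int {
    // Check the arguments coming from C.
    assert!(!raw_str.is_null());
    assert!(!out_dpi.is_null());

    // Convert the C string to a Rust string slice.
    let s = match CStr::from_ptr(raw_str).to_str() {
        Ok(s) => s,
        Err(_) => return 0, // not valid UTF-8
    };

    // Call the safe implementation and convert its result for C.
    match parse_dpi(s) {
        Ok(dpi) => {
            *out_dpi = dpi;
            1
        }
        Err(_) => 0,
    }
}
```

<p>All the raw-pointer handling stays in the <code>extern "C"</code> function, and
<code>parse_dpi</code> can be tested and reused from Rust without any <code>unsafe</code>
code.</p>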
2369 <h2>Growing out a Rust API</h2>
2370 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/commit/cd848439e01378461647dd848918262bf6639193">This commit</a> is where the new, public Rust API
2371 started. That commit just created a <a href="https://doc.rust-lang.org/cargo/reference/manifest.html#the-workspace-section">Cargo workspace</a>
2372 with two crates; the <code>rsvg_internals</code> crate that we already had, and a
2373 <code>librsvg_crate</code> with the public Rust API.</p>
2374 <p>The commits over the subsequent couple of months are of intense
2375 refactoring:</p>
2376 <ul>
2377 <li>
2378 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/commit/d01f75ed310b0a7b8d62fc7de65465357f20f535">This commit</a> moves the <code>unsafe extern "C"</code>
2379 functions to a separate <code>c_api.rs</code> source file. This leaves
2380 <code>handle.rs</code> with only the safe Rust implementation of the toplevel
2381 API, and <code>c_api.rs</code> with the unsafe entry points that mostly just
2382 convert argument types, return values, and errors.</p>
2383 </li>
2384 <li>
2385 <p>The API primitives get expanded to allow for a public Rust API that
2386 is "hard to misuse" unlike the C API, which needs to be called
2387 in a certain order.</p>
2388 </li>
2389 </ul>
2390 <h2>Needing to call a C macro</h2>
2391 <p>However, there was a little problem. The Rust code cannot call
2392 <a href="https://developer.gnome.org/glib/stable/glib-Message-Logging.html#g-warning"><code>g_warning</code></a>, a C macro in glib that prints a message to
2393 stderr or uses structured logging. Librsvg used that to signal
2394 conditions where something went (recoverably) wrong, but there was no
2395 way to return a proper error code to the caller — it's mainly used as
2396 a debugging aid.</p>
2397 <p>This is what <code>rsvg_internals</code> used to do in order to call that C macro:</p>
2398 <p>First, the C code exports a function that just calls the macro:</p>
2399 <div class="highlight"><pre><span></span><code><span class="cm">/* This function exists just so that we can effectively call g_warning() from Rust,</span>
2400 <span class="cm"> * since glib-rs doesn&#39;t bind the g_log functions yet.</span>
2401 <span class="cm"> */</span>
2402 <span class="kt">void</span>
2403 <span class="nf">rsvg_g_warning_from_c</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">msg</span><span class="p">)</span>
2404 <span class="p">{</span>
2405 <span class="n">g_warning</span> <span class="p">(</span><span class="s">&quot;%s&quot;</span><span class="p">,</span> <span class="n">msg</span><span class="p">);</span>
2406 <span class="p">}</span>
2407 </code></pre></div>
2408
2409 <p>Second, the Rust code binds that function to be callable from Rust:</p>
2410 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_g_warning</span><span class="p">(</span><span class="n">msg</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2411 <span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2412 <span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_g_warning_from_c</span><span class="p">(</span><span class="n">msg</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">libc</span>::<span class="n">c_char</span><span class="p">);</span><span class="w"></span>
2413 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2414
2415 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2416 <span class="w"> </span><span class="n">rsvg_g_warning_from_c</span><span class="p">(</span><span class="n">msg</span><span class="p">.</span><span class="n">to_glib_none</span><span class="p">().</span><span class="mi">0</span><span class="p">);</span><span class="w"></span>
2417 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2418 <span class="p">}</span><span class="w"></span>
2419 </code></pre></div>
2420
2421 <p>However! Since the standalone <code>librsvg_crate</code> does not link to the C
2422 code from the public <code>librsvg.so</code>, the helper <code>rsvg_g_warning_from_c</code>
2423 is not available!</p>
2424 <h3>A configuration feature for the internals library</h3>
2425 <p>And yet! Those warnings are only meaningful for the C API, which is
2426 not able to return error codes from all situations. However, the Rust
2427 API <em>is</em> able to do that, and so doesn't need the warnings printed to
2428 stderr. My first solution was to add a build-time option for whether
2429 the <code>rsvg_internals</code> library is being built for the C library, or for
2430 the Rust one.</p>
2431 <p>In case we are building for the C library, the code calls
2432 <code>rsvg_g_warning_from_c</code> as usual.</p>
2433 <p>But in case we are building for the Rust library, that code is a
2434 no-op.</p>
2435 <p>This is the <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/4df0ef3d6dd8c277ef484b33104e9666ea6ea38c/rsvg_internals/Cargo.toml#L81-83">bit in rsvg_internals/Cargo.toml</a> to declare the feature:</p>
2436 <div class="highlight"><pre><span></span><code><span class="k">[features]</span>
2437 <span class="c1"># Enables calling g_warning() when built as part of librsvg.so</span>
2438 <span class="n">c-library</span> <span class="o">=</span> <span class="k">[]</span>
2439 </code></pre></div>
2440
2441 <p>And <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/4df0ef3d6dd8c277ef484b33104e9666ea6ea38c/rsvg_internals/src/util.rs#L31-48">this is the corresponding code</a>:</p>
2442 <div class="highlight"><pre><span></span><code><span class="cp">#[cfg(feature = </span><span class="s">&quot;c-library&quot;</span><span class="cp">)]</span><span class="w"></span>
2443 <span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_g_warning</span><span class="p">(</span><span class="n">msg</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2444 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2445 <span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2446 <span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_g_warning_from_c</span><span class="p">(</span><span class="n">msg</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">libc</span>::<span class="n">c_char</span><span class="p">);</span><span class="w"></span>
2447 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2448
2449 <span class="w"> </span><span class="n">rsvg_g_warning_from_c</span><span class="p">(</span><span class="n">msg</span><span class="p">.</span><span class="n">to_glib_none</span><span class="p">().</span><span class="mi">0</span><span class="p">);</span><span class="w"></span>
2450 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2451 <span class="p">}</span><span class="w"></span>
2452
2453 <span class="cp">#[cfg(not(feature = </span><span class="s">&quot;c-library&quot;</span><span class="cp">))]</span><span class="w"></span>
2454 <span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_g_warning</span><span class="p">(</span><span class="n">_msg</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2455 <span class="w"> </span><span class="c1">// The only callers of this are in handle.rs. When those functions</span>
2456 <span class="w"> </span><span class="c1">// are called from the Rust API, they are able to return a</span>
2457 <span class="w"> </span><span class="c1">// meaningful error code, but the C API isn&#39;t - so they issue a</span>
2458 <span class="w"> </span><span class="c1">// g_warning() instead.</span>
2459 <span class="p">}</span><span class="w"></span>
2460 </code></pre></div>
2461
2462 <p>The first function is the one that is compiled when the <code>c-library</code>
2463 feature is enabled; this happens when building <code>rsvg_internals</code> to
2464 link into <code>librsvg.so</code>.</p>
2465 <p>The second function does nothing; it is what is compiled when
2466 <code>rsvg_internals</code> is being used just from the <code>librsvg_crate</code> crate
2467 with the Rust API.</p>
2468 <p>While this worked well, it meant that <strong>the internals library was
2469 built twice</strong> on each compilation run of the whole librsvg module:
2470 once for <code>librsvg.so</code>, and once for <code>librsvg_crate</code>.</p>
2471 <h2>Making programming errors a <code>g_critical</code></h2>
2472 <p>While <code>g_warning()</code> means "something went wrong, but the program will
2473 continue", <code>g_critical()</code> means "there is a programming error". For
2474 historical reasons Glib does not abort when <code>g_critical()</code> is called,
2475 unless you set <a href="https://developer.gnome.org/glib/stable/glib-running.html#G-DEBUG:CAPS"><code>G_DEBUG=fatal-criticals</code></a> or run
2476 a development version of Glib.</p>
2477 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/commit/9d26e03dc64342978ab8273dcf6a474af3843b9f">This commit</a> turned warnings into critical errors when the
2478 C API was called out of order, by using a similar
2479 <code>rsvg_g_critical_from_c()</code> wrapper for a C macro.</p>
2480 <h2>Separating the C-callable code into yet another crate</h2>
2481 <p>To recapitulate, at that point we had this:</p>
2482 <div class="highlight"><pre><span></span><code><span class="nv">librsvg</span><span class="o">/</span>
2483 <span class="o">|</span> <span class="nv">Cargo</span>.<span class="nv">toml</span> <span class="o">-</span> <span class="nv">declares</span> <span class="nv">the</span> <span class="nv">Cargo</span> <span class="nv">workspace</span>
2484 <span class="o">|</span>
2485 <span class="o">+-</span> <span class="nv">rsvg_internals</span><span class="o">/</span>
2486 <span class="o">|</span> <span class="o">|</span> <span class="nv">Cargo</span>.<span class="nv">toml</span>
2487 <span class="o">|</span> <span class="o">+-</span> <span class="nv">src</span><span class="o">/</span>
2488 <span class="o">|</span> <span class="nv">c_api</span>.<span class="nv">rs</span> <span class="o">-</span> <span class="nv">convert</span> <span class="nv">types</span> <span class="nv">and</span> <span class="k">return</span> <span class="nv">values</span>, <span class="k">call</span> <span class="nl">into</span> <span class="nv">implementation</span>
2489 <span class="o">|</span> <span class="nv">handle</span>.<span class="nv">rs</span> <span class="o">-</span> <span class="nv">actual</span> <span class="nv">implementation</span>
2490 <span class="o">|</span> <span class="o">*</span>.<span class="nv">rs</span> <span class="o">-</span> <span class="nv">all</span> <span class="nv">the</span> <span class="nv">other</span> <span class="nv">internals</span>
2491 <span class="o">|</span>
2492 <span class="o">+-</span> <span class="nv">librsvg</span><span class="o">/</span>
2493 <span class="o">|</span> <span class="o">*</span>.<span class="nv">c</span> <span class="o">-</span> <span class="nv">stub</span> <span class="nv">functions</span> <span class="nv">that</span> <span class="k">call</span> <span class="nl">into</span> <span class="nv">Rust</span>
2494 <span class="o">|</span> <span class="nv">rsvg</span><span class="o">-</span><span class="nv">base</span>.<span class="nv">c</span> <span class="o">-</span> <span class="nv">contains</span> <span class="nv">rsvg_g_warning_from_c</span><span class="ss">()</span> <span class="nv">among</span> <span class="nv">others</span>
2495 <span class="o">|</span>
2496 <span class="o">+-</span> <span class="nv">librsvg_crate</span><span class="o">/</span>
2497 <span class="o">|</span> <span class="nv">Cargo</span>.<span class="nv">toml</span>
2498 <span class="o">+-</span> <span class="nv">src</span><span class="o">/</span>
2499 <span class="o">|</span> <span class="nv">lib</span>.<span class="nv">rs</span> <span class="o">-</span> <span class="nv">public</span> <span class="nv">Rust</span> <span class="nv">API</span>
2500 <span class="o">+-</span> <span class="nv">tests</span><span class="o">/</span> <span class="o">-</span> <span class="nv">tests</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">public</span> <span class="nv">Rust</span> <span class="nv">API</span>
2501 <span class="o">*</span>.<span class="nv">rs</span>
2502 </code></pre></div>
2503
2504 <p>At this point <code>c_api.rs</code> with all the <code>unsafe</code> functions looked out of
2505 place. That code is only relevant to <code>librsvg.so</code> (the public C API),
2506 not to the Rust API in <code>librsvg_crate</code>.</p>
2507 <p>I started moving the C API glue to a separate <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/78219898d17440a41d21a206afa5a5d982dcbf9f"><code>librsvg_c_api</code> crate</a> that lives
2508 along with the C stubs:</p>
2509 <div class="highlight"><pre><span></span><code>+- librsvg/
2510 | *.c - stub functions that call into Rust
2511 | rsvg-base.c - contains rsvg_g_warning_from_c() among others
2512 | Cargo.toml
2513 | c_api.rs - what we had before
2514 </code></pre></div>
2515
2516 <p>This made the dependencies look like the following:</p>
2517 <div class="highlight"><pre><span></span><code> rsvg_internals
2518 ^ ^
2519 | \
2520 | \
2521 librsvg_crate librsvg_c_api
2522 (Rust API) ^
2523 |
2524 librsvg.so
2525 (C API)
2526 </code></pre></div>
2527
2528 <p>And also, this made it possible to <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/38fe68c2c3165a923dee9cfaa9a9f57960fb9b95">remove the configuration feature</a>
2529 for <code>rsvg_internals</code>, since the code that calls
2530 <code>rsvg_g_warning_from_c</code> now lives in <code>librsvg_c_api</code>.</p>
2531 <p>With that, <code>rsvg_internals</code> is compiled only once, as it should be.</p>
2532 <p>This also helped clean up some code in the internals library.
2533 Deprecated functions that render SVGs directly to <code>GdkPixbuf</code> are now
2534 in <code>librsvg_c_api</code> and don't clutter the <code>rsvg_internals</code> library.
2535 All the GObject boilerplate is there as well now; <code>rsvg_internals</code> is
2536 mostly safe code except for the glue to libxml2.</p>
2537 <h2>Summary</h2>
2538 <p>It was useful to move all the code that dealt with incoming C types,
2539 our outgoing C return values and errors, into the same place, and
2540 separate it from the "pure Rust" code.</p>
2541 <p>This took gradual refactoring and was not done in a single step, but
2542 it left the resulting Rust code rather nice and clean.</p>
2543 <p>When we added a new public Rust API, we had to shuffle some code
2544 around that could only be linked in the context of a C library.</p>
2545 <p>Compile-time configuration features are useful (like <code>#ifdef</code> in the C
2546 world), but they do cause double compilation if you need a C-internals
2547 and a Rust-internals library from the same code.</p>
2548 <p>Having proper error reporting throughout the Rust code is a lot of
2549 work, but pretty much invaluable. The glue code to C can then convert
2550 and expose those errors as needed.</p>
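<p>As a minimal sketch of that last point, the following shows internals that return a proper <code>Result</code>, and a thin C-facing function that converts it to an integer status code. Everything here is hypothetical and for illustration only: <code>LoadError</code>, <code>load_document</code>, <code>demo_load_document</code> and the status codes are made-up names, not librsvg's actual API.</p>

```rust
/// Hypothetical error type for the "pure Rust" internals.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    EmptyInput,
    InvalidUtf8,
}

/// Internals: no C types, errors are reported through `Result`.
/// (Counting characters stands in for real work.)
pub fn load_document(data: &[u8]) -> Result<usize, LoadError> {
    if data.is_empty() {
        return Err(LoadError::EmptyInput);
    }
    match std::str::from_utf8(data) {
        Ok(s) => Ok(s.chars().count()),
        Err(_) => Err(LoadError::InvalidUtf8),
    }
}

/// C-facing glue: converts incoming C types and the outgoing error.
/// Returns 0 on success and a negative (made-up) status code on failure.
/// A real library would also export this symbol without name mangling.
///
/// Safety: `data` must point to `len` readable bytes, and `out_count`
/// must point to writable storage, unless either is null.
pub unsafe extern "C" fn demo_load_document(
    data: *const u8,
    len: usize,
    out_count: *mut usize,
) -> i32 {
    if data.is_null() || out_count.is_null() {
        return -1;
    }

    // Convert the incoming C pointer + length into a Rust slice.
    let slice = unsafe { std::slice::from_raw_parts(data, len) };

    // Call into the internals and translate the result for C callers.
    match load_document(slice) {
        Ok(count) => {
            unsafe { *out_count = count };
            0
        }
        Err(LoadError::EmptyInput) => -2,
        Err(LoadError::InvalidUtf8) => -3,
    }
}
```

<p>All the <code>unsafe</code> code and the knowledge of C conventions stay in the glue function; <code>load_document</code> can be tested and reused from a Rust API without any of it.</p>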
2551 <p>If you need both C and Rust APIs for the same code base, you may end
2552 up naturally using a facade pattern for each. It helps to gradually
2553 refactor the internals to be as "pure idiomatic Rust" as possible,
2554 while letting API idiosyncrasies bubble up to each individual facade.</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category></entry><entry><title>Moving gnome-shell's styles to Rust</title><link href="https://people.gnome.org/~federico/blog/moving-gnome-shell-styles-to-rust.html" rel="alternate"></link><published>2019-11-25T21:10:06-06:00</published><updated>2019-11-25T21:10:06-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-11-25:/~federico/blog/moving-gnome-shell-styles-to-rust.html</id><summary type="html"><p>Gnome-shell uses CSS processing code that dates from
2555 <a href="https://blog.ometer.com/2006/10/14/text-layout-that-works-properly/">HippoCanvas</a>,
2556 a CSS-aware canvas from around 2006. It uses libcroco to parse CSS,
2557 and implements selector matching by hand in C.</p>
2558 <p>This code is getting rather dated, and libcroco is unmaintained.</p>
2559 <p>I've been reading the code for
2560 <a href="https://gitlab.gnome.org/GNOME/gnome-shell/blob/66fc5c07/src/st/st-theme.c"><code>StTheme</code></a>
2561 and
2562 <a href="https://gitlab.gnome.org/GNOME/gnome-shell/blob/66fc5c07/src/st/st-theme-node.c"><code>StThemeNode</code></a>,
2563 and it …</p></summary><content type="html"><p>Gnome-shell uses CSS processing code that dates from
2564 <a href="https://blog.ometer.com/2006/10/14/text-layout-that-works-properly/">HippoCanvas</a>,
2565 a CSS-aware canvas from around 2006. It uses libcroco to parse CSS,
2566 and implements selector matching by hand in C.</p>
2567 <p>This code is getting rather dated, and libcroco is unmaintained.</p>
2568 <p>I've been reading the code for
2569 <a href="https://gitlab.gnome.org/GNOME/gnome-shell/blob/66fc5c07/src/st/st-theme.c"><code>StTheme</code></a>
2570 and
2571 <a href="https://gitlab.gnome.org/GNOME/gnome-shell/blob/66fc5c07/src/st/st-theme-node.c"><code>StThemeNode</code></a>,
2572 and it looks very feasible to port it gradually to Rust, by using the
2573 same crates that librsvg uses, and eventually removing libcroco
2574 altogether: <strong>gnome-shell is the last module that uses libcroco in
2575 distro packages</strong>.</p>
2576 <h2>Strategy</h2>
2577 <p><code>StTheme</code> and <code>StThemeNode</code> use libcroco to load CSS stylesheets and
2578 keep them in memory. The values of individual properties are just
2579 tokenized and kept around as a linked list of <code>CRTerm</code>; this struct
2580 represents a single token.</p>
2581 <p>Later, the drawing code uses functions like
2582 <code>st_theme_node_lookup_color(node, "property_name")</code> or
2583 <code>st_theme_node_lookup_length()</code> to query the various properties that
2584 it needs. It is <em>then</em> that the type of each property gets
2585 determined: prior to that step, property values are just tokenized,
2586 not parsed into usable values.</p>
2587 <p>I am going to start by porting the individual parsers to Rust, similar
2588 to what Paolo and I did for librsvg. It turns out that there's some
2589 code we can share.</p>
2590 <p>So far I have the <a href="https://gitlab.gnome.org/federico/stylish/blob/master/src/color.rs">parser for
2591 colors</a>
2592 implemented in Rust. This <a href="https://gitlab.gnome.org/federico/gnome-shell/commit/f1bc7b8ece4dd3384a76c46492d61de70b3d670a">removes a little bunch of
2593 code</a>
2594 from the C parsers, and replaces it with a little Rust code, since the
2595 cssparser crate can already parse CSS colors with alpha with no extra
2596 work — libcroco didn't support alpha.</p>
2597 <p>As a bonus, this supports <code>hsl()</code> colors in addition to <code>rgb()</code> ones
2598 out of the box!</p>
2599 <p>After all the parsers are done, the next step would be to convert the
2600 representation of complete stylesheets into pure Rust code.</p>
2601 <h2>What can we expect?</h2>
2602 <p><strong>A well-maintained CSS stack.</strong> Firefox and Servo both use the
2603 crates in question, so librsvg and gnome-shell should get maintenance
2604 of a robust CSS stack "for free", for the foreseeable future.</p>
2605 <p><strong>Speed.</strong> Caveat: I have no profile data for gnome-shell yet, so I don't
2606 know how much time it spends doing CSS parsing and cascading, but it
2607 looks like the Rust version has a good chance of being more efficient.</p>
2608 <p>The <a href="https://docs.rs/selectors/">selectors crate</a> has some very
2609 interesting
2610 <a href="https://hacks.mozilla.org/2017/08/inside-a-super-fast-css-engine-quantum-css-aka-stylo/">optimizations</a>
2611 from Mozilla Servo, and it is also now used in Firefox. It supports
2612 doing selector matching using Bloom filters, and can also avoid
2613 re-cascading child nodes if a change to a parent would not cause its
2614 children to change.</p>
2615 <p>All the parsing is done with zero-copy parsers thanks to Rust's string
2616 slices; without so many <code>malloc()</code> calls in the parsing code path,
2617 the parsing stage should really fly.</p>
2618 <p><strong>More CSS features.</strong> The selectors crate can do matching on
2619 basically all kinds of selectors as defined by recent CSS specs; one
2620 just has to provide the correct hooks into the calling code's
2621 representation of the DOM tree. The kind of matching that <code>StTheme</code>
2622 can do is somewhat limited; the rustification should make it match
2623 much more closely what people expect from CSS engines in web
2624 browsers.</p>
2625 <p><strong>A well-defined model of property inheritance.</strong> <code>StThemeNode</code>'s
2626 model for CSS property inheritance is a bit ad-hoc and inconsistent.
2627 I haven't quite tested it, but from looking at the code, it seems that
2628 not all properties get inherited in the same way. I hope to move it
2629 to something closer to what librsvg already does, which should make it
2630 match people's expectations from the web.</p>
2631 <h2>In the meantime</h2>
2632 <p>I have a merge request ready to simply move the libcroco source code
2633 directly inside gnome-shell's source tree. This should let distros
2634 remove their libcroco package as soon as possible. That MR does not
2635 require Rust yet.</p>
2636 <p>My playground is here:</p>
2637 <ul>
2638 <li><a href="https://gitlab.gnome.org/federico/gnome-shell/commits/rustify-styles">Gnome-shell branch to rustify the
2639 styles</a></li>
2640 <li><a href="https://gitlab.gnome.org/federico/stylish">Stylish</a>, a Rust
2641 library that will implement gnome-shell's styling code.</li>
2642 </ul>
2643 <p>This does not compile yet! I'll plug things together tomorrow.</p>
2644 <p>(Oh, yes, the project to redo Firefox's CSS stack in Rust used to be
2645 called Stylo. I'm calling this Stylish, as in Styles for the Shell.)</p></content><category term="misc"></category><category term="gnome-shell"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Refactoring the Length type</title><link href="https://people.gnome.org/~federico/blog/refactoring-the-length-type.html" rel="alternate"></link><published>2019-11-19T10:01:14-06:00</published><updated>2019-11-19T10:01:14-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-11-19:/~federico/blog/refactoring-the-length-type.html</id><summary type="html"><p><a href="https://www.w3.org/TR/css3-values/#lengths">CSS length values</a> have a number and a unit, e.g. <code>5cm</code>
2646 or <code>6px</code>. Sometimes the unit is a <strong>percentage</strong>, like <code>50%</code>, and SVG
2647 says that lengths with percentage units should be resolved with
2648 respect to a certain rectangle. For example, consider this circle
2649 element:</p>
2650 <div class="highlight"><pre><span></span><code><span class="nt">&lt;circle</span> <span class="na">cx=</span><span class="s">&quot;50%&quot;</span> <span class="na">cy=</span><span class="s">&quot;75 …</span></code></pre></div></summary><content type="html"><p><a href="https://www.w3.org/TR/css3-values/#lengths">CSS length values</a> have a number and a unit, e.g. <code>5cm</code>
2651 or <code>6px</code>. Sometimes the unit is a <strong>percentage</strong>, like <code>50%</code>, and SVG
2652 says that lengths with percentage units should be resolved with
2653 respect to a certain rectangle. For example, consider this circle
2654 element:</p>
2655 <div class="highlight"><pre><span></span><code><span class="nt">&lt;circle</span> <span class="na">cx=</span><span class="s">&quot;50%&quot;</span> <span class="na">cy=</span><span class="s">&quot;75%&quot;</span> <span class="na">r=</span><span class="s">&quot;4px&quot;</span> <span class="na">fill=</span><span class="s">&quot;black&quot;</span><span class="nt">/&gt;</span>
2656 </code></pre></div>
2657
2658 <p>This means, draw a solid black circle whose center is at 50% of the
2659 width and 75% of the height of the current viewport. The circle
2660 should have a 4-pixel radius.</p>
2661 <p>The process of converting that kind of units into absolute pixels for
2662 the final drawing is called <strong>normalization</strong>. In SVG, percentage
2663 units sometimes need to be normalized with respect to the current
2664 viewport (a local coordinate system), or with respect to the size of
2665 another object (e.g. when a clipping path is used to cut the current
2666 shape in half).</p>
2667 <p>One detail about normalization is that it can be with respect to the
2668 horizontal dimension of the current viewport, the vertical dimension,
2669 or both. Keep this in mind: at normalization time, we need to be able
2670 to distinguish between those three modes.</p>
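<p>The three modes can be sketched roughly as follows. This is an illustration, not librsvg's code: the names <code>Viewport</code> and <code>normalize</code> are made up, only two units are modeled, and percentages are given in percent points (<code>50.0</code> means 50%). For the "both" case the sketch assumes the formula from the SVG specification, <code>sqrt((width² + height²) / 2)</code>.</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LengthUnit {
    Px,
    Percent,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LengthDir {
    Horizontal,
    Vertical,
    Both,
}

/// Hypothetical stand-in for the current viewport's size, in pixels.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Viewport {
    pub width: f64,
    pub height: f64,
}

/// Converts a length to absolute pixels.
/// `value` is in pixels for `Px` and in percent points for `Percent`.
pub fn normalize(value: f64, unit: LengthUnit, dir: LengthDir, vp: Viewport) -> f64 {
    match unit {
        // Pixels need no normalization.
        LengthUnit::Px => value,

        // Percentages depend on which dimension the length refers to.
        LengthUnit::Percent => {
            let reference = match dir {
                LengthDir::Horizontal => vp.width,
                LengthDir::Vertical => vp.height,
                // Assumed here: the SVG spec's "normalized diagonal".
                LengthDir::Both => {
                    ((vp.width * vp.width + vp.height * vp.height) / 2.0).sqrt()
                }
            };
            value / 100.0 * reference
        }
    }
}
```

<p>For the circle above in a 200×100 viewport, <code>cx="50%"</code> would normalize horizontally to 100 pixels and <code>cy="75%"</code> vertically to 75 pixels, while <code>r="4px"</code> stays at 4.</p>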
2671 <h2>The original C version</h2>
2672 <p>I have <a href="https://people.gnome.org/~federico/news-2016-11.html#03">talked about the original C code for lengths</a> before; the
2673 following is a small summary.</p>
2674 <p>The original C code had <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/ef720eeabf6a1bf2bca9b31756398578d75998a6/rsvg-private.h#L256-259">this struct</a> to represent lengths:</p>
2675 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
2676 <span class="kt">double</span> <span class="n">length</span><span class="p">;</span>
2677 <span class="kt">char</span> <span class="n">factor</span><span class="p">;</span>
2678 <span class="p">}</span> <span class="n">RsvgLength</span><span class="p">;</span>
2679 </code></pre></div>
2680
2681 <p>The <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/ef720eeabf6a1bf2bca9b31756398578d75998a6/rsvg-css.c#L181-205">parsing code</a> would set the <code>factor</code> field to a
2682 character depending on the length's unit: <code>'p'</code> for percentages, <code>'i'</code>
2683 for inches, etc., and <code>'\0'</code> for the default unit, which is pixels.</p>
2684 <p>Along with that, the <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/ef720eeabf6a1bf2bca9b31756398578d75998a6/rsvg-css.c#L181-205">normalization code</a> needed to know
2685 the direction (horizontal, vertical, both) to which the length in
2686 question refers. It did this by taking another character as an
2687 argument to the normalization function:</p>
2688 <div class="highlight"><pre><span></span><code><span class="kt">double</span>
2689 <span class="nf">_rsvg_css_normalize_length</span> <span class="p">(</span><span class="k">const</span> <span class="n">RsvgLength</span> <span class="o">*</span> <span class="n">in</span><span class="p">,</span> <span class="n">RsvgDrawingCtx</span> <span class="o">*</span> <span class="n">ctx</span><span class="p">,</span> <span class="kt">char</span> <span class="n">dir</span><span class="p">)</span>
2690 <span class="p">{</span>
2691 <span class="k">if</span> <span class="p">(</span><span class="n">in</span><span class="o">-&gt;</span><span class="n">factor</span> <span class="o">==</span> <span class="sc">&#39;\0&#39;</span><span class="p">)</span> <span class="cm">/* pixels, no need to normalize */</span>
2692 <span class="k">return</span> <span class="n">in</span><span class="o">-&gt;</span><span class="n">length</span><span class="p">;</span>
2693 <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">in</span><span class="o">-&gt;</span><span class="n">factor</span> <span class="o">==</span> <span class="sc">&#39;p&#39;</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* percentages; need to consider direction */</span>
2694 <span class="k">if</span> <span class="p">(</span><span class="n">dir</span> <span class="o">==</span> <span class="sc">&#39;h&#39;</span><span class="p">)</span> <span class="cm">/* horizontal */</span>
2695 <span class="k">return</span> <span class="n">in</span><span class="o">-&gt;</span><span class="n">length</span> <span class="o">*</span> <span class="n">ctx</span><span class="o">-&gt;</span><span class="n">vb</span><span class="p">.</span><span class="n">rect</span><span class="p">.</span><span class="n">width</span><span class="p">;</span>
2696 <span class="k">if</span> <span class="p">(</span><span class="n">dir</span> <span class="o">==</span> <span class="sc">&#39;v&#39;</span><span class="p">)</span> <span class="cm">/* vertical */</span>
2697 <span class="k">return</span> <span class="n">in</span><span class="o">-&gt;</span><span class="n">length</span> <span class="o">*</span> <span class="n">ctx</span><span class="o">-&gt;</span><span class="n">vb</span><span class="p">.</span><span class="n">rect</span><span class="p">.</span><span class="n">height</span><span class="p">;</span>
2698 <span class="k">if</span> <span class="p">(</span><span class="n">dir</span> <span class="o">==</span> <span class="sc">&#39;o&#39;</span><span class="p">)</span> <span class="cm">/* both */</span>
2699 <span class="k">return</span> <span class="n">in</span><span class="o">-&gt;</span><span class="n">length</span> <span class="o">*</span> <span class="n">rsvg_viewport_percentage</span> <span class="p">(</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">vb</span><span class="p">.</span><span class="n">rect</span><span class="p">.</span><span class="n">width</span><span class="p">,</span>
2700 <span class="n">ctx</span><span class="o">-&gt;</span><span class="n">vb</span><span class="p">.</span><span class="n">rect</span><span class="p">.</span><span class="n">height</span><span class="p">);</span>
2701 <span class="p">}</span> <span class="k">else</span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
2702 <span class="p">}</span>
2703 </code></pre></div>
2704
2705 <p><a href="https://people.gnome.org/~federico/news-2016-11.html#03">The original post</a> talks about how I found a couple of bugs with
2706 how the directions are identified at normalization time. The function
2707 above expects one of <code>'h'/'v'/'o'</code> for horizontal/vertical/both, and one or
2708 two places in the code passed the wrong character.</p>
2709 <h2>Making the C version cleaner</h2>
2710 <p>Before converting that code to Rust, I <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/b7768db1a9cd129298737f0d0ea9fd7cd7d444a0">removed the pesky
2711 characters</a> and made the code use proper enums to
2712 identify a length's units.</p>
2713 <div class="highlight"><pre><span></span><code><span class="o">+</span><span class="k">typedef</span> <span class="k">enum</span> <span class="p">{</span>
2714 <span class="o">+</span> <span class="n">LENGTH_UNIT_DEFAULT</span><span class="p">,</span>
2715 <span class="o">+</span> <span class="n">LENGTH_UNIT_PERCENT</span><span class="p">,</span>
2716 <span class="o">+</span> <span class="n">LENGTH_UNIT_FONT_EM</span><span class="p">,</span>
2717 <span class="o">+</span> <span class="n">LENGTH_UNIT_FONT_EX</span><span class="p">,</span>
2718 <span class="o">+</span> <span class="n">LENGTH_UNIT_INCH</span><span class="p">,</span>
2719 <span class="o">+</span> <span class="n">LENGTH_UNIT_RELATIVE_LARGER</span><span class="p">,</span>
2720 <span class="o">+</span> <span class="n">LENGTH_UNIT_RELATIVE_SMALLER</span>
2721 <span class="o">+</span><span class="p">}</span> <span class="n">LengthUnit</span><span class="p">;</span>
2722 <span class="o">+</span>
2723 <span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
2724 <span class="kt">double</span> <span class="n">length</span><span class="p">;</span>
2725 <span class="o">-</span> <span class="kt">char</span> <span class="n">factor</span><span class="p">;</span>
2726 <span class="o">+</span> <span class="n">LengthUnit</span> <span class="n">unit</span><span class="p">;</span>
2727 <span class="p">}</span> <span class="n">RsvgLength</span><span class="p">;</span>
2728 </code></pre></div>
2729
2730 <p>Then, <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/cb166d90d1b4370108ce57b8651a6a7f61ccd89d">do the same for the normalization function</a>, so it will get the
2731 direction in which to normalize as an enum instead of a char.</p>
2732 <div class="highlight"><pre><span></span><code><span class="o">+</span><span class="k">typedef</span> <span class="k">enum</span> <span class="p">{</span>
2733 <span class="o">+</span> <span class="n">LENGTH_DIR_HORIZONTAL</span><span class="p">,</span>
2734 <span class="o">+</span> <span class="n">LENGTH_DIR_VERTICAL</span><span class="p">,</span>
2735 <span class="o">+</span> <span class="n">LENGTH_DIR_BOTH</span>
2736 <span class="o">+</span><span class="p">}</span> <span class="n">LengthDir</span><span class="p">;</span>
2737
2738 <span class="kt">double</span>
2739 <span class="o">-</span><span class="n">_rsvg_css_normalize_length</span> <span class="p">(</span><span class="k">const</span> <span class="n">RsvgLength</span> <span class="o">*</span> <span class="n">in</span><span class="p">,</span> <span class="n">RsvgDrawingCtx</span> <span class="o">*</span> <span class="n">ctx</span><span class="p">,</span> <span class="kt">char</span> <span class="n">dir</span><span class="p">)</span>
2740 <span class="o">+</span><span class="n">_rsvg_css_normalize_length</span> <span class="p">(</span><span class="k">const</span> <span class="n">RsvgLength</span> <span class="o">*</span> <span class="n">in</span><span class="p">,</span> <span class="n">RsvgDrawingCtx</span> <span class="o">*</span> <span class="n">ctx</span><span class="p">,</span> <span class="n">LengthDir</span> <span class="n">dir</span><span class="p">)</span>
2741 </code></pre></div>
2742
2743 <h2>Making the C version easier to get right</h2>
2744 <p>While doing the last change above, I found a place in the code that
2745 used the wrong direction by mistake, probably due to a cut&amp;paste
2746 error. Part of the problem here is that the code was specifying the
2747 direction at normalization time.</p>
2748 <p>I decided to change it so that <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/5a85e7cf2ecc90486346debc9a5a426163348f52">each length value carried its own
2749 direction from initialization</a>, so that subsequent
2750 code wouldn't have to worry about that. Hopefully, initializing a
2751 <code>width</code> field would make it obvious that it needed
2752 <code>LENGTH_DIR_HORIZONTAL</code>.</p>
2753 <div class="highlight"><pre><span></span><code> <span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
2754 <span class="kt">double</span> <span class="n">length</span><span class="p">;</span>
2755 <span class="n">LengthUnit</span> <span class="n">unit</span><span class="p">;</span>
2756 <span class="o">+</span> <span class="n">LengthDir</span> <span class="n">dir</span><span class="p">;</span>
2757 <span class="p">}</span> <span class="n">RsvgLength</span><span class="p">;</span>
2758 </code></pre></div>
2759
2760 <p>That is, so that instead of</p>
2761 <div class="highlight"><pre><span></span><code> <span class="cm">/* at initialization time */</span>
2762 <span class="n">foo</span><span class="p">.</span><span class="n">width</span> <span class="o">=</span> <span class="n">_rsvg_css_parse_length</span> <span class="p">(</span><span class="n">str</span><span class="p">);</span>
2763
2764 <span class="p">...</span>
2765
2766 <span class="cm">/* at rendering time */</span>
2767 <span class="kt">double</span> <span class="n">final_width</span> <span class="o">=</span> <span class="n">_rsvg_css_normalize_length</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">.</span><span class="n">width</span><span class="p">,</span> <span class="n">ctx</span><span class="p">,</span> <span class="n">LENGTH_DIR_HORIZONTAL</span><span class="p">);</span>
2768 </code></pre></div>
2769
2770 <p>we would instead do this:</p>
2771 <div class="highlight"><pre><span></span><code> <span class="cm">/* at initialization time */</span>
2772 <span class="n">foo</span><span class="p">.</span><span class="n">width</span> <span class="o">=</span> <span class="n">_rsvg_css_parse_length</span> <span class="p">(</span><span class="n">str</span><span class="p">,</span> <span class="n">LENGTH_DIR_HORIZONTAL</span><span class="p">);</span>
2773
2774 <span class="p">...</span>
2775
2776 <span class="cm">/* at rendering time */</span>
2777 <span class="kt">double</span> <span class="n">final_width</span> <span class="o">=</span> <span class="n">_rsvg_css_normalize_length</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">.</span><span class="n">width</span><span class="p">,</span> <span class="n">ctx</span><span class="p">);</span>
2778 </code></pre></div>
2779
2780 <p>This made the drawing code, which deals with a lot of coordinates at
2781 the same time, a lot less noisy.</p>
2782 <h2>Initial port to Rust</h2>
2783 <p>To recap, this was the state of the structs after the initial
2784 refactoring in C:</p>
2785 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">enum</span> <span class="p">{</span>
2786 <span class="n">LENGTH_UNIT_DEFAULT</span><span class="p">,</span>
2787 <span class="n">LENGTH_UNIT_PERCENT</span><span class="p">,</span>
2788 <span class="n">LENGTH_UNIT_FONT_EM</span><span class="p">,</span>
2789 <span class="n">LENGTH_UNIT_FONT_EX</span><span class="p">,</span>
2790 <span class="n">LENGTH_UNIT_INCH</span><span class="p">,</span>
2791 <span class="n">LENGTH_UNIT_RELATIVE_LARGER</span><span class="p">,</span>
2792 <span class="n">LENGTH_UNIT_RELATIVE_SMALLER</span>
2793 <span class="p">}</span> <span class="n">LengthUnit</span><span class="p">;</span>
2794
2795 <span class="k">typedef</span> <span class="k">enum</span> <span class="p">{</span>
2796 <span class="n">LENGTH_DIR_HORIZONTAL</span><span class="p">,</span>
2797 <span class="n">LENGTH_DIR_VERTICAL</span><span class="p">,</span>
2798 <span class="n">LENGTH_DIR_BOTH</span>
2799 <span class="p">}</span> <span class="n">LengthDir</span><span class="p">;</span>
2800
2801 <span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
2802 <span class="kt">double</span> <span class="n">length</span><span class="p">;</span>
2803 <span class="n">LengthUnit</span> <span class="n">unit</span><span class="p">;</span>
2804 <span class="n">LengthDir</span> <span class="n">dir</span><span class="p">;</span>
2805 <span class="p">}</span> <span class="n">RsvgLength</span><span class="p">;</span>
2806 </code></pre></div>
2807
2808 <p>This <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/03d7716b4dd2e4e7cb04f58b14f00d2bff42c0d4">ported to Rust</a> in a straightforward fashion:</p>
2809 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">LengthUnit</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2810 <span class="w"> </span><span class="nb">Default</span><span class="p">,</span><span class="w"></span>
2811 <span class="w"> </span><span class="n">Percent</span><span class="p">,</span><span class="w"></span>
2812 <span class="w"> </span><span class="n">FontEm</span><span class="p">,</span><span class="w"></span>
2813 <span class="w"> </span><span class="n">FontEx</span><span class="p">,</span><span class="w"></span>
2814 <span class="w"> </span><span class="n">Inch</span><span class="p">,</span><span class="w"></span>
2815 <span class="w"> </span><span class="n">RelativeLarger</span><span class="p">,</span><span class="w"></span>
2816 <span class="w"> </span><span class="n">RelativeSmaller</span><span class="w"></span>
2817 <span class="p">}</span><span class="w"></span>
2818
2819 <span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">LengthDir</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2820 <span class="w"> </span><span class="n">Horizontal</span><span class="p">,</span><span class="w"></span>
2821 <span class="w"> </span><span class="n">Vertical</span><span class="p">,</span><span class="w"></span>
2822 <span class="w"> </span><span class="n">Both</span><span class="w"></span>
2823 <span class="p">}</span><span class="w"></span>
2824
2825 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">RsvgLength</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2826 <span class="w"> </span><span class="n">length</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"></span>
2827 <span class="w"> </span><span class="n">unit</span>: <span class="nc">LengthUnit</span><span class="p">,</span><span class="w"></span>
2828 <span class="w"> </span><span class="n">dir</span>: <span class="nc">LengthDir</span><span class="w"></span>
2829 <span class="p">}</span><span class="w"></span>
2830 </code></pre></div>
2831
2832 <p>It <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/822459f29b74e4e154d721c80f6e16fe6f05e0f2">got a similar constructor</a> that took the
2833 direction and produced an <code>RsvgLength</code>:</p>
2834 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">RsvgLength</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2835 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="w"> </span><span class="p">(</span><span class="n">string</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">dir</span>: <span class="nc">LengthDir</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RsvgLength</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
2836 <span class="p">}</span><span class="w"></span>
2837 </code></pre></div>
2838
2839 <p>(This was <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/e299ef0e285f7d1267528a38d8c41596b89da526">before using <code>Result</code></a>; remember that the original
2840 C code did very little error checking!)</p>
2841 <h2>The initial Parse trait</h2>
2842 <p>It was at that point that it seemed convenient to <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/5510e4c3f9c9c84bac4a05f6d93a09c9b4862544">introduce a <code>Parse</code>
2843 trait</a>, which all CSS value types would implement to
2844 parse themselves from a string.</p>
2845 <p>However, parsing an <code>RsvgLength</code> also needed an extra piece of data,
2846 the <code>LengthDir</code>. My initial version of the <code>Parse</code> trait had an
2847 associated type called <code>Data</code>, through which one could pass an extra piece
2848 of data during parsing/initialization:</p>
2849 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Parse</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"></span>
2850 <span class="w"> </span><span class="k">type</span> <span class="nc">Data</span><span class="p">;</span><span class="w"></span>
2851 <span class="w"> </span><span class="k">type</span> <span class="nb">Err</span><span class="p">;</span><span class="w"></span>
2852
2853 <span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="w"> </span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="nc">Self</span>::<span class="n">Data</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="nb">Err</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
2854 <span class="p">}</span><span class="w"></span>
2855 </code></pre></div>
2856
2857 <p>This was explicitly to be able to <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/0cd2fb2145c25898f296fe2c268f7527f58834c8">pass a <code>LengthDir</code> to the parser</a> for
2858 <code>RsvgLength</code>:</p>
2859 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">Parse</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">RsvgLength</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2860 <span class="w"> </span><span class="k">type</span> <span class="nc">Data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">LengthDir</span><span class="p">;</span><span class="w"></span>
2861 <span class="w"> </span><span class="k">type</span> <span class="nb">Err</span> <span class="o">=</span><span class="w"> </span><span class="n">AttributeError</span><span class="p">;</span><span class="w"></span>
2862
2863 <span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="w"> </span><span class="p">(</span><span class="n">string</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">dir</span>: <span class="nc">LengthDir</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span> <span class="o">&lt;</span><span class="n">RsvgLength</span><span class="p">,</span><span class="w"> </span><span class="n">AttributeError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
2864 <span class="p">}</span><span class="w"></span>
2865 </code></pre></div>
2866
2867 <p>This was okay for lengths, but <em>very noisy</em> for everything else that
2868 didn't require an extra bit of data. In the rest of the code, the
2869 helper type was <code>Data = ()</code> and there was a pair of extra parentheses <code>()</code>
2870 in every place that <code>parse()</code> was called.</p>
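<p>To make that noise concrete, here is a minimal sketch of the same trait shape.
<code>SimpleLength</code> and <code>Opacity</code> are hypothetical stand-ins rather than
librsvg's real types, and errors are plain strings for brevity.</p>

```rust
// Hypothetical stand-ins; none of these are librsvg's real definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LengthDir {
    Horizontal,
    Vertical,
    Both,
}

pub trait Parse: Sized {
    type Data;
    type Err;

    fn parse(s: &str, data: Self::Data) -> Result<Self, Self::Err>;
}

// A value that really needs the extra piece of data.
#[derive(Debug, PartialEq)]
pub struct SimpleLength {
    pub value: f64,
    pub dir: LengthDir,
}

impl Parse for SimpleLength {
    type Data = LengthDir;
    type Err = String;

    fn parse(s: &str, dir: LengthDir) -> Result<Self, String> {
        let value = s.trim().parse::<f64>().map_err(|e| e.to_string())?;
        Ok(SimpleLength { value, dir })
    }
}

// A value that does not need it, but must still declare and accept `()`.
#[derive(Debug, PartialEq)]
pub struct Opacity(pub f64);

impl Parse for Opacity {
    type Data = ();
    type Err = String;

    fn parse(s: &str, _: ()) -> Result<Self, String> {
        s.trim().parse::<f64>().map(Opacity).map_err(|e| e.to_string())
    }
}
```

<p>A caller then has to write <code>Opacity::parse("0.5", ())</code>, with the trailing
<code>()</code> carrying no information at all.</p>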
2871 <h2>Removing the helper Data type</h2>
2872 <h3>Introducing one type per direction</h3>
2873 <p>Over a year later, that <code>()</code> bit of data everywhere was driving me
2874 nuts. I started refactoring the <code>Length</code> module to remove it.</p>
2875 <p>First, I <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/cab989b9f749ca71551ad1fb68411bb2e96d96b8">introduced three newtypes</a> to wrap <code>Length</code>, and indicate
2876 their direction at the same time:</p>
2877 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">LengthHorizontal</span><span class="p">(</span><span class="n">Length</span><span class="p">);</span><span class="w"></span>
2878 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">LengthVertical</span><span class="p">(</span><span class="n">Length</span><span class="p">);</span><span class="w"></span>
2879 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">LengthBoth</span><span class="p">(</span><span class="n">Length</span><span class="p">);</span><span class="w"></span>
2880 </code></pre></div>
2881
2882 <p>This was <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/cab989b9f749ca71551ad1fb68411bb2e96d96b8">done with a macro</a> because now each wrapper type
2883 needed to know the relevant <code>LengthDir</code>.</p>
2884 <p>Now, for example, the declaration for the <code>&lt;circle&gt;</code> element looked
2885 like this:</p>
2886 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">NodeCircle</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2887 <span class="w"> </span><span class="n">cx</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="n">LengthHorizontal</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2888 <span class="w"> </span><span class="n">cy</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="n">LengthVertical</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2889 <span class="w"> </span><span class="n">r</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="n">LengthBoth</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
2890 <span class="p">}</span><span class="w"></span>
2891 </code></pre></div>
2892
2893 <p>(Ignore the <code>Cell</code> everywhere; we got rid of that later.)</p>
2894 <h3>Removing the <code>dir</code> field</h3>
2895 <p>Since the information about the length's direction was now embodied in
2896 the <code>LengthHorizontal/LengthVertical/LengthBoth</code> types, it became
2897 possible to <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/78f39807ce75e176840ec049539d9897f53316d5">remove the <code>dir</code> field</a> from the inner
2898 <code>Length</code> struct.</p>
2899 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">RsvgLength</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2900 <span class="w"> </span><span class="n">length</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"></span>
2901 <span class="w"> </span><span class="n">unit</span>: <span class="nc">LengthUnit</span><span class="p">,</span><span class="w"></span>
2902 <span class="o">-</span><span class="w"> </span><span class="n">dir</span>: <span class="nc">LengthDir</span><span class="w"></span>
2903 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2904
2905 <span class="o">+</span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">LengthHorizontal</span><span class="p">(</span><span class="n">Length</span><span class="p">);</span><span class="w"></span>
2906 <span class="o">+</span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">LengthVertical</span><span class="p">(</span><span class="n">Length</span><span class="p">);</span><span class="w"></span>
2907 <span class="o">+</span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">LengthBoth</span><span class="p">(</span><span class="n">Length</span><span class="p">);</span><span class="w"></span>
2908 <span class="o">+</span><span class="w"></span>
2909 <span class="o">+</span><span class="n">define_length_type</span><span class="o">!</span><span class="p">(</span><span class="n">LengthHorizontal</span><span class="p">,</span><span class="w"> </span><span class="n">LengthDir</span>::<span class="n">Horizontal</span><span class="p">);</span><span class="w"></span>
2910 <span class="o">+</span><span class="n">define_length_type</span><span class="o">!</span><span class="p">(</span><span class="n">LengthVertical</span><span class="p">,</span><span class="w"> </span><span class="n">LengthDir</span>::<span class="n">Vertical</span><span class="p">);</span><span class="w"></span>
2911 <span class="o">+</span><span class="n">define_length_type</span><span class="o">!</span><span class="p">(</span><span class="n">LengthBoth</span><span class="p">,</span><span class="w"> </span><span class="n">LengthDir</span>::<span class="n">Both</span><span class="p">);</span><span class="w"></span>
2912 </code></pre></div>
2913
2914 <p>Note the use of a <code>define_length_type!</code> macro to generate code for
2915 those three newtypes.</p>
2916 <h3>Removing the <code>Data</code> associated type</h3>
2917 <p>And finally, this made it possible to <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/0532d20d92971e7346fade2c48483b06dd49762c">remove the <code>Data</code> associated
2918 type</a> from the <code>Parse</code> trait.</p>
2919 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Parse</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"></span>
2920 <span class="o">-</span><span class="w"> </span><span class="k">type</span> <span class="nc">Data</span><span class="p">;</span><span class="w"></span>
2921 <span class="w"> </span><span class="k">type</span> <span class="nb">Err</span><span class="p">;</span><span class="w"></span>
2922
2923 <span class="o">-</span><span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="n">parser</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Parser</span><span class="o">&lt;&#39;</span><span class="nb">_</span><span class="p">,</span><span class="w"> </span><span class="o">&#39;</span><span class="nb">_</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="nc">Self</span>::<span class="n">Data</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="nb">Err</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
2924 <span class="o">+</span><span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="n">parser</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Parser</span><span class="o">&lt;&#39;</span><span class="nb">_</span><span class="p">,</span><span class="w"> </span><span class="o">&#39;</span><span class="nb">_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="nb">Err</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
2925 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2926 </code></pre></div>
2927
2928 <p>The resulting mega-commit removed a bunch of stray parentheses <code>()</code>
2929 from all calls to <code>parse()</code>, and the code ended up a lot easier to
2930 read.</p>
2931 <h2>Removing the newtypes</h2>
2932 <p>This was fine for a while. Recently, however, I figured out that it
2933 would be possible to embody the information for a length's direction
2934 in a different way.</p>
2935 <p>But to get there, I first needed a temporary refactor.</p>
2936 <h3>Replacing the macro with a trait with a default implementation</h3>
2937 <p>Deep in the guts of <code>length.rs</code>, the key function that does something
2938 different based on <code>LengthDir</code> is its <code>scaling_factor</code> method:</p>
2939 <div class="highlight"><pre><span></span><code><span class="k">enum</span> <span class="nc">LengthDir</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2940 <span class="w"> </span><span class="n">Horizontal</span><span class="p">,</span><span class="w"></span>
2941 <span class="w"> </span><span class="n">Vertical</span><span class="p">,</span><span class="w"></span>
2942 <span class="w"> </span><span class="n">Both</span><span class="p">,</span><span class="w"></span>
2943 <span class="p">}</span><span class="w"></span>
2944
2945 <span class="k">impl</span><span class="w"> </span><span class="n">LengthDir</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2946 <span class="w"> </span><span class="k">fn</span> <span class="nf">scaling_factor</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">x</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="kt">f64</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f64</span> <span class="p">{</span><span class="w"></span>
2947 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2948 <span class="w"> </span><span class="n">LengthDir</span>::<span class="n">Horizontal</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">x</span><span class="p">,</span><span class="w"></span>
2949 <span class="w"> </span><span class="n">LengthDir</span>::<span class="n">Vertical</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"></span>
2950 <span class="w"> </span><span class="n">LengthDir</span>::<span class="n">Both</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">viewport_percentage</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">),</span><span class="w"></span>
2951 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2952 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2953 <span class="p">}</span><span class="w"></span>
2954 </code></pre></div>
2955
2956 <p>That method gets passed, for example, the <code>width/height</code> of the
2957 current viewport for the <code>x/y</code> arguments. The method decides whether
2958 to use the width, height, or a combination of both.</p>
2959 <p>And of course, the interesting part of the <code>define_length_type!</code> macro
2960 was to generate code for calling <code>LengthDir::Horizontal.scaling_factor()</code>/etc. as
2961 appropriate depending on the <code>LengthDir</code> in question.</p>
2962 <p>First I made a trait called <code>Orientation</code> with a <code>scaling_factor</code>
2963 method, and three zero-sized types that implement that trait. Note
2964 how each of these three implementations corresponds to one of the
2965 <code>match</code> arms above:</p>
2966 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Orientation</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2967 <span class="w"> </span><span class="k">fn</span> <span class="nf">scaling_factor</span><span class="p">(</span><span class="n">x</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="kt">f64</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f64</span><span class="p">;</span><span class="w"></span>
2968 <span class="p">}</span><span class="w"></span>
2969
2970 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Horizontal</span><span class="p">;</span><span class="w"></span>
2971 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Vertical</span><span class="p">;</span><span class="w"></span>
2972 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Both</span><span class="p">;</span><span class="w"></span>
2973
2974 <span class="k">impl</span><span class="w"> </span><span class="n">Orientation</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Horizontal</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2975 <span class="w"> </span><span class="k">fn</span> <span class="nf">scaling_factor</span><span class="p">(</span><span class="n">x</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">_y</span>: <span class="kt">f64</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f64</span> <span class="p">{</span><span class="w"></span>
2976 <span class="w"> </span><span class="n">x</span><span class="w"></span>
2977 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2978 <span class="p">}</span><span class="w"></span>
2979
2980 <span class="k">impl</span><span class="w"> </span><span class="n">Orientation</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Vertical</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2981 <span class="w"> </span><span class="k">fn</span> <span class="nf">scaling_factor</span><span class="p">(</span><span class="n">_x</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="kt">f64</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f64</span> <span class="p">{</span><span class="w"></span>
2982 <span class="w"> </span><span class="n">y</span><span class="w"></span>
2983 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2984 <span class="p">}</span><span class="w"></span>
2985
2986 <span class="k">impl</span><span class="w"> </span><span class="n">Orientation</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Both</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
2987 <span class="w"> </span><span class="k">fn</span> <span class="nf">scaling_factor</span><span class="p">(</span><span class="n">x</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="kt">f64</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f64</span> <span class="p">{</span><span class="w"></span>
2988 <span class="w"> </span><span class="n">viewport_percentage</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">)</span><span class="w"></span>
2989 <span class="w"> </span><span class="p">}</span><span class="w"></span>
2990 <span class="p">}</span><span class="w"></span>
2991 </code></pre></div>
2992
2993 <p>Now most of the contents of the <code>define_length_type!</code> macro can go in
2994 the <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/21a40f4ef801eecfe58c7602835c37f436f0e926">default implementation of a new trait
2995 <code>LengthTrait</code></a>. Crucially, this trait has an
2996 <code>Orientation</code> associated type, <strong>which it uses to call into the
2997 Orientation trait</strong>:</p>
2998 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">LengthTrait</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"></span>
2999 <span class="w"> </span><span class="k">type</span> <span class="nc">Orientation</span>: <span class="nc">Orientation</span><span class="p">;</span><span class="w"></span>
3000
3001 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
3002
3003 <span class="w"> </span><span class="k">fn</span> <span class="nf">normalize</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="kp">&amp;</span><span class="nc">ComputedValues</span><span class="p">,</span><span class="w"> </span><span class="n">params</span>: <span class="kp">&amp;</span><span class="nc">ViewParams</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f64</span> <span class="p">{</span><span class="w"></span>
3004 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">unit</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3005 <span class="w"> </span><span class="n">LengthUnit</span>::<span class="n">Px</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">length</span><span class="p">(),</span><span class="w"></span>
3006
3007 <span class="w"> </span><span class="n">LengthUnit</span>::<span class="n">Percent</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3008 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">length</span><span class="p">()</span><span class="w"> </span><span class="o">*</span><span class="w"></span>
3009 <span class="w"> </span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Orientation</span><span class="o">&gt;</span>::<span class="n">scaling_factor</span><span class="p">(</span><span class="n">params</span><span class="p">.</span><span class="n">view_box_width</span><span class="p">,</span><span class="w"> </span><span class="n">params</span><span class="p">.</span><span class="n">view_box_height</span><span class="p">)</span><span class="w"></span>
3010 <span class="w"> </span><span class="p">}</span><span class="w"></span>
3011
3012 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
3013 <span class="w"> </span><span class="p">}</span><span class="w"></span>
3014 <span class="p">}</span><span class="w"></span>
3015 </code></pre></div>
3016
3017 <p>Note that the incantation is
3018 <code>&lt;Self::Orientation&gt;::scaling_factor(...)</code> to call that method on the
3019 associated type.</p>
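<p>As a reduced sketch of that pattern: the default method below reaches the
orientation's function through the associated type. The <code>percent()</code> accessor and
the <code>normalize_percent()</code> name are hypothetical simplifications that stand in for
the elided parts of the real trait.</p>

```rust
pub trait Orientation {
    fn scaling_factor(x: f64, y: f64) -> f64;
}

pub struct Horizontal;
pub struct Vertical;

impl Orientation for Horizontal {
    fn scaling_factor(x: f64, _y: f64) -> f64 {
        x
    }
}

impl Orientation for Vertical {
    fn scaling_factor(_x: f64, y: f64) -> f64 {
        y
    }
}

pub trait LengthTrait {
    type Orientation: Orientation;

    // Hypothetical accessor: the length as a fraction (0.5 means 50%).
    fn percent(&self) -> f64;

    // Default method shared by every implementor; only the associated
    // type differs between them.
    fn normalize_percent(&self, width: f64, height: f64) -> f64 {
        self.percent() * <Self::Orientation>::scaling_factor(width, height)
    }
}

pub struct LengthHorizontal(pub f64);
pub struct LengthVertical(pub f64);

impl LengthTrait for LengthHorizontal {
    type Orientation = Horizontal;

    fn percent(&self) -> f64 {
        self.0
    }
}

impl LengthTrait for LengthVertical {
    type Orientation = Vertical;

    fn percent(&self) -> f64 {
        self.0
    }
}
```

<p>Each implementor only names its orientation and exposes its data; the arithmetic
lives once, in the trait's default method.</p>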
3020 <p>Now the <code>define_length_type!</code> macro has shrunk a lot, with the
3021 interesting part being just this:</p>
3022 <div class="highlight"><pre><span></span><code><span class="fm">macro_rules!</span><span class="w"> </span><span class="n">define_length_type</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3023 <span class="w"> </span><span class="p">{</span><span class="cp">$name</span>:<span class="nc">ident</span><span class="p">,</span><span class="w"> </span><span class="cp">$orient</span>:<span class="nc">ty</span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3024 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="cp">$name</span><span class="p">(</span><span class="n">Length</span><span class="p">);</span><span class="w"></span>
3025
3026 <span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">LengthTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="cp">$name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3027 <span class="w"> </span><span class="k">type</span> <span class="nc">Orientation</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="cp">$orient</span><span class="p">;</span><span class="w"></span>
3028 <span class="w"> </span><span class="p">}</span><span class="w"></span>
3029 <span class="w"> </span><span class="p">}</span><span class="w"></span>
3030 <span class="p">}</span><span class="w"></span>
3031
3032 <span class="n">define_length_type</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">LengthHorizontal</span><span class="p">,</span><span class="w"> </span><span class="n">Horizontal</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
3033 <span class="n">define_length_type</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">LengthVertical</span><span class="p">,</span><span class="w"> </span><span class="n">Vertical</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
3034 <span class="n">define_length_type</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">LengthBoth</span><span class="p">,</span><span class="w"> </span><span class="n">Both</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
3035 </code></pre></div>
3036
3037 <p>We moved from having three newtypes of length-with-LengthDir to three
3038 newtypes with dir-as-associated-type.</p>
3039 <h3>Removing the newtypes and the macro</h3>
3040 <p>After that temporary refactoring, we had the <code>Orientation</code> trait and
3041 the three zero-sized types <code>Horizontal</code>, <code>Vertical</code>, <code>Both</code>.</p>
3042 <p>I figured out that one can use <a href="https://doc.rust-lang.org/std/marker/struct.PhantomData.html"><code>PhantomData</code></a> as a way to carry
3043 around the type that <code>Length</code> needs to normalize itself, instead of
3044 using an associated type in an extra <code>LengthTrait</code>. Behold!</p>
3045 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Length</span><span class="o">&lt;</span><span class="n">O</span>: <span class="nc">Orientation</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3046 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">length</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"></span>
3047 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">unit</span>: <span class="nc">LengthUnit</span><span class="p">,</span><span class="w"></span>
3048 <span class="w"> </span><span class="n">orientation</span>: <span class="nc">PhantomData</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3049 <span class="p">}</span><span class="w"></span>
3050
3051 <span class="k">impl</span><span class="o">&lt;</span><span class="n">O</span>: <span class="nc">Orientation</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Length</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3052 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">normalize</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="kp">&amp;</span><span class="nc">ComputedValues</span><span class="p">,</span><span class="w"> </span><span class="n">params</span>: <span class="kp">&amp;</span><span class="nc">ViewParams</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f64</span> <span class="p">{</span><span class="w"></span>
3053 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">unit</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3054 <span class="w"> </span><span class="n">LengthUnit</span>::<span class="n">Px</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">length</span><span class="p">,</span><span class="w"></span>
3055
3056 <span class="w"> </span><span class="n">LengthUnit</span>::<span class="n">Percent</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3057 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">length</span><span class="w"> </span>
3058 <span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="o">&lt;</span><span class="n">O</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">Orientation</span><span class="o">&gt;</span>::<span class="n">scaling_factor</span><span class="p">(</span><span class="n">params</span><span class="p">.</span><span class="n">view_box_width</span><span class="p">,</span><span class="w"> </span><span class="n">params</span><span class="p">.</span><span class="n">view_box_height</span><span class="p">)</span><span class="w"></span>
3059 <span class="w"> </span><span class="p">}</span><span class="w"></span>
3060
3061 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
3062 <span class="w"> </span><span class="p">}</span><span class="w"></span>
3063 <span class="w"> </span><span class="p">}</span><span class="w"></span>
3064 <span class="p">}</span><span class="w"></span>
3065 </code></pre></div>
3066
3067 <p>Now the incantation is <code>&lt;O as Orientation&gt;::scaling_factor()</code> to call
3068 the method on the generic type; it is no longer an associated type in
3069 a trait.</p>
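<p>Here is a self-contained sketch of that generic version that can be compiled on its
own. It makes some assumptions: <code>normalize()</code> takes the viewport width and height
directly instead of <code>ComputedValues</code> and <code>ViewParams</code>, percentages are
stored as fractions, only two units are modeled, the <code>new()</code> constructor is
hypothetical, and the formula for <code>Both</code> is the SVG rule for percentages that are
neither horizontal nor vertical rather than a copy of librsvg's
<code>viewport_percentage</code>.</p>

```rust
use std::marker::PhantomData;

pub trait Orientation {
    fn scaling_factor(x: f64, y: f64) -> f64;
}

pub struct Horizontal;
pub struct Vertical;
pub struct Both;

impl Orientation for Horizontal {
    fn scaling_factor(x: f64, _y: f64) -> f64 {
        x
    }
}

impl Orientation for Vertical {
    fn scaling_factor(_x: f64, y: f64) -> f64 {
        y
    }
}

impl Orientation for Both {
    fn scaling_factor(x: f64, y: f64) -> f64 {
        // Assumed formula: sqrt(width^2 + height^2) / sqrt(2).
        (x * x + y * y).sqrt() / std::f64::consts::SQRT_2
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LengthUnit {
    Px,
    Percent,
}

pub struct Length<O: Orientation> {
    pub length: f64,
    pub unit: LengthUnit,
    // Carries the orientation in the type only; holds no data at run time.
    orientation: PhantomData<O>,
}

impl<O: Orientation> Length<O> {
    // Hypothetical constructor, so callers never spell out PhantomData.
    pub fn new(length: f64, unit: LengthUnit) -> Self {
        Length { length, unit, orientation: PhantomData }
    }

    pub fn normalize(&self, view_box_width: f64, view_box_height: f64) -> f64 {
        match self.unit {
            LengthUnit::Px => self.length,
            LengthUnit::Percent => {
                self.length * <O as Orientation>::scaling_factor(view_box_width, view_box_height)
            }
        }
    }
}
```

<p>With a type annotation such as <code>let cx: Length&lt;Horizontal&gt; = Length::new(0.5, LengthUnit::Percent);</code>,
the compiler picks the right <code>scaling_factor</code> at compile time, with no
<code>match</code> on a direction at run time.</p>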
3070 <p>With that, users of lengths look like this; here, our <code>&lt;circle&gt;</code>
3071 element from before:</p>
3072 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Circle</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3073 <span class="w"> </span><span class="n">cx</span>: <span class="nc">Length</span><span class="o">&lt;</span><span class="n">Horizontal</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3074 <span class="w"> </span><span class="n">cy</span>: <span class="nc">Length</span><span class="o">&lt;</span><span class="n">Vertical</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3075 <span class="w"> </span><span class="n">r</span>: <span class="nc">Length</span><span class="o">&lt;</span><span class="n">Both</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3076 <span class="p">}</span><span class="w"></span>
3077 </code></pre></div>
3078
3079 <p>I'm very happy with the readability of all the code now. I used to
3080 think of <code>PhantomData</code> as a way to deal with <a href="https://doc.rust-lang.org/nomicon/phantom-data.html">wrapping pointers from
3081 C</a>, but it turns out that it is also useful to keep a generic
3082 type around should one need it.</p>
3083 <p>The final <code>Length</code> struct is this:</p>
3084 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Length</span><span class="o">&lt;</span><span class="n">O</span>: <span class="nc">Orientation</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3085 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">length</span>: <span class="kt">f64</span><span class="p">,</span><span class="w"></span>
3086 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">unit</span>: <span class="nc">LengthUnit</span><span class="p">,</span><span class="w"></span>
3087 <span class="w"> </span><span class="n">orientation</span>: <span class="nc">PhantomData</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3088 <span class="p">}</span><span class="w"></span>
3089 </code></pre></div>
3090
3091 <p>And it only takes up as much space as its <code>length</code> and <code>unit</code> fields;
3092 <code>PhantomData</code> is zero-sized after all.</p>
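<p>One way to check that claim is to compare sizes with <code>std::mem::size_of</code>. The sketch below uses simplified stand-ins: an empty <code>Orientation</code> trait, a three-variant <code>LengthUnit</code>, and a hypothetical <code>PlainLength</code> without the marker field. These are assumptions for illustration, not librsvg's actual definitions.</p>

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Simplified stand-ins; the real trait and enum in librsvg have more to them.
pub trait Orientation {}

pub struct Horizontal;
impl Orientation for Horizontal {}

#[allow(dead_code)]
#[derive(Clone, Copy)]
pub enum LengthUnit {
    Percent,
    Px,
    Em,
}

#[allow(dead_code)]
pub struct Length<O: Orientation> {
    pub length: f64,
    pub unit: LengthUnit,
    orientation: PhantomData<O>,
}

// Hypothetical comparison type: the same fields, minus the marker.
#[allow(dead_code)]
pub struct PlainLength {
    pub length: f64,
    pub unit: LengthUnit,
}

/// Returns (size of the marker, size of Length<Horizontal>, size of PlainLength).
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<PhantomData<Horizontal>>(),
        size_of::<Length<Horizontal>>(),
        size_of::<PlainLength>(),
    )
}
```

<p>The marker contributes zero bytes, so in practice the two structs come out the same size. Rust does not formally guarantee the layout of default-representation structs, so treat this as an observation rather than a contract.</p>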
3093 <p>(Later, we renamed <code>Orientation</code> to <code>Normalize</code>, but the code's
3094 structure remained the same.)</p>
3095 <h2>Summary</h2>
3096 <p>Over a couple of years, librsvg's type that represents CSS lengths
3097 went from a C representation along the lines of "all data in the world
3098 is an int", to a Rust representation that uses some interesting type
3099 trickery:</p>
3100 <ul>
3101 <li>
3102 <p>C struct with <code>char</code> for units.</p>
3103 </li>
3104 <li>
3105 <p>C struct with a <code>LengthUnits</code> enum.</p>
3106 </li>
3107 <li>
3108 <p>C struct without an embodied direction; each place that needs to
3109 normalize needs to get the orientation right.</p>
3110 </li>
3111 <li>
3112 <p>C struct with a built-in direction as an extra field, done at
3113 initialization time.</p>
3114 </li>
3115 <li>
3116 <p>Same struct but in Rust.</p>
3117 </li>
3118 <li>
3119 <p>An ugly but workable <code>Parse</code> trait so that the direction can be set
3120 at parse/initialization time.</p>
3121 </li>
3122 <li>
3123 <p>Three newtypes <code>LengthHorizontal</code>, <code>LengthVertical</code>,
3124 <code>LengthBoth</code> with a common core. A cleaned-up <code>Parse</code> trait. A
3125 macro to generate those newtypes.</p>
3126 </li>
3127 <li>
3128 <p>Replace the <code>LengthDir</code> enum with an <code>Orientation</code>
3129 trait, and three zero-sized types <code>Horizontal/Vertical/Both</code> that
3130 implement the trait.</p>
3131 </li>
3132 <li>
3133 <p>Replace most of the macro with a helper trait <code>LengthTrait</code> that has
3134 an <code>Orientation</code> associated type.</p>
3135 </li>
3136 <li>
3137 <p>Replace the helper trait with a single <code>Length&lt;O: Orientation&gt;</code>
3138 type, which puts the orientation as a generic parameter. The macro
3139 disappears and there is a single implementation for everything.</p>
3140 </li>
3141 </ul>
3142 <p>Refactoring never ends!</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category></entry><entry><title>CSS in librsvg is now in Rust, courtesy of Mozilla Servo</title><link href="https://people.gnome.org/~federico/blog/css-in-librsvg-is-now-in-rust.html" rel="alternate"></link><published>2019-11-11T19:36:04-06:00</published><updated>2019-11-11T19:36:04-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-11-11:/~federico/blog/css-in-librsvg-is-now-in-rust.html</id><summary type="html"><p>Summary: after an <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/237">epic amount of
3143 refactoring</a>,
3144 librsvg now does all CSS parsing and matching in Rust, <strong>without using
3145 libcroco</strong>. In addition, the CSS engine comes from Mozilla Servo, so
3146 it should be able to handle much more complex CSS than librsvg ever
3147 could before.</p>
3148 <p>This is the story of …</p></summary><content type="html"><p>Summary: after an <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/237">epic amount of
3149 refactoring</a>,
3150 librsvg now does all CSS parsing and matching in Rust, <strong>without using
3151 libcroco</strong>. In addition, the CSS engine comes from Mozilla Servo, so
3152 it should be able to handle much more complex CSS than librsvg ever
3153 could before.</p>
3154 <p>This is the story of CSS support in librsvg.</p>
3155 <h2>Introduction</h2>
3156 <p>The <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/c5cbb842b25980321a09c427531279d56097b646">first commit to introduce CSS
3157 parsing</a>
3158 in librsvg dates from 2002. It was as minimal as possible, written to
3159 support a small subset of what was then
3160 <a href="https://www.w3.org/TR/1998/REC-CSS2-19980512/">CSS2</a>.</p>
3161 <p>Librsvg handled CSS stylesheets more by "piecing them apart" than by
3162 "parsing them". You know, when <code>g_strsplit()</code> is your best friend.
3163 The basic parsing algorithm was to turn a stylesheet like this:</p>
3164 <div class="highlight"><pre><span></span><code><span class="nt">rect</span> <span class="p">{</span> <span class="n">fill</span><span class="p">:</span> <span class="kc">blue</span><span class="p">;</span> <span class="p">}</span>
3165
3166 <span class="p">.</span><span class="nc">classname</span> <span class="p">{</span>
3167 <span class="n">fill</span><span class="p">:</span> <span class="kc">green</span><span class="p">;</span>
3168 <span class="n">stroke-width</span><span class="p">:</span> <span class="mi">4</span><span class="p">;</span>
3169 <span class="p">}</span>
3170 </code></pre></div>
3171
3172 <p>Into a hash table whose keys are strings like <code>rect</code> and <code>.classname</code>,
3173 and whose values are everything inside curly braces.</p>
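<p>To make the "piecing apart" idea concrete, here is a rough sketch in Rust. It is not the original C code; the function name <code>split_stylesheet</code> is made up for illustration, and it deliberately ignores comments, at-rules, nested braces and selector lists.</p>

```rust
use std::collections::HashMap;

/// Naively splits a stylesheet into selector -> "everything inside the braces".
/// Illustrative only: no comments, at-rules, nested braces or selector lists.
pub fn split_stylesheet(css: &str) -> HashMap<String, String> {
    let mut rules = HashMap::new();

    // Each rule ends at a closing brace...
    for chunk in css.split('}') {
        // ...and the opening brace separates the selector from the body.
        if let Some((selector, body)) = chunk.split_once('{') {
            let selector = selector.trim();
            if !selector.is_empty() {
                rules.insert(selector.to_string(), body.trim().to_string());
            }
        }
    }

    rules
}
```

<p>Feeding it the stylesheet above yields two keys, <code>rect</code> and <code>.classname</code>, each mapped to the unparsed text between its braces.</p>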
3174 <p>The selector matching phase was equally simple, and it only handled
3175 a few kinds of matches. To match a certain kind of CSS selector, the
3176 code would ask, "what would this selector look like in CSS syntax?",
3177 make up a string with that syntax, and compare it to the key strings
3178 it had stored in the hash table from
3179 above.</p>
3180 <p>So, to match an <strong>element name selector</strong>, it would <code>sprintf("%s",
3181 element-&gt;name)</code>, obtain something like <code>rect</code> and see if the hash
3182 table had such a key.</p>
3183 <p>To match a <strong>class selector</strong>, it would <code>sprintf(".%s",
3184 element-&gt;class)</code>, obtain something like <code>.classname</code>, and look it up
3185 in the hash table.</p>
3186 <p>This scheme supported only a few combinations. It handled <code>tag</code>,
3187 <code>.class</code>, <code>tag.class</code>, and a few combinations with <code>#id</code> in them.
3188 This was enough to support very simple stylesheets.</p>
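<p>The matching scheme described above can be sketched as follows. The <code>Element</code> struct and both function names are hypothetical, and the exact set of <code>#id</code> combinations is an assumption, since the original code's list is not reproduced here.</p>

```rust
use std::collections::HashMap;

/// Hypothetical, stripped-down element: just a name plus optional class and id.
pub struct Element<'a> {
    pub name: &'a str,
    pub class: Option<&'a str>,
    pub id: Option<&'a str>,
}

/// Builds the selector strings this element could match, in CSS syntax.
pub fn candidate_keys(element: &Element<'_>) -> Vec<String> {
    let mut keys = vec![element.name.to_string()];

    if let Some(class) = element.class {
        keys.push(format!(".{}", class));
        keys.push(format!("{}.{}", element.name, class));
    }

    if let Some(id) = element.id {
        keys.push(format!("#{}", id));
        keys.push(format!("{}#{}", element.name, id));
    }

    keys
}

/// Returns the declaration strings of every stored rule whose key matches.
pub fn matching_rules<'r>(
    rules: &'r HashMap<String, String>,
    element: &Element<'_>,
) -> Vec<&'r str> {
    candidate_keys(element)
        .iter()
        .filter_map(|key| rules.get(key).map(String::as_str))
        .collect()
}
```

<p>Note what is missing: there is no notion of descendant selectors, attribute selectors or specificity, because a made-up string can only ever equal a stored key or not.</p>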
3189 <p>The value corresponding to each key in the hash table was the stuff
3190 between curly braces in the stylesheet, so the second rule from the
3191 example above would contain <code>fill: green; stroke-width: 4;</code>. Once
3192 librsvg decided that an SVG element matched that CSS rule, it would
3193 re-parse the string with the CSS properties and apply them to the
3194 element's style.</p>
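<p>The re-parsing step amounts to splitting the stored string on semicolons and colons. A minimal sketch, assuming a hypothetical <code>parse_declarations</code> helper and a simple treatment of <code>!important</code>; values that contain semicolons inside quoted strings would break it:</p>

```rust
/// Splits "fill: green; stroke-width: 4;" into (name, value, important) triples.
/// Illustrative only: quoted strings containing ';' are not handled.
pub fn parse_declarations(list: &str) -> Vec<(String, String, bool)> {
    list.split(';')
        .filter_map(|decl| {
            // Pieces without a colon (such as the empty tail) are skipped.
            let (name, value) = decl.split_once(':')?;
            let (name, value) = (name.trim(), value.trim());

            // Peel off a trailing "!important" marker, if there is one.
            let (value, important) = match value.strip_suffix("!important") {
                Some(rest) => (rest.trim_end(), true),
                None => (value, false),
            };

            if name.is_empty() || value.is_empty() {
                None
            } else {
                Some((name.to_string(), value.to_string(), important))
            }
        })
        .collect()
}
```

<p>The values stay as strings at this stage; turning <code>4</code> into a number or <code>green</code> into a color is a separate job for each property.</p>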
3195 <p>I'm amazed that so little code was enough to deal with a good number
3196 of SVG files with stylesheets. I suspect that this was due to a few
3197 things:</p>
3198 <ul>
3199 <li>
3200 <p>While people were using complex CSS in HTML all the time, it was
3201 less common for SVG...</p>
3202 </li>
3203 <li>
3204 <p>... because CSS2 was somewhat new, and the SVG spec was still being
3205 written...</p>
3206 </li>
3207 <li>
3208 <p>... and SVGs created with illustration programs don't really use
3209 stylesheets; they include the full style information inside each
3210 element instead of symbolically referencing it from a stylesheet.</p>
3211 </li>
3212 </ul>
3213 <p>From the kinds of bugs that librsvg has gotten around "CSS support is
3214 too limited", it feels like SVGs which use CSS features are either
3215 hand-written, or machine-generated from custom programs like data
3216 plotting software. Illustration programs tend to list all style
3217 properties explicitly in each SVG element, and don't use CSS.</p>
3218 <h2>Libcroco appears</h2>
3219 <p>The first commit to <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/99de937717129fdf8539904618b918f6119f43a1">introduce
3220 libcroco</a>,
3221 from March 2003, used it to do CSS parsing.</p>
3222 <p>At the same time, libcroco was introducing code to do CSS matching.
3223 However, this code never got used in librsvg; it still kept its simple
3224 string-based matcher. Maybe libcroco's API was not ready?</p>
3225 <p>Libcroco fell out of maintainership around the first half of 2005, and
3226 volunteers have kept fixing it since then.</p>
3227 <h2>Problems with librsvg's string matcher for CSS</h2>
3228 <p>The C implementation of CSS matching in librsvg remained basically
3229 untouched until 2018, when Paolo Borelli and I started porting the
3230 surrounding code to Rust.</p>
3231 <p>I had a lot of trouble figuring out the concepts from the code. I
3232 didn't know all the <a href="https://gnome.pages.gitlab.gnome.org/librsvg/doc/rsvg_internals/css/index.html#terminology">terminology of CSS
3233 implementations</a>,
3234 and librsvg didn't use it, either.</p>
3235 <p>I think that librsvg's code suffered from what the refactoring
3236 literature calls <a href="https://refactoring.guru/smells/primitive-obsession"><strong>primitive
3237 obsession</strong></a>.
3238 Instead of having a parsed representation of CSS selectors, librsvg
3239 just stored a stringified version of them. So, a selector like
3240 <code>rect#classname</code> really was stored as a string like that, instead of
3241 an actual decomposition into structs.</p>
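<p>For contrast, here is what a decomposed representation might look like. This is only a sketch: <code>SimpleSelector</code> and <code>decompose</code> are hypothetical names, and the code handles nothing beyond an optional element name followed by <code>.class</code> and <code>#id</code> parts.</p>

```rust
/// Hypothetical parsed form of selectors like "rect", ".big" or "rect#main".
#[derive(Debug, Default, PartialEq, Eq)]
pub struct SimpleSelector {
    pub element: Option<String>,
    pub id: Option<String>,
    pub class: Option<String>,
}

fn is_marker(c: char) -> bool {
    c == '.' || c == '#'
}

pub fn decompose(selector: &str) -> SimpleSelector {
    let mut result = SimpleSelector::default();

    // Everything before the first '.' or '#' is the element name.
    let name_end = selector.find(is_marker).unwrap_or(selector.len());
    let (name, mut rest) = selector.split_at(name_end);
    if !name.is_empty() {
        result.element = Some(name.to_string());
    }

    // Each remaining piece starts with its one-byte marker character.
    while let Some(marker) = rest.chars().next() {
        let tail = &rest[1..];
        let end = tail.find(is_marker).unwrap_or(tail.len());
        let (ident, remaining) = tail.split_at(end);
        if marker == '#' {
            result.id = Some(ident.to_string());
        } else {
            result.class = Some(ident.to_string());
        }
        rest = remaining;
    }

    result
}
```

<p>With a struct like this, matching code can compare fields instead of rebuilding strings, and the type itself documents which parts a selector may have.</p>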
3242 <p>Moreover, things were misnamed. This is the field that stored
3243 stylesheet data inside an RsvgHandle:</p>
3244 <div class="highlight"><pre><span></span><code> <span class="n">GHashTable</span> <span class="o">*</span><span class="n">css_props</span><span class="p">;</span>
3245 </code></pre></div>
3246
3247 <p>From just looking at the field declaration, this doesn't tell me
3248 anything about what kind of data is stored there. One has to grep the
3249 source code for where that field is used:</p>
3250 <div class="highlight"><pre><span></span><code><span class="k">static</span> <span class="kt">void</span>
3251 <span class="nf">rsvg_css_define_style</span> <span class="p">(</span><span class="n">RsvgHandle</span> <span class="o">*</span> <span class="n">ctx</span><span class="p">,</span>
3252 <span class="k">const</span> <span class="n">gchar</span> <span class="o">*</span> <span class="n">selector</span><span class="p">,</span>
3253 <span class="k">const</span> <span class="n">gchar</span> <span class="o">*</span> <span class="n">style_name</span><span class="p">,</span>
3254 <span class="k">const</span> <span class="n">gchar</span> <span class="o">*</span> <span class="n">style_value</span><span class="p">,</span>
3255 <span class="n">gboolean</span> <span class="n">important</span><span class="p">)</span>
3256 <span class="p">{</span>
3257 <span class="n">GHashTable</span> <span class="o">*</span><span class="n">styles</span><span class="p">;</span>
3258
3259 <span class="n">styles</span> <span class="o">=</span> <span class="n">g_hash_table_lookup</span> <span class="p">(</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">priv</span><span class="o">-&gt;</span><span class="n">css_props</span><span class="p">,</span> <span class="n">selector</span><span class="p">);</span>
3260 </code></pre></div>
3261
3262 <p>Okay, it looks up a <code>selector</code> by name in the <code>css_props</code>, and it
3263 gives back... another hash table <code>styles</code>? What's in there?</p>
3264 <div class="highlight"><pre><span></span><code> <span class="n">g_hash_table_insert</span> <span class="p">(</span><span class="n">styles</span><span class="p">,</span>
3265 <span class="n">g_strdup</span> <span class="p">(</span><span class="n">style_name</span><span class="p">),</span>
3266 <span class="n">style_value_data_new</span> <span class="p">(</span><span class="n">style_value</span><span class="p">,</span> <span class="n">important</span><span class="p">));</span>
3267 </code></pre></div>
3268
3269 <p>Another string key called <code>style_name</code>, whose value is a
3270 <code>StyleValueData</code>; what's in it?</p>
3271 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="nc">_StyleValueData</span> <span class="p">{</span>
3272 <span class="n">gchar</span> <span class="o">*</span><span class="n">value</span><span class="p">;</span>
3273 <span class="n">gboolean</span> <span class="n">important</span><span class="p">;</span>
3274 <span class="p">}</span> <span class="n">StyleValueData</span><span class="p">;</span>
3275 </code></pre></div>
3276
3277 <p>The <code>value</code> is another string. Strings all the way!</p>
3278 <p>At the time, I didn't really figure out what each level of nested hash
3279 tables was supposed to mean. I didn't understand why we handled style
3280 properties in a completely different part of the code, and yet this
3281 part had a <code>css_props</code> field that didn't seem to store properties at all.</p>
3282 <p>It took a while to realize that <code>css_props</code> was misnamed. It wasn't
3283 storing a mapping of selector names to properties; it was storing a
3284 mapping of selector names to <strong>declaration lists</strong>, which are lists of
3285 property/value pairs.</p>
3286 <p>So, when I started <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/08f28816868878962647651118d5a1eb95d06d17">porting the CSS parsing code to
3287 Rust</a>,
3288 I started to create real types for each concept.</p>
3289 <div class="highlight"><pre><span></span><code><span class="c1">// Maps property_name -&gt; Declaration</span>
3290 <span class="k">type</span> <span class="nc">DeclarationList</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">Declaration</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
3291
3292 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">CssStyles</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3293 <span class="w"> </span><span class="n">selectors_to_declarations</span>: <span class="nc">HashMap</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">DeclarationList</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3294 <span class="p">}</span><span class="w"></span>
3295 </code></pre></div>
3296
3297 <p>Even though the keys of those HashMaps are still strings, because
3298 librsvg didn't have a better way to represent their corresponding
3299 concepts, at least those declarations let one see what the hell is
3300 being stored without grepping the rest of the code. This is a part of
3301 the code that I didn't really touch very much, so it was nice to have
3302 that reminder.</p>
3303 <p>The <a href="https://gitlab.gnome.org/GNOME/librsvg/commit/12dd9b1bb039484ce273228049f96f8e39d2f339">first port of the CSS matching code to
3304 Rust</a>
3305 kept the same algorithm as the C code, the one that created strings
3306 with <code>element.class</code> and compared them to the stored selector names.
3307 Ugly, but it still worked in the same limited fashion.</p>
3308 <h2>Rustifying the CSS parsers</h2>
3309 <p>It turns out that CSS parsing is divided into two parts. One can have a
3310 <code>style</code> attribute inside an element, for example:</p>
3311 <div class="highlight"><pre><span></span><code><span class="nt">&lt;rect</span> <span class="na">x=</span><span class="s">&quot;0&quot;</span> <span class="na">y=</span><span class="s">&quot;0&quot;</span> <span class="na">width=</span><span class="s">&quot;100&quot;</span> <span class="na">height=</span><span class="s">&quot;100&quot;</span>
3312 <span class="na">style=</span><span class="s">&quot;fill: green; stroke: magenta; stroke-width: 4;&quot;</span><span class="nt">/&gt;</span>
3313 </code></pre></div>
3314
3315 <p>This is a plain declaration list which is not associated with any
3316 selectors, and which is applied directly to just the element in which it
3317 appears.</p>
3318 <p>Then, there is the <code>&lt;style&gt;</code> element itself, with a normal-looking CSS stylesheet:</p>
3319 <div class="highlight"><pre><span></span><code><span class="nt">&lt;style</span> <span class="na">type=</span><span class="s">&quot;text/css&quot;</span><span class="nt">&gt;</span>
3320 rect {
3321 fill: green;
3322 stroke: magenta;
3323 stroke-width: 4;
3324 }
3325 <span class="nt">&lt;/style&gt;</span>
3326 </code></pre></div>
3327
3328 <p>This means that all <code>&lt;rect&gt;</code> elements will get that style applied.</p>
3329 <p>I started to look for existing Rust crates to parse and handle CSS
3330 data. The <a href="https://docs.rs/cssparser/">cssparser</a> and
3331 <a href="https://docs.rs/selectors/">selectors</a> crates come from Mozilla, so
3332 I thought they should do a pretty good job of things.</p>
3333 <p>And they do! Except that they are not a drop-in replacement for
3334 anything. They are what gets used in Mozilla's Servo browser engine,
3335 so they are optimized to hell, and the code can be pretty intimidating.</p>
3336 <p>Out of the box, cssparser provides a CSS tokenizer, but it does
3337 not know how to handle any properties/values in particular. One must
3338 use the tokenizer to implement a parser for each kind of CSS property
3339 one wants to support — Servo has mountains of code for all of HTML's
3340 style properties, and librsvg had to provide a smaller mountain of code
3341 for SVG style properties.</p>
3342 <p>Thus started the big task of porting librsvg's string-based parsers
3343 for CSS properties into ones based on cssparser tokens. Cssparser
3344 provides a <code>Parser</code> struct, which extracts tokens out of a CSS
3345 stream. Out of this, librsvg defines a <code>Parse</code> trait for parsable
3346 things:</p>
3347 <div class="highlight"><pre><span></span><code><span class="k">use</span><span class="w"> </span><span class="n">cssparser</span>::<span class="n">Parser</span><span class="p">;</span><span class="w"></span>
3348
3349 <span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Parse</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"></span>
3350 <span class="w"> </span><span class="k">type</span> <span class="nb">Err</span><span class="p">;</span><span class="w"></span>
3351
3352 <span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="n">parser</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Parser</span><span class="o">&lt;&#39;</span><span class="nb">_</span><span class="p">,</span><span class="w"> </span><span class="o">&#39;</span><span class="nb">_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="nb">Err</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
3353 <span class="p">}</span><span class="w"></span>
3354 </code></pre></div>
3355
3356 <p>What's with those two anonymous lifetimes in <code>Parser&lt;'_, '_&gt;</code>?
3357 Cssparser tries very hard to be a zero-copy tokenizer. One of the
3358 lifetimes refers to the input string which is wrapped in a
3359 <code>Tokenizer</code>, which is wrapped in a <code>ParserInput</code>. The other lifetime
3360 is for the <code>ParserInput</code> itself.</p>
3361 <p>In the actual implementation of that trait, the <code>Err</code> type also uses
3362 the lifetime that refers to the input string. For example, there is a
3363 <code>BasicParseErrorKind::UnexpectedToken(Token&lt;'i&gt;)</code>, which one returns
3364 when there is an unexpected token. And to avoid copying the substring
3365 into the error, one returns a slice reference into the original
3366 string, thus the lifetime.</p>
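<p>The same idea can be shown without cssparser at all. In this sketch, <code>NumberError</code> and <code>parse_number</code> are made-up names; the point is only that the error variant borrows a slice of the input, so its type has to carry the input's lifetime <code>'i</code>.</p>

```rust
/// An error that borrows the offending text from the input, like a token would.
#[derive(Debug, PartialEq)]
pub enum NumberError<'i> {
    Empty,
    UnexpectedToken(&'i str),
}

/// Parses a leading number, returning the value and the unconsumed input.
pub fn parse_number<'i>(input: &'i str) -> Result<(f64, &'i str), NumberError<'i>> {
    let input = input.trim_start();
    if input.is_empty() {
        return Err(NumberError::Empty);
    }

    // The first whitespace-delimited word is the candidate token.
    let end = input.find(char::is_whitespace).unwrap_or(input.len());
    let (token, rest) = input.split_at(end);

    match token.parse::<f64>() {
        Ok(value) => Ok((value, rest)),
        // No allocation: the error holds a slice of the original string.
        Err(_) => Err(NumberError::UnexpectedToken(token)),
    }
}
```

<p>Because the error cannot outlive the string it points into, the borrow checker forces callers to either handle the error while the input is alive or copy the text out, which is where much of the lifetime wrangling comes from.</p>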
3367 <p>I was more of a Rust newbie back then, and it was very hard to make
3368 sense of how cssparser was meant to be used.</p>
3369 <p>The process was more or less this:</p>
3370 <ul>
3371 <li>
3372 <p>Port the C parsers to Rust; implement types for each CSS property.</p>
3373 </li>
3374 <li>
3375 <p>Port the <code>&amp;str</code>-based parsers into ones that use <code>cssparser</code>.</p>
3376 </li>
3377 <li>
3378 <p>Fix the error handling scheme to match what cssparser's high-level
3379 traits expect.</p>
3380 </li>
3381 </ul>
3382 <p>This last point was... hard. Again, I wasn't comfortable enough with
3383 Rust lifetimes and nested generics; in the end it was all right.</p>
3384 <h2>Moving declaration lists to Rust</h2>
3385 <p>With the individual parsers for CSS properties done, and with them
3386 already using a different type for each property, the next thing was
3387 to implement cssparser's traits to parse declaration lists.</p>
3388 <p>Again, a declaration list looks like this:</p>
3389 <div class="highlight"><pre><span></span><code><span class="nt">fill</span><span class="o">:</span> <span class="nt">blue</span><span class="o">;</span>
3390 <span class="nt">stroke-width</span><span class="o">:</span> <span class="nt">4</span><span class="o">;</span>
3391 </code></pre></div>
3392
3393 <p>It's essentially a key/value list.</p>
3394 <p>The trait that cssparser wants us to implement is this:</p>
3395 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">DeclarationParser</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3396 <span class="w"> </span><span class="k">type</span> <span class="nc">Declaration</span><span class="p">;</span><span class="w"></span>
3397 <span class="w"> </span><span class="k">type</span> <span class="nc">Error</span>: <span class="o">&#39;</span><span class="na">i</span><span class="p">;</span><span class="w"></span>
3398
3399 <span class="w"> </span><span class="k">fn</span> <span class="nf">parse_value</span><span class="o">&lt;&#39;</span><span class="na">t</span><span class="o">&gt;</span><span class="p">(</span><span class="w"></span>
3400 <span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"></span>
3401 <span class="w"> </span><span class="n">name</span>: <span class="nc">CowRcStr</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3402 <span class="w"> </span><span class="n">input</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Parser</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="p">,</span><span class="w"> </span><span class="o">&#39;</span><span class="na">t</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
3403 <span class="w"> </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Declaration</span><span class="p">,</span><span class="w"> </span><span class="n">ParseError</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Error</span><span class="o">&gt;&gt;</span><span class="p">;</span><span class="w"></span>
3404 <span class="p">}</span><span class="w"></span>
3405 </code></pre></div>
3406
3407 <p>That is, define a type for a <code>Declaration</code>, and implement a
3408 <code>parse_value()</code> method that takes a <code>name</code> and a <code>Parser</code>, and outputs
3409 a <code>Declaration</code> or an error.</p>
3410 <p>What this <em>really</em> means is that the type you implement for
3411 <code>Declaration</code> needs to be able to represent all the CSS property types
3412 that you care about. Thus, a struct plus a big enum like this:</p>
3413 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Declaration</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3414 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">prop_name</span>: <span class="nb">String</span><span class="p">,</span><span class="w"></span>
3415 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">property</span>: <span class="nc">ParsedProperty</span><span class="p">,</span><span class="w"></span>
3416 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">important</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
3417 <span class="p">}</span><span class="w"></span>
3418
3419 <span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">ParsedProperty</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3420 <span class="w"> </span><span class="n">BaselineShift</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">BaselineShift</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
3421 <span class="w"> </span><span class="n">ClipPath</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">ClipPath</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
3422 <span class="w"> </span><span class="n">ClipRule</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">ClipRule</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
3423 <span class="w"> </span><span class="n">Color</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">Color</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
3424 <span class="w"> </span><span class="n">ColorInterpolationFilters</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">ColorInterpolationFilters</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
3425 <span class="w"> </span><span class="n">Direction</span><span class="p">(</span><span class="n">SpecifiedValue</span><span class="o">&lt;</span><span class="n">Direction</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
3426 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
3427 <span class="p">}</span><span class="w"></span>
3428 </code></pre></div>
3429
3430 <p>This gives us declaration lists (the stuff inside curly braces in a
3431 CSS stylesheet), but it doesn't give us qualified rules, which are
3432 composed of selector names plus a declaration list.</p>
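<p>Stripped of cssparser's machinery, the shape of such a <code>parse_value()</code> is a dispatch on the property name. The following is a toy sketch, not cssparser's API: it takes plain string slices instead of a <code>Parser</code>, the two-variant <code>ParsedProperty</code> and the <code>ValueError</code> type are assumptions, and the values are far simpler than librsvg's <code>SpecifiedValue</code> wrappers.</p>

```rust
/// Toy version of the big enum: one variant per supported property.
#[derive(Debug, PartialEq)]
pub enum ParsedProperty {
    Fill(String),
    StrokeWidth(f64),
}

#[derive(Debug, PartialEq)]
pub struct Declaration {
    pub prop_name: String,
    pub property: ParsedProperty,
    pub important: bool,
}

/// Hypothetical error type; both variants borrow from the input.
#[derive(Debug, PartialEq)]
pub enum ValueError<'i> {
    UnknownProperty(&'i str),
    InvalidValue(&'i str),
}

/// Picks the right parser for `name` and wraps the result in a Declaration.
pub fn parse_value<'i>(name: &'i str, value: &'i str) -> Result<Declaration, ValueError<'i>> {
    let (value, important) = match value.trim().strip_suffix("!important") {
        Some(rest) => (rest.trim_end(), true),
        None => (value.trim(), false),
    };

    let property = match name {
        "fill" => ParsedProperty::Fill(value.to_string()),
        "stroke-width" => value
            .parse::<f64>()
            .map(ParsedProperty::StrokeWidth)
            .map_err(|_| ValueError::InvalidValue(value))?,
        _ => return Err(ValueError::UnknownProperty(name)),
    };

    Ok(Declaration {
        prop_name: name.to_string(),
        property,
        important,
    })
}
```

<p>Every property that should be supported needs its own arm and its own value parser, which is why the real enum is so long.</p>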
3433 <h2>Refactoring towards real CSS concepts</h2>
3434 <p>Paolo Borelli has been steadily refactoring librsvg and fixing things
3435 like the primitive obsession I mentioned above. We now have real
3436 concepts like a Document, Stylesheet, QualifiedRule, Rule, AtRule.</p>
3437 <p>This refactoring took a long time, because it involved redoing the XML
3438 loading code and its interaction with the CSS parser a few times.</p>
3439 <h2>Implementing traits from the selectors crate</h2>
3440 <p>The <a href="https://docs.rs/selectors">selectors</a> crate
3441 contains Servo's code for parsing CSS selectors and doing matching.
3442 However, it is <em>extremely</em> generic. Using it involves implementing a
3443 good number of concepts.</p>
3444 <p>For example, this <code>SelectorImpl</code> trait has no methods, and is just a
3445 collection of types that refer to your implementation of an element
3446 tree. How do you represent an attribute/value? How do you represent
3447 an identifier? How do you represent a namespace and a local name?</p>
3448 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">SelectorImpl</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3449 <span class="w"> </span><span class="k">type</span> <span class="nc">ExtraMatchingData</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3450 <span class="w"> </span><span class="k">type</span> <span class="nc">AttrValue</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3451 <span class="w"> </span><span class="k">type</span> <span class="nc">Identifier</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3452 <span class="w"> </span><span class="k">type</span> <span class="nc">ClassName</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3453 <span class="w"> </span><span class="k">type</span> <span class="nc">PartName</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3454 <span class="w"> </span><span class="k">type</span> <span class="nc">LocalName</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3455 <span class="w"> </span><span class="k">type</span> <span class="nc">NamespaceUrl</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3456 <span class="w"> </span><span class="k">type</span> <span class="nc">NamespacePrefix</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3457 <span class="w"> </span><span class="k">type</span> <span class="nc">BorrowedNamespaceUrl</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3458 <span class="w"> </span><span class="k">type</span> <span class="nc">BorrowedLocalName</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3459 <span class="w"> </span><span class="k">type</span> <span class="nc">NonTSPseudoClass</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3460 <span class="w"> </span><span class="k">type</span> <span class="nc">PseudoElement</span>: <span class="o">..</span><span class="p">.;</span><span class="w"></span>
3461 <span class="p">}</span><span class="w"></span>
3462 </code></pre></div>
3463
3464 <p>A lot of those can be <code>String</code>, but Servo has smarter things in store.
3465 I ended up using the <a href="https://docs.rs/markup5ever"><code>markup5ever</code></a> crate, which provides a string
3466 interning framework for markup and XML concepts like a <code>LocalName</code>, a
3467 <code>Namespace</code>, etc. This reduces memory consumption, because instead of
3468 storing string copies of element names everywhere, one just stores
3469 tokens for interned strings.</p>
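<p>To illustrate what interning buys, here is a deliberately naive interner. It is not how <code>markup5ever</code> is implemented; <code>Atom</code> and <code>Interner</code> here are hypothetical types that only show the principle of handing out small tokens for repeated strings.</p>

```rust
use std::collections::HashMap;

/// A small copyable token that stands in for an interned string.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Atom(u32);

/// Naive interner; it keeps each distinct name in both a map and a vector.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, Atom>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the existing token for `name`, or creates a new one.
    pub fn intern(&mut self, name: &str) -> Atom {
        if let Some(&atom) = self.ids.get(name) {
            return atom;
        }

        let atom = Atom(self.names.len() as u32);
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), atom);
        atom
    }

    /// Looks up the string behind a token created by this interner.
    pub fn resolve(&self, atom: Atom) -> &str {
        &self.names[atom.0 as usize]
    }
}
```

<p>A document with thousands of <code>rect</code> elements then stores thousands of four-byte tokens instead of thousands of string copies, and comparing two element names becomes an integer comparison.</p>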
3470 <p>(In the meantime I had to implement support for XML namespaces, which
3471 the selectors code really wants, but which librsvg never supported.)</p>
3472 <p>Then, the selectors crate wants you to say how your code implements an
3473 element tree. It has a monster trait <code>Element</code>:</p>
3474 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Element</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
3475 <span class="w"> </span><span class="k">type</span> <span class="nc">Impl</span>: <span class="nc">SelectorImpl</span><span class="p">;</span><span class="w"></span>
3476
3477 <span class="w"> </span><span class="k">fn</span> <span class="nf">opaque</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">OpaqueElement</span><span class="p">;</span><span class="w"></span>
3478
3479 <span class="w"> </span><span class="k">fn</span> <span class="nf">parent_element</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
3480
3481 <span class="w"> </span><span class="k">fn</span> <span class="nf">parent_node_is_shadow_root</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">;</span><span class="w"></span>
3482
3483 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
3484
3485 <span class="w"> </span><span class="k">fn</span> <span class="nf">prev_sibling_element</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
3486 <span class="w"> </span><span class="k">fn</span> <span class="nf">next_sibling_element</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
3487
3488 <span class="w"> </span><span class="k">fn</span> <span class="nf">has_local_name</span><span class="p">(</span><span class="w"></span>
3489 <span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span>
3490 <span class="w"> </span><span class="n">local_name</span>: <span class="kp">&amp;</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Impl</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">SelectorImpl</span><span class="o">&gt;</span>::<span class="n">BorrowedLocalName</span><span class="w"></span>
3491 <span class="w"> </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">;</span><span class="w"></span>
3492
3493 <span class="w"> </span><span class="k">fn</span> <span class="nf">has_id</span><span class="p">(</span><span class="w"></span>
3494 <span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"></span>
3495 <span class="w"> </span><span class="n">id</span>: <span class="kp">&amp;</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Impl</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">SelectorImpl</span><span class="o">&gt;</span>::<span class="n">Identifier</span><span class="p">,</span><span class="w"></span>
3496 <span class="w"> </span><span class="n">case_sensitivity</span>: <span class="nc">CaseSensitivity</span><span class="p">,</span><span class="w"></span>
3497 <span class="w"> </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">;</span><span class="w"></span>
3498
3499 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
3500 <span class="p">}</span><span class="w"></span>
3501 </code></pre></div>
3502
3503 <p>That is, when you provide an implementation of <code>Element</code> and
3504 <code>SelectorImpl</code>, the selectors crate will know how to navigate your
3505 element tree and ask it questions like, "does this element have the id
3506 <code>#foo</code>?"; "does this element have the name <code>rect</code>?". It makes perfect
3507 sense in the end, but it is quite intimidating when you are not 100%
3508 comfortable with webs of traits and associated types and generics with
3509 a bunch of trait bounds!</p>
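<p>As a rough illustration of the idea, here is a heavily reduced sketch. The
trait <code>TreeElement</code>, the function <code>matches_descendant</code>, and the
<code>Node</code>/<code>NodeRef</code> types are all hypothetical; the real <code>Element</code> trait has
many more methods and uses associated types instead of plain <code>&amp;str</code>. The
point is only that the matching code is generic, and the tree answers its
questions through the trait.</p>

```rust
/// Hypothetical, heavily reduced stand-in for the kind of trait a selector
/// matcher needs; the real `selectors::Element` has many more methods.
pub trait TreeElement: Sized {
    fn parent_element(&self) -> Option<Self>;
    fn has_local_name(&self, name: &str) -> bool;
    fn has_id(&self, id: &str) -> bool;
}

/// Matches a descendant selector such as `g rect`: the element must be
/// named `name` and have some ancestor named `ancestor`.
pub fn matches_descendant<E: TreeElement>(element: &E, ancestor: &str, name: &str) -> bool {
    if !element.has_local_name(name) {
        return false;
    }
    let mut current = element.parent_element();
    while let Some(e) = current {
        if e.has_local_name(ancestor) {
            return true;
        }
        current = e.parent_element();
    }
    false
}

/// A toy tree stored in a slice; each node records its parent's index.
pub struct Node {
    pub name: &'static str,
    pub id: Option<&'static str>,
    pub parent: Option<usize>,
}

/// A cheap handle to one node of the tree.
#[derive(Clone, Copy)]
pub struct NodeRef<'a> {
    pub nodes: &'a [Node],
    pub index: usize,
}

impl<'a> TreeElement for NodeRef<'a> {
    fn parent_element(&self) -> Option<Self> {
        let nodes = self.nodes;
        nodes[self.index].parent.map(|index| NodeRef { nodes, index })
    }

    fn has_local_name(&self, name: &str) -> bool {
        self.nodes[self.index].name == name
    }

    fn has_id(&self, id: &str) -> bool {
        self.nodes[self.index].id.map_or(false, |own| own == id)
    }
}
```

<p>The matcher never looks inside <code>Node</code>; it only walks the tree through
<code>parent_element</code> and asks yes/no questions, which is the same division of
labor the selectors crate expects.</p>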
3510 <p>I tried implementing that trait twice in the last year, and failed.
3511 It turns out that its API <a href="https://github.com/servo/servo/issues/22972">needed a key
3512 fix</a> that landed last
3513 June, but I didn't notice until a couple of weeks ago.</p>
3514 <h2>So?</h2>
3515 <p>Two days ago, Paolo and I committed the <a href="https://gitlab.gnome.org/federico/librsvg/merge_requests/6">last code to be able to
3516 completely replace
3517 libcroco</a>.</p>
3518 <p>And, after implementing CSS <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/525">specificity</a> (which was easy now that we
3519 have real CSS concepts and a good pipeline for the CSS cascade), a
3520 bunch of very old bugs started falling down
3521 (<a href="https://gitlab.gnome.org/GNOME/librsvg/issues/336">1</a>
3522 <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/466">2</a>
3523 <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/428">3</a>
3524 <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/167">4</a>
3525 <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/79">5</a>
3526 <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/441">6</a>).</p>
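<p>For readers unfamiliar with specificity, here is a minimal sketch of the
concept, not librsvg's actual code. The types <code>Specificity</code> and
<code>Declaration</code> and the function <code>winning</code> are invented for illustration,
and the sketch ignores <code>!important</code> and stylesheet origins. Specificity is a
triple (ids, classes and attributes, type names) compared left to right, and
ties go to the rule that appears later.</p>

```rust
/// Specificity as (ids, classes/attributes/pseudo-classes, type names);
/// deriving `Ord` on a tuple struct compares the fields left to right.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Specificity(pub u32, pub u32, pub u32);

/// Hypothetical declaration record, for illustration only.
#[derive(Clone, Debug, PartialEq)]
pub struct Declaration {
    pub value: &'static str,
    pub specificity: Specificity,
    /// Position of the rule in the stylesheet; later rules win ties.
    pub source_order: usize,
}

/// Picks the declaration that wins among those whose selectors matched.
pub fn winning<'a>(matched: &'a [Declaration]) -> Option<&'a Declaration> {
    matched
        .iter()
        .max_by_key(|d| (d.specificity, d.source_order))
}
```

<p>For example, a declaration from <code>#foo</code> (one id) beats one from <code>rect.bar</code>
(one class, one type name) regardless of which comes first in the stylesheet.</p>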
3527 <p>Now it is going to be easy to implement things like <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/379">letting the
3528 application specify a user
3529 stylesheet</a>. In
3530 particular, this should let GTK remove the <a href="https://gitlab.gnome.org/GNOME/gtk/blob/ad48bbb8/gtk/gtkicontheme.c#L3728-3750">rather egregious
3531 hack</a>
3532 it has to recolor SVG icons while using librsvg indirectly.</p>
3533 <h2>Conclusion</h2>
3534 <p>This will appear in librsvg 2.47.1 — that version will no longer
3535 require libcroco.</p>
3536 <p>As far as I know, the only module that still depends on libcroco (in
3537 GNOME or otherwise) is <strong>gnome-shell</strong>. It uses libcroco to parse CSS
3538 and get the basic structure of selectors so it can implement matching
3539 by hand.</p>
3540 <p>Gnome-shell has some code which looks awfully similar to what librsvg
3541 had when it was written in C:</p>
3542 <ul>
3543 <li>
3544 <p><a href="https://gitlab.gnome.org/GNOME/gnome-shell/blob/66fc5c07/src/st/st-theme.c">StTheme</a>
3545 has the high-level CSS stylesheet parser and the selector matching code.</p>
3546 </li>
3547 <li>
3548 <p><a href="https://gitlab.gnome.org/GNOME/gnome-shell/blob/66fc5c07/src/st/st-theme-node.c">StThemeNode</a>
3549 has the low-level CSS property parsers.</p>
3550 </li>
3551 </ul>
3552 <p>... and it turns out that those files come all the way from
3553 <a href="https://blog.ometer.com/2006/10/14/text-layout-that-works-properly/">HippoCanvas</a>,
3554 the CSS-aware canvas that Mugshot used! Mugshot was a circa-2006
3555 pre-Facebook aggregator for social media data like blogs, Flickr
3556 pictures, etc. HippoCanvas also got used in Sugar, the GUI for
3557 One Laptop Per Child. Yes, our code is <em>that</em> old.</p>
3558 <p>Libcroco is unmaintained, and has outstanding CVEs. I would be very
3559 happy to assist someone in porting gnome-shell's CSS code to Rust :)</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category></entry><entry><title>Gdk-pixbuf modules - call for help</title><link href="https://people.gnome.org/~federico/blog/gdk-pixbuf-modules.html" rel="alternate"></link><published>2019-09-11T17:54:03-05:00</published><updated>2019-09-12T13:15:17-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-09-11:/~federico/blog/gdk-pixbuf-modules.html</id><summary type="html"><p>I've been doing a little refactoring of gdk-pixbuf's crufty code, to
3560 see if the gripes from my <a href="https://people.gnome.org/~federico/blog/my-gdk-pixbuf-braindump.html">braindump</a> can be solved. For
3561 things where it is not obvious how to proceed, I've started taking
3562 more detailed notes in a <a href="https://gitlab.gnome.org/federico/gdk-pixbuf-survey">gdk-pixbuf survey</a>.</p>
3563 <p>Today I was looking at which <a href="https://gitlab.gnome.org/federico/gdk-pixbuf-survey/blob/master/src/modules.md">gdk-pixbuf modules</a> are …</p></summary><content type="html"><p>I've been doing a little refactoring of gdk-pixbuf's crufty code, to
3564 see if the gripes from my <a href="https://people.gnome.org/~federico/blog/my-gdk-pixbuf-braindump.html">braindump</a> can be solved. For
3565 things where it is not obvious how to proceed, I've started taking
3566 more detailed notes in a <a href="https://gitlab.gnome.org/federico/gdk-pixbuf-survey">gdk-pixbuf survey</a>.</p>
3567 <p>Today I was looking at which <a href="https://gitlab.gnome.org/federico/gdk-pixbuf-survey/blob/master/src/modules.md">gdk-pixbuf modules</a> are
3568 implemented by third parties, that is, which external projects provide
3569 their own image codecs pluggable into gdk-pixbuf.</p>
3570 <p>And there are not that many!</p>
3571 <p>The only four that I found are <strong>libheif, libopenraw, libwmf,
3572 librsvg</strong> (this last one, of course).</p>
3573 <p><em>Update 2019/Sep/12</em> - Added <strong>apng, exif-raw, psd, pvr, vtf, webp, xcf</strong>.</p>
3574 <p>All of those use the gdk-pixbuf module API in a remarkably similar
3575 fashion. Did they cut&amp;paste each other's code? Did they do the
3576 simplest thing that didn't crash in gdk-pixbuf's checks for buggy
3577 loaders, which happens to be exactly what they do? Who knows! Either
3578 way, this makes future API changes in the modules a lot easier, since
3579 they all do the same thing right now.</p>
3580 <p>I'm trying to decide between these:</p>
3581 <ul>
3582 <li>
3583 <p>Keep modules as they are; find a way to sandbox them from gdk-pixbuf
3584 itself. This is hard because the API is "chatty"; modules and
3585 calling code go back and forth peeking at each other's structures.</p>
3586 </li>
3587 <li>
3588 <p>Decide that third-party modules are only useful for thumbnailers;
3589 modify them to <em>be</em> thumbnailers instead of generic gdk-pixbuf
3590 modules. This would mean that those formats would stop working
3591 automatically in gdk-pixbuf based viewers like EOG.</p>
3592 </li>
3593 <li>
3594 <p>Have "blessed" codecs inside gdk-pixbuf which are not modules so
3595 they no longer have API/ABI stability constraints. Keep
3596 third-party modules separate. Sandbox the internal ones with a
3597 non-chatty API.</p>
3598 </li>
3599 <li>
3600 <p>If all third-party modules indeed work as <a href="https://gitlab.gnome.org/federico/gdk-pixbuf-survey/blob/master/src/modules.md">I found</a>, the
3601 module API can be simplified quite a lot since no third-party
3602 modules implement animations or saving. If so, simplify the module
3603 API and the gdk-pixbuf internals rather drastically.</p>
3604 </li>
3605 </ul>
3606 <p>Do you know any other image formats which provide <a href="https://gitlab.gnome.org/federico/gdk-pixbuf-survey/blob/master/src/modules.md">gdk-pixbuf
3607 modules</a>? <a href="mailto:federico@gnome.org">Mail me, please!</a></p></content><category term="misc"></category><category term="gdk-pixbuf"></category></entry><entry><title>On responsible vulnerability disclosure</title><link href="https://people.gnome.org/~federico/blog/on-responsible-vulnerability-disclosure.html" rel="alternate"></link><published>2019-08-10T21:05:34-05:00</published><updated>2019-08-10T21:05:34-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-08-10:/~federico/blog/on-responsible-vulnerability-disclosure.html</id><summary type="html"><p>Recently KDE had an unfortunate event. Someone <a href="https://gist.github.com/zeropwn/630832df151029cb8f22d5b6b9efaefb">found a
3608 vulnerability</a> in the code that processes <code>.desktop</code> and
3609 <code>.directory</code> files, through which an attacker could create a malicious
3610 file that causes shell command execution (<a href="https://paper.seebug.org/1008/">analysis</a>). They went for immediate,
3611 full disclosure, where KDE didn't even get a chance of fixing the …</p></summary><content type="html"><p>Recently KDE had an unfortunate event. Someone <a href="https://gist.github.com/zeropwn/630832df151029cb8f22d5b6b9efaefb">found a
3612 vulnerability</a> in the code that processes <code>.desktop</code> and
3613 <code>.directory</code> files, through which an attacker could create a malicious
3614 file that causes shell command execution (<a href="https://paper.seebug.org/1008/">analysis</a>). They went for immediate,
3615 full disclosure, where KDE didn't even get a chance of fixing the bug
3616 before it was published.</p>
3617 <p>There are many protocols for <a href="https://en.wikipedia.org/wiki/Responsible_disclosure">disclosing vulnerabilities in a
3618 coordinated, responsible fashion</a>, but the gist of them is this:</p>
3619 <ol>
3620 <li>
3621 <p>Someone finds a vulnerability in some software through studying
3622 some code, or some other mechanism.</p>
3623 </li>
3624 <li>
3625 <p>They report the vulnerability to the software's author through some
3626 private channel. For free software in particular, researchers can
3627 use <a href="https://oss-security.openwall.org/wiki/disclosure/researcher">Openwall's recommended process for
3628 researchers</a>, which includes <strong>notifying the
3629 author/maintainer and distros and security groups.</strong> Free
3630 software projects can <a href="https://alexgaynor.net/2013/oct/19/security-process-open-source-projects/">follow a well-established process</a>.</p>
3631 </li>
3632 <li>
3633 <p>The author and reporter agree on a deadline for releasing a public
3634 report of the vulnerability, or in semi-automated systems like
3635 Google's Project Zero, a deadline is automatically established.</p>
3636 </li>
3637 <li>
3638 <p>The author works on fixing the vulnerability.</p>
3639 </li>
3640 <li>
3641 <p>The deadline is reached; the patch has been publicly released,
3642 the appropriate people have been notified, systems have been
3643 patched. If there is no patch, the author and reporter can agree
3644 on postponing the date, or the reporter can publish the
3645 vulnerability report, thus creating public pressure for a fix.</p>
3646 </li>
3647 </ol>
3648 <p>The steps above gloss over many practicalities and issues from the
3649 real world, but the idea is basically this: the author or maintainer
3650 of the software is given a chance to fix a security bug before
3651 information on the vulnerability is released to the hostile world.
3652 The idea is to <strong>keep harm from being done</strong> by not publishing
3653 unpatched vulnerabilities until there is a fix for them (... or until
3654 the deadline expires).</p>
3655 <h2>What happened instead</h2>
3656 <p>Around the beginning of July, the reporter <a href="https://twitter.com/zer0pwn/status/1146554083056193536">posts about looking for
3657 bugs in KDE</a>.</p>
3658 <p>On <a href="https://twitter.com/zer0pwn/status/1156312405472923648">July 30</a>, he posts a video with the proof of concept.</p>
3659 <p>On August 3, he <a href="https://twitter.com/zer0pwn/status/1157760759289528321">makes a Twitter poll</a> about what to do with the
3660 vulnerability.</p>
3661 <p>On August 4, he <a href="https://twitter.com/zer0pwn/status/1158167374799020039">publishes the vulnerability</a>.</p>
3662 <p>KDE is left with having to patch this in emergency mode. On August 7,
3663 KDE releases a <a href="https://kde.org/info/security/advisory-20190807-1.txt">security advisory</a> in perfect form:</p>
3664 <ul>
3665 <li>
3666 <p>Description of exactly what causes the vulnerability.</p>
3667 </li>
3668 <li>
3669 <p>Description of how it was solved.</p>
3670 </li>
3671 <li>
3672 <p>Instructions on what to do for users of various versions of KDE
3673 libraries.</p>
3674 </li>
3675 <li>
3676 <p>Links to easy-to-cherry-pick patches for distro vendors.</p>
3677 </li>
3678 </ul>
3679 <p>Now, distro vendors are, in turn, in emergency mode, as they must
3680 apply the patch, run it through QA, release their own advisories,
3681 etc.</p>
3682 <h2>What if this had been done with coordinated disclosure?</h2>
3683 <p>The bug would have been fixed, probably in the same way, <em>but it would
3684 not be in emergency mode</em>. <a href="https://kde.org/info/security/advisory-20190807-1.txt">KDE's advisory</a> contains this:</p>
3685 <blockquote>
3686 <p>Thanks to Dominik Penner for finding and documenting this issue (we wish however that he would
3687 have contacted us before making the issue public) and to David Faure for the fix.</p>
3688 </blockquote>
3689 <p>This is an extremely gracious way of thanking the reporter.</p>
3690 <h2>I am not an infosec person...</h2>
3691 <p>... but some behaviors in the infosec sphere are deeply uncomfortable
3692 to me. I don't like it when security "research" is hard to tell from
3693 vandalism. "Excuse me, you left your car door unlocked" vs. "Hey
3694 everyone, this car is unlocked, have at it".</p>
3695 <p>I don't know the details of the discourse in the infosec sphere around
3696 full disclosure against irresponsible vendors of proprietary software or
3697 services. However, <strong>KDE is free software</strong>! There is no need to be
3698 an asshole to them.</p></content><category term="misc"></category><category term="security"></category><category term="kde"></category></entry><entry><title>Constructors</title><link href="https://people.gnome.org/~federico/blog/constructors.html" rel="alternate"></link><published>2019-07-24T13:59:01-05:00</published><updated>2019-07-24T13:59:01-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-07-24:/~federico/blog/constructors.html</id><summary type="html"><p>Have you ever had these annoyances with GObject-style constructors?</p>
3699 <ul>
3700 <li>
3701 <p>From a constructor, calling a method on a partially-constructed
3702 object is dangerous.</p>
3703 </li>
3704 <li>
3705 <p>A constructor needs to set up "not quite initialized" values in the
3706 instance struct until a construct-time property is set.</p>
3707 </li>
3708 <li>
3709 <p>You actually need to override <code>GObjectClass::constructed</code> (or was …</p></li></ul></summary><content type="html"><p>Have you ever had these annoyances with GObject-style constructors?</p>
3710 <ul>
3711 <li>
3712 <p>From a constructor, calling a method on a partially-constructed
3713 object is dangerous.</p>
3714 </li>
3715 <li>
3716 <p>A constructor needs to set up "not quite initialized" values in the
3717 instance struct until a construct-time property is set.</p>
3718 </li>
3719 <li>
3720 <p>You actually need to override <code>GObjectClass::constructed</code> (or was it
3721 <code>::constructor</code>?) to take care of construct-only properties which
3722 need to be considered together, not individually.</p>
3723 </li>
3724 <li>
3725 <p>Constructors can't report an error, unless you derive from
3726 <code>GInitable</code>, which is not in gobject, but in gio instead. (Also,
3727 why does <em>that</em> force the constructor to take a <code>GCancellable</code>...?)</p>
3728 </li>
3729 <li>
3730 <p>You need more than one constructor, but that needs to be done with
3731 helper functions.</p>
3732 </li>
3733 </ul>
3734 <p>This article, <a href="https://matklad.github.io/2019/07/16/perils-of-constructors.html">Perils of
3735 Constructors</a>,
3736 explains all of these problems very well. It is not centered on
3737 GObject, but rather on constructors in object-oriented languages in
3738 general.</p>
3739 <p>(Spoiler: Rust does not have constructors or partially-initialized structs, so
3740 these problems don't really exist there.)</p>
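<p>A minimal sketch of what that looks like in practice, with an invented
<code>Thumbnail</code> type: a "constructor" in Rust is just an ordinary function that
validates its arguments and then builds the whole struct in one expression.
It can return an error, and having several of them needs no special
machinery.</p>

```rust
/// Hypothetical example type; all names here are invented for illustration.
#[derive(Debug, PartialEq)]
pub struct Thumbnail {
    width: u32,
    height: u32,
}

#[derive(Debug, PartialEq)]
pub enum ThumbnailError {
    ZeroSize,
}

impl Thumbnail {
    /// An ordinary associated function plays the role of a constructor.
    /// The struct only comes into existence fully initialized, in the
    /// final `Ok(...)` expression.
    pub fn new(width: u32, height: u32) -> Result<Thumbnail, ThumbnailError> {
        if width == 0 || height == 0 {
            return Err(ThumbnailError::ZeroSize);
        }
        Ok(Thumbnail { width, height })
    }

    /// A second "constructor" is just another function.
    pub fn square(side: u32) -> Result<Thumbnail, ThumbnailError> {
        Thumbnail::new(side, side)
    }

    /// Methods can only ever see a completely built value.
    pub fn area(&self) -> u32 {
        self.width * self.height
    }
}
```

<p>There is no point at which <code>area</code> could be called on a half-built
<code>Thumbnail</code>, because no such value can be named.</p>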
3741 <p>(Addendum: <em>that</em> makes it somewhat awkward to convert GObject code in
3742 C to Rust, but librsvg was able to solve it nicely with
3743 <code>&lt;buzzword&gt;</code><a href="http://cliffle.com/blog/rust-typestate/">the typestate pattern</a><code>&lt;/buzzword&gt;</code>.)</p></content><category term="misc"></category><category term="GObject"></category><category term="rust"></category></entry><entry><title>Gtk-rs tutorial</title><link href="https://people.gnome.org/~federico/blog/gtk-rs-tutorial.html" rel="alternate"></link><published>2019-07-08T18:36:11-05:00</published><updated>2019-07-08T18:36:11-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-07-08:/~federico/blog/gtk-rs-tutorial.html</id><content type="html"><p><a href="https://cybre.space/@tindall">Leonora Tindall</a> has written a
3744 very nice tutorial on <a href="https://nora.codes/tutorial/speedy-desktop-apps-with-gtk-and-rust/">Speedy Desktop Apps With GTK and
3745 Rust</a>.
3746 It covers prototyping a dice roller app with Glade, writing the code with
3747 Rust and the gtk-rs bindings, and integrating the app into the desktop with
3748 a <code>.desktop</code> file.</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Removing rsvg-view</title><link href="https://people.gnome.org/~federico/blog/removing-rsvg-view.html" rel="alternate"></link><published>2019-07-02T11:36:59-05:00</published><updated>2019-07-02T11:36:59-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-07-02:/~federico/blog/removing-rsvg-view.html</id><summary type="html"><p>I am preparing the 2.46.0 librsvg release. <strong>This will no longer have
3749 the rsvg-view-3 program.</strong></p>
3750 <h2>History of rsvg-view</h2>
3751 <p>Rsvg-view <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/c56e236c5b5fa89c365b87ac59bd476d67c5ee27/test-display.c">started
3752 out</a>
3753 as a 71-line C program to aid development of librsvg. It would just
3754 render an SVG file to a pixbuf, stick that pixbuf in a <code>GtkImage</code>
3755 widget …</p></summary><content type="html"><p>I am preparing the 2.46.0 librsvg release. <strong>This will no longer have
3756 the rsvg-view-3 program.</strong></p>
3757 <h2>History of rsvg-view</h2>
3758 <p>Rsvg-view <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/c56e236c5b5fa89c365b87ac59bd476d67c5ee27/test-display.c">started
3759 out</a>
3760 as a 71-line C program to aid development of librsvg. It would just
3761 render an SVG file to a pixbuf, stick that pixbuf in a <code>GtkImage</code>
3762 widget, and show a window with that.</p>
3763 <p>Over time, it slowly acquired most of the command-line options that
3764 <code>rsvg-convert</code> supports. And I suppose, as a way of testing the
3765 Cairo-ification of librsvg, it also got the ability to print SVG files
3766 to a <code>GtkPrintContext</code>. At last count, it was a 784-line C program
3767 that is not really the best code in the world.</p>
3768 <h2>What makes rsvg-view awkward?</h2>
3769 <p>Rsvg-view requires GTK. But GTK requires librsvg, indirectly, through
3770 gdk-pixbuf! There is not a hard circular dependency because GTK goes,
3771 "gdk-pixbuf, load me this SVG file" without knowing how it will be
3772 loaded. In turn, gdk-pixbuf initializes the SVG loader provided by
3773 librsvg, and that loader reads/renders the SVG file.</p>
3774 <p>Ideally librsvg would only depend on gdk-pixbuf, so it would be able
3775 to provide the SVG loader.</p>
3776 <p>The rsvg-view source code still has a few calls to GTK functions which
3777 are now deprecated. The program emits GTK warnings during normal use.</p>
3778 <p>Rsvg-view is... not a very good SVG viewer. It doesn't even start up
3779 with the window scaled properly to the SVG's dimensions! If used for
3780 quick testing during development, it cannot even aid in viewing the
3781 transparent background regions which the SVG does not cover. It just
3782 sticks a lousy custom widget inside a <code>GtkScrolledWindow</code>, and does
3783 not have the conventional niceties to view images like zooming with
3784 the scroll wheel.</p>
3785 <p><a href="https://wiki.gnome.org/Apps/EyeOfGnome/">EOG</a> is a much better SVG viewer than rsvg-view, and people actually
3786 invest effort in making it pleasant to use.</p>
3787 <h2>Removal of rsvg-view</h2>
3788 <p>So, the next version of librsvg will not provide the <code>rsvg-view-3</code>
3789 binary. Please update your packages accordingly. Distros may be able to
3790 move the compilation of librsvg to a more sensible place in the
3791 platform stack, now that it doesn't depend on GTK being available.</p>
3792 <p>What can you use instead? Any other image viewer. <a href="https://wiki.gnome.org/Apps/EyeOfGnome/">EOG</a> works fine;
3793 there are dozens of other good viewers, too.</p></content><category term="misc"></category><category term="librsvg"></category></entry><entry><title>Bzip2 1.0.7 is released</title><link href="https://people.gnome.org/~federico/blog/bzip2-107-is-released.html" rel="alternate"></link><published>2019-06-27T13:55:38-05:00</published><updated>2019-06-27T13:55:38-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-06-27:/~federico/blog/bzip2-107-is-released.html</id><summary type="html"><p><a href="https://sourceware.org/ml/bzip2-devel/2019-q2/msg00022.html">Bzip2 1.0.7 has been released</a> by Mark Wielaard. We have a
3794 slight change of plans since <a href="https://people.gnome.org/~federico/blog/preparing-the-bzip2-107-release.html">my last post</a>:</p>
3795 <ul>
3796 <li>
3797 <p>The 1.0.x series is in strict maintenance mode and will not change
3798 build systems. <strong>This is targeted towards embedded use</strong>, as in
3799 projects which already embed the …</p></li></ul></summary><content type="html"><p><a href="https://sourceware.org/ml/bzip2-devel/2019-q2/msg00022.html">Bzip2 1.0.7 has been released</a> by Mark Wielaard. We have a
3800 slight change of plans since <a href="https://people.gnome.org/~federico/blog/preparing-the-bzip2-107-release.html">my last post</a>:</p>
3801 <ul>
3802 <li>
3803 <p>The 1.0.x series is in strict maintenance mode and will not change
3804 build systems. <strong>This is targeted towards embedded use</strong>, as in
3805 projects which already embed the bzip2-1.0.6 sources and undoubtedly
3806 patch the build system. Right now this series, and the tagged 1.0.7
3807 release, live in the <a href="https://sourceware.org/git/?p=bzip2.git">sourceware repository for bzip2</a>.</p>
3808 </li>
3809 <li>
3810 <p>The 1.1.x series has Meson and CMake build systems, and a couple of
3811 extra changes to modernize the C code but which were not fit for the
3812 1.0.7 release. <strong>This is targeted towards operating system
3813 distributions</strong>. This lives in the master branch of the <a href="https://gitlab.com/federicomenaquintero/bzip2">gitlab
3814 repository for bzip2</a>.</p>
3815 </li>
3816 </ul>
3817 <p><strong>Distros and embedded users</strong> should start using bzip2-1.0.7
3818 immediately. The patches they already have for bzip2's
3819 traditional build system should still apply. The release includes bug
3820 fixes and security fixes that have accumulated over the years,
3821 including the new <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/74de1e2e6ffc9d51ef9824db71a8ffee5962cdbc">CVE-2019-12900</a>.</p>
3822 <p>Once 1.1.0 is released, distributions should be able to remove their
3823 patches to the build system and just start using Meson or CMake. You
3824 may want to monitor the <a href="https://gitlab.com/federicomenaquintero/bzip2/-/milestones/1">1.1.0 milestone</a> — help is
3825 appreciated fixing the issues there so we can make the first release
3826 with the new build systems!</p></content><category term="misc"></category><category term="bzip2"></category></entry><entry><title>Preparing the bzip2-1.0.7 release</title><link href="https://people.gnome.org/~federico/blog/preparing-the-bzip2-107-release.html" rel="alternate"></link><published>2019-06-20T11:47:26-05:00</published><updated>2019-06-20T11:47:26-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-06-20:/~federico/blog/preparing-the-bzip2-107-release.html</id><summary type="html"><p><strong>ATTENTION ALL DISTRIBUTIONS</strong>: this is for you. <strong>THE SONAME MAY CHANGE!</strong></p>
3827 <p>I am preparing a bzip2-1.0.7 release. You can see the <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/master/NEWS">release
3828 notes</a>, which should be of interest:</p>
3829 <ul>
3830 <li>
3831 <p>Many historical patches from various distributions are integrated
3832 now.</p>
3833 </li>
3834 <li>
3835 <p>We have a new fix for the just-published <a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-12900">CVE-2019-12900</a>, courtesy of …</p></li></ul></summary><content type="html"><p><strong>ATTENTION ALL DISTRIBUTIONS</strong>: this is for you. <strong>THE SONAME MAY CHANGE!</strong></p>
3836 <p>I am preparing a bzip2-1.0.7 release. You can see the <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/master/NEWS">release
3837 notes</a>, which should be of interest:</p>
3838 <ul>
3839 <li>
3840 <p>Many historical patches from various distributions are integrated
3841 now.</p>
3842 </li>
3843 <li>
3844 <p>We have a new fix for the just-published <a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-12900">CVE-2019-12900</a>, courtesy of
3845 Albert Astals Cid.</p>
3846 </li>
3847 <li>
3848 <p><strong>Bzip2 has moved to Meson</strong> for its preferred build system,
3849 courtesy of Dylan Baker. For special situations, a CMake build
3850 system is also provided, courtesy of Micah Snyder.</p>
3851 </li>
3852 </ul>
3853 <h2>What's with the soname?</h2>
3854 <p>From bzip2-1.0.1 (from the year 2000), until bzip2-1.0.6 (from 2010),
3855 release tarballs came with a special
3856 <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/962d60610cb31e0f294a834e55ebb355be55d05a/Makefile-libbz2_so#L38"><code>Makefile-libbz2_so</code></a> to generate a shared library
3857 instead of a static one.</p>
3858 <p>This never used libtool or anything; it specified linker flags by
3859 hand. Various distributions either patched this special makefile, or
3860 replaced it with another one, or outright replaced the complete build
3861 system with a different one.</p>
3862 <p>Some things to note:</p>
3863 <ul>
3864 <li>
3865 <p>This hand-written <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/962d60610cb31e0f294a834e55ebb355be55d05a/Makefile-libbz2_so#L38"><code>Makefile-libbz2_so</code></a> used a link
3866 line like <code>$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o
3867 libbz2.so.1.0.6</code>. This means, make the DT_SONAME field inside the
3868 ELF file be <code>libbz2.so.1.0</code> (note the two digits in <code>1.0</code>), and make
3869 the filename of the shared library be <code>libbz2.so.1.0.6</code>.</p>
3870 </li>
3871 <li>
3872 <p>Fedora patched the soname in a patch called "saneso" to just be
3873 <code>libbz2.so.1</code>.</p>
3874 </li>
3875 <li>
3876 <p>Stanislav Brabec, from openSUSE, <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/70ec984159c8263fdd4aac7c4670977aff0fe5b3#a4d1bfa62791f4ba7465f19c7f7da6b3b78714a5_0_30">replaced the hand-written
3877 makefiles with autotools</a>, which meant using libtool. It
3878 has this interesting note:</p>
3879 </li>
3880 </ul>
3881 <blockquote>
3882 <p>Incompatible changes:</p>
3883 <p>soname change. Libtool has no support for two parts soname suffix (e. g.
3884 libbz2.so.1.0). It must be a single number (e. g. libbz2.so.1). That is
3885 why soname must change. But I see not a big problem with it. Several
3886 distributions already use the new number instead of the non-standard
3887 number from Makefile-libbz2_so.</p>
3888 </blockquote>
3889 <p>(In fact, if I do <code>objdump -x /usr/lib64/*.so | grep SONAME</code>, I see
3890 that most libraries have single-digit sonames.)</p>
3891 <p>In my experience, both Fedora and openSUSE are very strict, and
3892 correct, about obscure things like library sonames.</p>
3893 <p>With the switch to Meson, bzip2 no longer uses libtool. It will have
3894 a single-digit soname — this is not in the <code>meson.build</code> yet, but
3895 expect it to be there within the next couple of days.</p>
3896 <p>I don't know what the distros that decided to preserve the <code>1.0</code> soname
3897 will need to do; maybe they will need to patch <code>meson.build</code> on their
3898 own.</p>
3899 <p>Fortunately, <strong>the API/ABI are still exactly the same</strong>. You can
3900 preserve the old soname that your distro was using, and linking against libbz2
3901 will probably keep working as usual.</p>
3902 <p>(This is a C-only release as usual. The Rust branch is still
3903 experimental.)</p></content><category term="misc"></category><category term="bzip2"></category></entry><entry><title>Bzip2 in Rust: porting the randomization table</title><link href="https://people.gnome.org/~federico/blog/bzip2-in-rust-randomization-table.html" rel="alternate"></link><published>2019-06-11T14:30:17-05:00</published><updated>2019-06-11T14:30:17-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-06-11:/~federico/blog/bzip2-in-rust-randomization-table.html</id><summary type="html"><p>Here is a straightforward port of some easy code.</p>
3904 <p><a href="https://gitlab.com/federicomenaquintero/bzip2/blob/7bd2dc3c13d01a963ef80ae96727ce247acb77fa/randtable.c#L26"><code>randtable.c</code></a> has a lookup table with seemingly-random
3905 numbers. This table is used by <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/5f8f075a554fc593be827cff5f8bc6ce880b08f7/bzlib_private.h#L131-149">the following macros in
3906 bzlib_private.h</a>:</p>
3907 <div class="highlight"><pre><span></span><code><span class="k">extern</span> <span class="n">Int32</span> <span class="n">BZ2_rNums</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
3908
3909 <span class="cp">#define BZ_RAND_DECLS \</span>
3910 <span class="cp"> Int32 rNToGo; \</span>
3911 <span class="cp"> Int32 rTPos \</span>
3912
3913 <span class="cp">#define BZ_RAND_INIT_MASK \</span>
3914 <span class="cp"> s-&gt;rNToGo = 0; \</span>
3915 <span class="cp"> s-&gt;rTPos = 0 \</span>
3916
3917 <span class="cp">#define BZ_RAND_MASK ((s- …</span></code></pre></div></summary><content type="html"><p>Here is a straightforward port of some easy code.</p>
3918 <p><a href="https://gitlab.com/federicomenaquintero/bzip2/blob/7bd2dc3c13d01a963ef80ae96727ce247acb77fa/randtable.c#L26"><code>randtable.c</code></a> has a lookup table with seemingly-random
3919 numbers. This table is used by <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/5f8f075a554fc593be827cff5f8bc6ce880b08f7/bzlib_private.h#L131-149">the following macros in
3920 bzlib_private.h</a>:</p>
3921 <div class="highlight"><pre><span></span><code><span class="k">extern</span> <span class="n">Int32</span> <span class="n">BZ2_rNums</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
3922
3923 <span class="cp">#define BZ_RAND_DECLS \</span>
3924 <span class="cp"> Int32 rNToGo; \</span>
3925 <span class="cp"> Int32 rTPos \</span>
3926
3927 <span class="cp">#define BZ_RAND_INIT_MASK \</span>
3928 <span class="cp"> s-&gt;rNToGo = 0; \</span>
3929 <span class="cp"> s-&gt;rTPos = 0 \</span>
3930
3931 <span class="cp">#define BZ_RAND_MASK ((s-&gt;rNToGo == 1) ? 1 : 0)</span>
3932
3933 <span class="cp">#define BZ_RAND_UPD_MASK \</span>
3934 <span class="cp"> if (s-&gt;rNToGo == 0) { \</span>
3935 <span class="cp"> s-&gt;rNToGo = BZ2_rNums[s-&gt;rTPos]; \</span>
3936 <span class="cp"> s-&gt;rTPos++; \</span>
3937 <span class="cp"> if (s-&gt;rTPos == 512) s-&gt;rTPos = 0; \</span>
3938 <span class="cp"> } \</span>
3939 <span class="cp"> s-&gt;rNToGo--;</span>
3940 </code></pre></div>
3941
3942 <p>Here, <code>BZ_RAND_DECLS</code> is used to declare two fields, <code>rNToGo</code> and
3943 <code>rTPos</code>, into two structs (<a href="https://gitlab.com/federicomenaquintero/bzip2/blob/5f8f075a554fc593be827cff5f8bc6ce880b08f7/bzlib_private.h#L213">1</a>, <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/5f8f075a554fc593be827cff5f8bc6ce880b08f7/bzlib_private.h#L345">2</a>). Both are similar to this:</p>
3944 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
3945 <span class="p">...</span>
3946 <span class="n">Bool</span> <span class="n">blockRandomised</span><span class="p">;</span>
3947 <span class="n">BZ_RAND_DECLS</span>
3948 <span class="p">...</span>
3949 <span class="p">}</span> <span class="n">DState</span><span class="p">;</span>
3950 </code></pre></div>
3951
3952 <p>Then, the code that needs to initialize those fields calls
3953 <code>BZ_RAND_INIT_MASK</code>, which expands into code to set the two fields to
3954 zero.</p>
3955 <p>At several points in the code, <code>BZ_RAND_UPD_MASK</code> gets called, which
3956 expands into code that updates the randomization state, or something
3957 like that, and uses <code>BZ_RAND_MASK</code> to get a useful value out of the
3958 randomization state.</p>
3959 <p>I have no idea yet what the state is about, but let's port it
3960 directly.</p>
3961 <h2>Give things a name</h2>
3962 <p>It's interesting to see that <strong>no code except for those macros</strong> uses
3963 the fields <code>rNToGo</code> and <code>rTPos</code>, which are declared via
3964 <code>BZ_RAND_DECLS</code>. So, let's make up a <strong>type with a name</strong> for that.
3965 Since I have no better name for it, I shall call it just
3966 <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/7bd2dc3c13d01a963ef80ae96727ce247acb77fa"><code>RandState</code></a>. I added that type definition in the C code,
3967 and replaced the macro-which-creates-struct-fields with a
3968 <code>RandState</code>-typed field:</p>
3969 <div class="highlight"><pre><span></span><code>-#define BZ_RAND_DECLS \
3970 - Int32 rNToGo; \
3971 - Int32 rTPos \
3972 +typedef struct {
3973 + Int32 rNToGo;
3974 + Int32 rTPos;
3975 +} RandState;
3976
3977 ...
3978
3979 - BZ_RAND_DECLS;
3980 + RandState rand;
3981 </code></pre></div>
3982
3983 <p>Since the fields now live inside a sub-struct, I changed the other
3984 macros to use <code>s-&gt;rand.rNToGo</code> instead of <code>s-&gt;rNToGo</code>, and similarly
3985 for the other field.</p>
3986 <h2>Turn macros into functions</h2>
3987 <p>Now, three commits (<a href="https://gitlab.com/federicomenaquintero/bzip2/commit/95d3e979a4ac69d382b67d0d4c97e956002797b4">1</a>, <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/35242daa8f760c68e0b358f4ee38488374f440db">2</a>, <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/5d376c13abb891d311a00dc1ce0533390fc20b51">3</a>) to turn the
3988 macros <code>BZ_RAND_INIT_MASK</code>, <code>BZ_RAND_MASK</code>, and <code>BZ_RAND_UPD_MASK</code>
3989 into functions.</p>
3990 <p>And now that the functions live in the same C source file as the
3991 lookup table they reference, <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/5d5f98ab88526498a20969d565bf2f3831b4467a">the table can be made <code>static const</code></a> to
3992 avoid having it as read/write unshared data in the linked binary.</p>
3993 <p>Premature optimization concern: doesn't de-inlining those macros
3994 cause performance problems? At first, we will get the added overhead
3995 from a function call. When the whole code is ported to Rust, the Rust
3996 compiler will probably be able to figure out that those tiny functions
3997 can be inlined (or we can <code>#[inline]</code> them by hand if we have proof,
3998 or if we have more hubris than faith in LLVM).</p>
3999 <h2>Port functions and table to Rust</h2>
4000 <p>The functions are so tiny, and the table so cut-and-pasteable, that
4001 it's easy to <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/dc01d95a704d02a810072a0aab5fb11e58150f78">port them to Rust</a> in a single shot:</p>
4002 <div class="highlight"><pre><span></span><code><span class="cp">#[no_mangle]</span><span class="w"></span>
4003 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">BZ2_rand_init</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">RandState</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4004 <span class="w"> </span><span class="n">RandState</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4005 <span class="w"> </span><span class="n">rNToGo</span>: <span class="mi">0</span><span class="p">,</span><span class="w"></span>
4006 <span class="w"> </span><span class="n">rTPos</span>: <span class="mi">0</span><span class="p">,</span><span class="w"></span>
4007 <span class="w"> </span><span class="p">}</span><span class="w"></span>
4008 <span class="p">}</span><span class="w"></span>
4009
4010 <span class="cp">#[no_mangle]</span><span class="w"></span>
4011 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">BZ2_rand_mask</span><span class="p">(</span><span class="n">r</span>: <span class="kp">&amp;</span><span class="nc">RandState</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"></span>
4012 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">r</span><span class="p">.</span><span class="n">rNToGo</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4013 <span class="w"> </span><span class="mi">1</span><span class="w"></span>
4014 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4015 <span class="w"> </span><span class="mi">0</span><span class="w"></span>
4016 <span class="w"> </span><span class="p">}</span><span class="w"></span>
4017 <span class="p">}</span><span class="w"></span>
4018
4019 <span class="cp">#[no_mangle]</span><span class="w"></span>
4020 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">BZ2_rand_update_mask</span><span class="p">(</span><span class="n">r</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">RandState</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4021 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">r</span><span class="p">.</span><span class="n">rNToGo</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4022 <span class="w"> </span><span class="n">r</span><span class="p">.</span><span class="n">rNToGo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">RAND_TABLE</span><span class="p">[</span><span class="n">r</span><span class="p">.</span><span class="n">rTPos</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="p">];</span><span class="w"></span>
4023 <span class="w"> </span><span class="n">r</span><span class="p">.</span><span class="n">rTPos</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
4024 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">r</span><span class="p">.</span><span class="n">rTPos</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">512</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4025 <span class="w"> </span><span class="n">r</span><span class="p">.</span><span class="n">rTPos</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
4026 <span class="w"> </span><span class="p">}</span><span class="w"></span>
4027 <span class="w"> </span><span class="p">}</span><span class="w"></span>
4028 <span class="w"> </span><span class="n">r</span><span class="p">.</span><span class="n">rNToGo</span><span class="w"> </span><span class="o">-=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
4029 <span class="p">}</span><span class="w"></span>
4030 </code></pre></div>
4031
4032 <p>Also, we define the <code>RandState</code> type as a Rust struct with a
4033 <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/dc01d95a704d02a810072a0aab5fb11e58150f78/bzlib_rust/src/rand_table.rs#L56-61">C-compatible representation</a>, so it will have the same layout in memory
4034 as the C struct. <strong>This is what allows us to have a <code>RandState</code> in
4035 the C struct</strong>, while in reality the C code doesn't access it
4036 directly; it is just used as a struct field.</p>
4037 <div class="highlight"><pre><span></span><code><span class="c1">// Keep this in sync with bzlib_private.h:</span>
4038 <span class="cp">#[repr(C)]</span><span class="w"></span>
4039 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">RandState</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4040 <span class="w"> </span><span class="n">rNToGo</span>: <span class="kt">i32</span><span class="p">,</span><span class="w"></span>
4041 <span class="w"> </span><span class="n">rTPos</span>: <span class="kt">i32</span><span class="p">,</span><span class="w"></span>
4042 <span class="p">}</span><span class="w"></span>
4043 </code></pre></div>
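<p>As a rough sketch of what "the same layout in memory" buys us, the
check below pins the size and alignment of the struct at compile time.
It assumes that the C side's <code>Int32</code> is a 32-bit integer;
<code>rand_state_layout</code> is a made-up helper name, not something
from the bzip2 tree.</p>

```rust
use std::mem::{align_of, size_of};

// Same shape as the struct above; must stay in sync with bzlib_private.h.
#[allow(non_snake_case, dead_code)]
#[repr(C)]
pub struct RandState {
    rNToGo: i32,
    rTPos: i32,
}

// Compile-time checks: two 32-bit fields, no padding between them.
// (Assumes the C struct holds two 32-bit ints, as Int32 suggests.)
const _: () = assert!(size_of::<RandState>() == 8);
const _: () = assert!(align_of::<RandState>() == align_of::<i32>());

/// Hypothetical helper: report (size, alignment) of the struct in bytes.
pub fn rand_state_layout() -> (usize, usize) {
    (size_of::<RandState>(), align_of::<RandState>())
}
```

<p>If someone later adds a field on one side only, a check like this
fails the build instead of silently corrupting the C struct.</p>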
4044
4045 <p><a href="https://gitlab.com/federicomenaquintero/bzip2/commit/dc01d95a704d02a810072a0aab5fb11e58150f78">See the commit</a> for the corresponding <code>extern</code>
4046 declarations in <code>bzlib_private.h</code>. With those functions and the table
4047 ported to Rust, we can remove <code>randtable.c</code>. Yay!</p>
4048 <h2>A few cleanups</h2>
4049 <p>After moving to another house one throws away useless boxes; we have
4050 to do some cleanup in the Rust code after the initial port, too.</p>
4051 <p>Rust prefers snake_case fields rather than camelCase ones, and I
4052 agree. I <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/d13a9fbc51f5c49b859485d6097ee0dd2b3410f0">renamed the fields</a> to <code>n_to_go</code> and <code>table_pos</code>.</p>
4053 <p>Then, I discovered that the <code>EState</code> struct doesn't actually use the
4054 fields for the randomization state. I just <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/b73c6a507de373dbb0cc8deab0767b41967e678b">removed them</a>.</p>
4055 <h2>Exegesis</h2>
4056 <p>What is that randomization state all about?</p>
4057 <p>And why does <code>DState</code> (the struct used during decompression) need the
4058 randomization state, but <code>EState</code> (used during compression) doesn't
4059 need it?</p>
4060 <p>I found <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/rustify/compress.c#L639-648">this interesting comment</a>:</p>
4061 <div class="highlight"><pre><span></span><code> <span class="cm">/*-- </span>
4062 <span class="cm"> Now a single bit indicating (non-)randomisation. </span>
4063 <span class="cm"> As of version 0.9.5, we use a better sorting algorithm</span>
4064 <span class="cm"> which makes randomisation unnecessary. So always set</span>
4065 <span class="cm"> the randomised bit to &#39;no&#39;. Of course, the decoder</span>
4066 <span class="cm"> still needs to be able to handle randomised blocks</span>
4067 <span class="cm"> so as to maintain backwards compatibility with</span>
4068 <span class="cm"> older versions of bzip2.</span>
4069 <span class="cm"> --*/</span>
4070 <span class="n">bsW</span><span class="p">(</span><span class="n">s</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">);</span>
4071 </code></pre></div>
4072
4073 <p>Okay! So <em>compression</em> no longer uses randomization, but
4074 <em>decompression</em> has to support files which were compressed with
4075 randomization. Here, <code>bsW(s,1,0)</code> always writes a 0 bit to the file.</p>
4076 <p>However, the decompression code <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/b73c6a507de373dbb0cc8deab0767b41967e678b/decompress.c#L251">actually reads the <code>blockRandomised</code>
4077 bit</a> from the file so that it can see whether it is
4078 dealing with an old-format file:</p>
4079 <div class="highlight"><pre><span></span><code><span class="n">GET_BITS</span><span class="p">(</span><span class="n">BZ_X_RANDBIT</span><span class="p">,</span> <span class="n">s</span><span class="o">-&gt;</span><span class="n">blockRandomised</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
4080 </code></pre></div>
4081
4082 <p>Later in the code, this <code>s-&gt;blockRandomised</code> field gets consulted; if
4083 the bit is on, the code calls <code>BZ2_rand_update_mask()</code> and friends as
4084 appropriate. If one is using files compressed with Bzip2 0.9.5 or
4085 later, those randomization functions are not even called.</p>
4086 <p>Talk about preserving compatibility with the past.</p>
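<p>To make the decoder's side concrete, here is a minimal sketch in safe
Rust. It assumes that the mask is XORed into each decoded byte, which is
my reading of the C decoder rather than something shown above. The
three-entry table is a made-up stand-in for the real 512-entry one, and
<code>derandomise</code> is a hypothetical name.</p>

```rust
// Stand-in for the real 512-entry table; these three values are invented.
const RAND_TABLE: [i32; 3] = [3, 2, 4];

#[derive(Default)]
struct RandState {
    n_to_go: i32,
    table_pos: usize,
}

impl RandState {
    // Same logic as BZ2_rand_update_mask(), with the wrap-around
    // written against the stand-in table's length instead of 512.
    fn update_mask(&mut self) {
        if self.n_to_go == 0 {
            self.n_to_go = RAND_TABLE[self.table_pos];
            self.table_pos = (self.table_pos + 1) % RAND_TABLE.len();
        }
        self.n_to_go -= 1;
    }

    // Same logic as BZ2_rand_mask(): 1 when one step remains, else 0.
    fn mask(&self) -> u8 {
        if self.n_to_go == 1 { 1 } else { 0 }
    }
}

/// Hypothetical helper: undo randomisation on a decoded block.
/// Blocks written by bzip2 0.9.5 or later have the bit off and are
/// left untouched.
fn derandomise(block: &mut [u8], block_randomised: bool) {
    if !block_randomised {
        return;
    }
    let mut r = RandState::default();
    for byte in block.iter_mut() {
        r.update_mask();
        *byte ^= r.mask(); // flips the low bit at table-driven intervals
    }
}
```

<p>Since the operation is a plain XOR driven by a state that restarts for
each call, running <code>derandomise</code> twice over the same block
gives back the original bytes.</p>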
4087 <h2>Explanation, or building my headcanon</h2>
4088 <p>Bzip2's compression starts by running a <a href="https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform">Burrows-Wheeler
4089 Transform</a> on a block of data to compress, which is a wonderful
4090 algorithm that I'm trying to fully understand. Part of the BWT
4091 involves sorting all the string rotations of the block in question.</p>
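<p>For intuition, this is a deliberately naive sketch of the transform:
sort every rotation of the block and keep the last column. It is not how
bzip2 does it (the real code uses a much more elaborate sort), and
<code>naive_bwt</code> is a name I made up for illustration.</p>

```rust
/// Naive Burrows-Wheeler Transform: sort all rotations of `block`,
/// then return the last column of the sorted rotations together with
/// the row at which the original block ended up.
fn naive_bwt(block: &[u8]) -> (Vec<u8>, usize) {
    let n = block.len();

    // Each rotation is identified by its starting offset in `block`.
    let mut rotations: Vec<usize> = (0..n).collect();
    rotations.sort_by(|&a, &b| {
        let ra = block[a..].iter().chain(&block[..a]);
        let rb = block[b..].iter().chain(&block[..b]);
        ra.cmp(rb)
    });

    // The last byte of the rotation starting at `start` is the byte
    // just before `start`, wrapping around.
    let last_column: Vec<u8> = rotations
        .iter()
        .map(|&start| block[(start + n - 1) % n])
        .collect();

    // Row of the unrotated block; needed to invert the transform.
    let original_row = rotations.iter().position(|&s| s == 0).unwrap_or(0);

    (last_column, original_row)
}
```

<p>For the input <code>banana</code> this returns <code>nnbaaa</code> and
row 3. Note that each comparison can walk the whole block when the data
is very repetitive, which is the kind of extreme case where a naive sort
performs badly.</p>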
4092 <p>Per <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/rustify/compress.c#L639-648">the comment I cited</a>, really old versions of bzip2 used a
4093 randomization helper to make sorting perform well in extreme cases,
4094 but not-so-old versions fixed this.</p>
4095 <p>This explains why the decompression struct <code>DState</code> has a
4096 <code>blockRandomised</code> bit, but the compression struct <code>EState</code> doesn't
4097 need one. The fields that the original macro was pasting into
4098 <code>EState</code> were just a vestige from 1999, which is when Bzip2 0.9.5 was
4099 released.</p></content><category term="misc"></category><category term="bzip2"></category><category term="rust"></category></entry><entry><title>Bzip2 uses Meson and Autotools now — and a plea for help</title><link href="https://people.gnome.org/~federico/blog/bzip2-uses-meson-and-autotools.html" rel="alternate"></link><published>2019-06-07T11:01:44-05:00</published><updated>2019-06-07T11:01:44-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-06-07:/~federico/blog/bzip2-uses-meson-and-autotools.html</id><summary type="html"><p>There is a lot of activity in the <a href="https://gitlab.com/federicomenaquintero/bzip2/activity">bzip2 repository</a>!</p>
4100 <p>Perhaps the most exciting thing is that Dylan Baker made a merge
4101 request to add <a href="https://gitlab.com/federicomenaquintero/bzip2/merge_requests/6">Meson as a build system for bzip2</a>; this is
4102 merged now into the master branch.</p>
4103 <p>The current status is this:</p>
4104 <ul>
4105 <li>Both Meson and Autotools are …</li></ul></summary><content type="html"><p>There is a lot of activity in the <a href="https://gitlab.com/federicomenaquintero/bzip2/activity">bzip2 repository</a>!</p>
4106 <p>Perhaps the most exciting thing is that Dylan Baker made a merge
4107 request to add <a href="https://gitlab.com/federicomenaquintero/bzip2/merge_requests/6">Meson as a build system for bzip2</a>; this is
4108 merged now into the master branch.</p>
4109 <p>The current status is this:</p>
4110 <ul>
4111 <li>Both Meson and Autotools are supported.</li>
4112 <li>We have CI runs for both build systems.</li>
4113 </ul>
4114 <h2>A plea for help: add CI runners for other platforms!</h2>
4115 <p>Do you use *BSD / Windows / Solaris / etc. and know how to make
4116 GitLab's CI work for them?</p>
4117 <p>The only runners we have now for bzip2 are for well-known Linux
4118 distros. I would really like to keep bzip2 working on non-Linux
4119 platforms. If you know how to make GitLab CI runners for other
4120 systems, <a href="https://gitlab.com/federicomenaquintero/bzip2/merge_requests">please send a merge request</a>!</p>
4121 <h2>Why two build systems?</h2>
4122 <p>Mainly uncertainty on my part. I haven't used Meson extensively;
4123 people tell me that it works better than Autotools out of the box for
4124 Windows.</p>
4125 <p>Bzip2 runs on all sorts of ancient systems, and I don't know whether
4126 Meson or Autotools will be a better fit for them. Time will tell.
4127 Hopefully in the future we can have only a single supported build
4128 system for bzip2.</p></content><category term="misc"></category><category term="bzip2"></category><category term="meson"></category></entry><entry><title>Bzip2 repository reconstructed</title><link href="https://people.gnome.org/~federico/blog/bzip2-repository-reconstructed.html" rel="alternate"></link><published>2019-06-05T19:13:05-05:00</published><updated>2019-06-05T19:13:05-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-06-05:/~federico/blog/bzip2-repository-reconstructed.html</id><summary type="html"><p>I have just done a <code>git push --force-with-lease</code> to <a href="https://gitlab.com/federicomenaquintero/bzip2/commits/master">bzip2's master
4129 branch</a>, which means that if you had a previous clone of this
4130 repository, you'll have to re-fetch it and rebase any changes you may
4131 have on top.</p>
4132 <p>I apologize for the inconvenience!</p>
4133 <p>But I have a good excuse: Julian …</p></summary><content type="html"><p>I have just done a <code>git push --force-with-lease</code> to <a href="https://gitlab.com/federicomenaquintero/bzip2/commits/master">bzip2's master
4134 branch</a>, which means that if you had a previous clone of this
4135 repository, you'll have to re-fetch it and rebase any changes you may
4136 have on top.</p>
4137 <p>I apologize for the inconvenience!</p>
4138 <p>But I have a good excuse: Julian Seward pointed me to a <a href="https://sourceware.org/git/?p=bzip2.git;a=summary">repository
4139 at sourceware</a> where Mark Wielaard reconstructed a commit
4140 history for bzip2, based on the historical tarballs starting from
4141 bzip2-0.1. Bzip2 was never maintained under revision control, so the
4142 reconstructed repository should be used mostly for historical
4143 reference (go look for <code>bzip2.exe</code> in the initial commit!).</p>
4144 <p>I have rebased all the post-1.0.6 commits on top of Mark's repository;
4145 this is what is in the <a href="https://gitlab.com/federicomenaquintero/bzip2/commits/master">master</a> branch now.</p>
4146 <p>There is a new <a href="https://gitlab.com/federicomenaquintero/bzip2/commits/rustify">rustify</a> branch as well, based on master, which is
4147 where I will do the gradual port to Rust.</p>
4148 <p>I foresee no other force-pushes to the master branch in the future.
4149 Apologies again if this disrupts your workflow.</p>
4150 <p><strong>Update:</strong> <a href="https://gitlab.com/federicomenaquintero/bzip2/issues/7">Someone did another reconstruction</a>. If they
4151 weave the histories together, I'll do another force-push, the very
4152 last one, I promise. If you send merge requests, I'll rebase them
4153 myself if that happens.</p></content><category term="misc"></category><category term="bzip2"></category></entry><entry><title>Maintaining bzip2</title><link href="https://people.gnome.org/~federico/blog/maintaining-bzip2.html" rel="alternate"></link><published>2019-06-04T19:41:57-05:00</published><updated>2019-06-04T19:41:57-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-06-04:/~federico/blog/maintaining-bzip2.html</id><summary type="html"><p>Today I had a very pleasant conversation with Julian Seward, of bzip2
4154 and <a href="http://valgrind.org/">Valgrind</a> fame. Julian has kindly agreed
4155 to cede the maintainership of <a href="https://sourceware.org/bzip2/">bzip2</a>
4156 to me.</p>
4157 <p>Bzip2 has not had a release since 2010. In the meantime, Linux
4158 distros have accumulated a number of bug/security fixes for it …</p></summary><content type="html"><p>Today I had a very pleasant conversation with Julian Seward, of bzip2
4159 and <a href="http://valgrind.org/">Valgrind</a> fame. Julian has kindly agreed
4160 to cede the maintainership of <a href="https://sourceware.org/bzip2/">bzip2</a>
4161 to me.</p>
4162 <p>Bzip2 has not had a release since 2010. In the meantime, Linux
4163 distros have accumulated a number of bug/security fixes for it.
4164 Seemingly every distributor of bzip2 patches its build system. The
4165 documentation generation step is a bit creaky. There is no source
4166 control repository, nor bug tracker. I hope to fix these things
4167 gradually.</p>
4168 <p>This is the new <a href="https://gitlab.com/federicomenaquintero/bzip2">repository for bzip2</a>.</p>
4169 <p>Ways in which you can immediately help by submitting merge requests:</p>
4170 <ul>
4171 <li>
4172 <p>Look at the <a href="https://gitlab.com/federicomenaquintero/bzip2/issues">issues</a>; currently they are around auto-generating the
4173 version number.</p>
4174 </li>
4175 <li>
4176 <p>Create a basic <a href="https://gitlab.com/help/ci/README.md">continuous integration</a> pipeline that at least
4177 builds the code and runs the tests.</p>
4178 </li>
4179 <li>
4180 <p>Test the autotools setup, courtesy of Stanislav Brabec, and improve
4181 it as you see fit.</p>
4182 </li>
4183 </ul>
4184 <p>The <a href="https://people.gnome.org/~federico/blog/bzip2-in-rust-basic-infra.html">rustification</a> will happen in a separate branch for now, at least
4185 until the Autotools setup settles down.</p>
4186 <p>I hope to have a 1.0.7 release soon, but this really needs <em>your</em>
4187 help. Let's revive this awesome little project.</p></content><category term="misc"></category><category term="bzip2"></category></entry><entry><title>Bzip2 in Rust - Basic infrastructure and CRC32 computation</title><link href="https://people.gnome.org/~federico/blog/bzip2-in-rust-basic-infra.html" rel="alternate"></link><published>2019-05-30T10:36:19-05:00</published><updated>2019-05-30T10:36:19-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-05-30:/~federico/blog/bzip2-in-rust-basic-infra.html</id><summary type="html"><p>I have started a little experiment in porting bits of the widely-used
4188 <a href="https://sourceware.org/bzip2/">bzip2/bzlib</a> to Rust. I hope this can
4189 serve to refresh bzip2, which had its last release in 2010 and has
4190 been nominally unmaintained for years.</p>
4191 <p>I hope to make several posts detailing how this port is done …</p></summary><content type="html"><p>I have started a little experiment in porting bits of the widely-used
4192 <a href="https://sourceware.org/bzip2/">bzip2/bzlib</a> to Rust. I hope this can
4193 serve to refresh bzip2, which had its last release in 2010 and has
4194 been nominally unmaintained for years.</p>
4195 <p>I hope to make several posts detailing how this port is done. In this
4196 post, I'll talk about setting up a Rust infrastructure for bzip2 and
4197 my experiments in replacing the C code that does CRC32 computations.</p>
4198 <h2>Super-quick summary of how librsvg was ported to Rust</h2>
4199 <ul>
4200 <li>
4201 <p>Add the necessary autotools infrastructure to build a Rust
4202 sub-library that gets linked into the main public library.</p>
4203 </li>
4204 <li>
4205 <p>Port bit by bit to Rust. Add unit tests as appropriate. Refactor
4206 endlessly.</p>
4207 </li>
4208 <li>
4209 <p><strong>MAINTAIN THE PUBLIC API/ABI AT ALL COSTS</strong> so callers don't
4210 notice that the library is being rewritten under their feet.</p>
4211 </li>
4212 </ul>
4213 <p>I have no idea how bzip2 works internally, but I do know how to
4214 maintain ABIs, so let's get started.</p>
4215 <h2>Bzip2's source tree</h2>
4216 <p>As a very small project that just builds a library and a couple of
4217 executables, bzip2 was structured with all the source files directly
4218 under a toplevel directory.</p>
4219 <p>The only tests in there are three reference files that get compressed,
4220 then uncompressed, and then compared to the original ones.</p>
4221 <p>As the rustification proceeds, I'll move the files around to better
4222 places. The scheme from librsvg worked well in this respect, so I'll
4223 probably be copying many of the techniques and organization from
4224 there.</p>
4225 <h2>Deciding what to port first</h2>
4226 <p>I looked a bit at the bzip2 sources, and the code to do CRC32
4227 computations seemed isolated enough from the rest of the code to port
4228 easily.</p>
4229 <p>The CRC32 code was arranged like this. First, a lookup table in
4230 <code>crc32table.c</code>:</p>
4231 <div class="highlight"><pre><span></span><code><span class="n">UInt32</span> <span class="n">BZ2_crc32Table</span><span class="p">[</span><span class="mi">256</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span>
4232 <span class="mh">0x00000000L</span><span class="p">,</span> <span class="mh">0x04c11db7L</span><span class="p">,</span> <span class="mh">0x09823b6eL</span><span class="p">,</span> <span class="mh">0x0d4326d9L</span><span class="p">,</span>
4233 <span class="mh">0x130476dcL</span><span class="p">,</span> <span class="mh">0x17c56b6bL</span><span class="p">,</span> <span class="mh">0x1a864db2L</span><span class="p">,</span> <span class="mh">0x1e475005L</span><span class="p">,</span>
4234 <span class="p">...</span>
4235 <span class="p">};</span>
4236 </code></pre></div>
4237
4238 <p>And then, three macros in <code>bzlib_private.h</code> which make up all the
4239 CRC32 code in the library:</p>
4240 <div class="highlight"><pre><span></span><code><span class="k">extern</span> <span class="n">UInt32</span> <span class="n">BZ2_crc32Table</span><span class="p">[</span><span class="mi">256</span><span class="p">];</span>
4241
4242 <span class="cp">#define BZ_INITIALISE_CRC(crcVar) \</span>
4243 <span class="cp">{ \</span>
4244 <span class="cp"> crcVar = 0xffffffffL; \</span>
4245 <span class="cp">}</span>
4246
4247 <span class="cp">#define BZ_FINALISE_CRC(crcVar) \</span>
4248 <span class="cp">{ \</span>
4249 <span class="cp"> crcVar = ~(crcVar); \</span>
4250 <span class="cp">}</span>
4251
4252 <span class="cp">#define BZ_UPDATE_CRC(crcVar,cha) \</span>
4253 <span class="cp">{ \</span>
4254 <span class="cp"> crcVar = (crcVar &lt;&lt; 8) ^ \</span>
4255 <span class="cp"> BZ2_crc32Table[(crcVar &gt;&gt; 24) ^ \</span>
4256 <span class="cp"> ((UChar)cha)]; \</span>
4257 <span class="cp">}</span>
4258 </code></pre></div>
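<p>Read together, the table and the three macros describe an MSB-first, table-driven CRC32 whose state starts at <code>0xffffffff</code> and is inverted at the end. The following is a minimal sketch of that computation in Rust, not the code from the port: the function names are hypothetical, and the table is generated from the polynomial <code>0x04c11db7</code> (assumed from the second entry of the table above) instead of being pasted in.</p>

```rust
/// Polynomial assumed from the second entry of BZ2_crc32Table.
const POLY: u32 = 0x04c1_1db7;

/// Build the 256-entry lookup table at compile time (MSB-first, no bit reflection).
const fn make_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut i = 0;
    while i < 256 {
        // Start with the byte value in the top 8 bits, then shift it out bit by bit.
        let mut r = (i as u32) << 24;
        let mut bit = 0;
        while bit < 8 {
            r = if r & 0x8000_0000 != 0 { (r << 1) ^ POLY } else { r << 1 };
            bit += 1;
        }
        table[i] = r;
        i += 1;
    }
    table
}

static TABLE: [u32; 256] = make_table();

/// Counterpart of BZ_INITIALISE_CRC: the starting state.
pub fn crc32_init() -> u32 {
    0xffff_ffff
}

/// Counterpart of BZ_UPDATE_CRC: fold one byte into the running state.
pub fn crc32_update(crc: u32, byte: u8) -> u32 {
    (crc << 8) ^ TABLE[((crc >> 24) ^ u32::from(byte)) as usize]
}

/// Counterpart of BZ_FINALISE_CRC: invert the state to get the final value.
pub fn crc32_finalise(crc: u32) -> u32 {
    !crc
}

/// Hypothetical helper: CRC32 of a whole buffer, one byte at a time.
pub fn crc32_buffer(buf: &[u8]) -> u32 {
    let state = buf.iter().fold(crc32_init(), |crc, &b| crc32_update(crc, b));
    crc32_finalise(state)
}
```

<p>Generating the table in a <code>const fn</code> only keeps the sketch short; pasting the literal table works just as well. An empty buffer yields <code>0</code> here, since the initial state is inverted straight back.</p>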
4259
4260 <p>Initially I wanted to just remove this code and replace it with one of
4261 the existing Rust crates to do CRC32 computations, but first I needed
4262 to know which variant of CRC32 this is.</p>
4263 <h2>Preparing the CRC32 port so it will not break</h2>
4264 <p>I needed to set up tests for the CRC32 code so the replacement code
4265 would compute exactly the same values as the original:</p>
4266 <ul>
4267 <li><a href="https://gitlab.com/federicomenaquintero/bzip2/commit/bd79e7adf4274eef376111404a40ef3bd6836f06">Rename crc32table.c to
4268 crc32.c</a> -
4269 that file is going to hold all the CRC32 code, not only the lookup table.</li>
4270 <li><a href="https://gitlab.com/federicomenaquintero/bzip2/commit/fb284d38c6f45382c5f6acde69631caed6589f27">Turn the CRC32 macros into
4271 functions</a> -
4272 so I can move them to Rust and have the C code call them.</li>
4273 </ul>
4274 <p>Then I needed a test that computed the CRC32 values of several
4275 strings, so I could capture the results and make them part of the
4276 test.</p>
4277 <div class="highlight"><pre><span></span><code><span class="k">static</span> <span class="k">const</span> <span class="n">UChar</span> <span class="n">buf1</span><span class="p">[]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span><span class="p">;</span>
4278 <span class="k">static</span> <span class="k">const</span> <span class="n">UChar</span> <span class="n">buf2</span><span class="p">[]</span> <span class="o">=</span> <span class="s">&quot; &quot;</span><span class="p">;</span>
4279 <span class="k">static</span> <span class="k">const</span> <span class="n">UChar</span> <span class="n">buf3</span><span class="p">[]</span> <span class="o">=</span> <span class="s">&quot;hello world&quot;</span><span class="p">;</span>
4280 <span class="k">static</span> <span class="k">const</span> <span class="n">UChar</span> <span class="n">buf4</span><span class="p">[]</span> <span class="o">=</span> <span class="s">&quot;Lorem ipsum dolor sit amet, consectetur adipiscing elit, &quot;</span><span class="p">;</span>
4281
4282 <span class="kt">int</span>
4283 <span class="nf">main</span> <span class="p">(</span><span class="kt">void</span><span class="p">)</span>
4284 <span class="p">{</span>
4285 <span class="n">printf</span> <span class="p">(</span><span class="s">&quot;buf1: %x</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">crc32_buffer</span><span class="p">(</span><span class="n">buf1</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">buf1</span><span class="p">)));</span>
4286 <span class="n">printf</span> <span class="p">(</span><span class="s">&quot;buf2: %x</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">crc32_buffer</span><span class="p">(</span><span class="n">buf2</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">buf2</span><span class="p">)));</span>
4287 <span class="n">printf</span> <span class="p">(</span><span class="s">&quot;buf3: %x</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">crc32_buffer</span><span class="p">(</span><span class="n">buf3</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">buf3</span><span class="p">)));</span>
4288 <span class="n">printf</span> <span class="p">(</span><span class="s">&quot;buf4: %x</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">crc32_buffer</span><span class="p">(</span><span class="n">buf4</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">buf4</span><span class="p">)));</span>
4289 <span class="c1">// ...</span>
4290 <span class="p">}</span>
4291 </code></pre></div>
4292
4293 <p>This computes the CRC32 values of some strings using the original
4294 algorithm, and prints their results. Then I could cut&amp;paste those
4295 results, and turn the <code>printf</code> into <code>assert</code> — and that gives me a
4296 test.</p>
4297 <div class="highlight"><pre><span></span><code><span class="kt">int</span>
4298 <span class="nf">main</span> <span class="p">(</span><span class="kt">void</span><span class="p">)</span>
4299 <span class="p">{</span>
4300 <span class="n">assert</span> <span class="p">(</span><span class="n">crc32_buffer</span> <span class="p">(</span><span class="n">buf1</span><span class="p">,</span> <span class="n">strlen</span> <span class="p">(</span><span class="n">buf1</span><span class="p">))</span> <span class="o">==</span> <span class="mh">0x00000000</span><span class="p">);</span>
4301 <span class="n">assert</span> <span class="p">(</span><span class="n">crc32_buffer</span> <span class="p">(</span><span class="n">buf2</span><span class="p">,</span> <span class="n">strlen</span> <span class="p">(</span><span class="n">buf2</span><span class="p">))</span> <span class="o">==</span> <span class="mh">0x29d4f6ab</span><span class="p">);</span>
4302 <span class="n">assert</span> <span class="p">(</span><span class="n">crc32_buffer</span> <span class="p">(</span><span class="n">buf3</span><span class="p">,</span> <span class="n">strlen</span> <span class="p">(</span><span class="n">buf3</span><span class="p">))</span> <span class="o">==</span> <span class="mh">0x44f71378</span><span class="p">);</span>
4303 <span class="n">assert</span> <span class="p">(</span><span class="n">crc32_buffer</span> <span class="p">(</span><span class="n">buf4</span><span class="p">,</span> <span class="n">strlen</span> <span class="p">(</span><span class="n">buf4</span><span class="p">))</span> <span class="o">==</span> <span class="mh">0xd31de6c9</span><span class="p">);</span>
4304 <span class="c1">// ...</span>
4305 <span class="p">}</span>
4306 </code></pre></div>
4307
4308 <h2>Setting up a Rust infrastructure for bzip2</h2>
4309 <p>Two things made this reasonably easy:</p>
4310 <ul>
4311 <li>A patch from Stanislav Brabec, from SUSE, to <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/72ece1146d6de1d2e6d2bb2cc96a1c3c0f307d7b">add an autotools
4312 framework to bzip2</a></li>
4313 <li>The existing <a href="https://people.gnome.org/~federico/blog/librsvg-build-infrastructure.html">Autotools + Rust machinery in librsvg</a>.</li>
4314 </ul>
4315 <p>I.e. "copy and paste from somewhere that I know works well".
4316 Wonderful!</p>
4317 <p>This is the <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/dfa0b88df8b518c4b76d0f3660b6487c67e93d52">commit that adds a Rust infrastructure for
4318 bzip2</a>. It does the following:</p>
4319 <ol>
4320 <li>Create a Cargo workspace (a <code>Cargo.toml</code> in the toplevel) with a
4321 single member, a <code>bzlib_rust</code> directory where the Rustified parts
4322 of the code will live.</li>
4323 <li>Create <code>bzlib_rust/Cargo.toml</code> and <code>bzlib_rust/src</code> for the Rust
4324 sources. This generates a <code>staticlib</code>, <code>libbzlib_rust.a</code>, which
4325 can be linked into the main <code>libbz2.la</code>.</li>
4326 <li>Put in automake hooks so that <code>make clean</code>, <code>make check</code>, etc. all
4327 do what you expect for the Rust part.</li>
4328 </ol>
4329 <p>As a side benefit, librsvg's Autotools+Rust infrastructure already
4330 handled things like cross-compilation correctly, so I have high hopes
4331 that this will be good enough for bzip2.</p>
4332 <h2>Can I use a Rust crate for CRC32?</h2>
4333 <p>There are <a href="https://crates.io/search?q=crc&amp;sort=downloads">many Rust crates to do CRC computations</a>. I was
4334 hoping especially to be able to use <a href="https://github.com/srijs/rust-crc32fast">crc32fast</a>, which is
4335 SIMD-accelerated.</p>
4336 <p>I wrote a Rust version of the "CRC me a buffer" test from above to see
4337 if crc32fast produced the same values as the C code, and of course it
4338 didn't. Eventually, after <a href="https://mstdn.mx/@federicomena/102170914494787056">asking on Mastodon</a>, Kepstin <a href="https://glitch.social/@kepstin/102171203800281388">figured
4339 out</a> what variant of CRC32 is being used in the original
4340 code.</p>
4341 <p>It turns out that this is directly doable in Rust with the <a href="https://github.com/mrhooray/crc-rs">git version
4342 of the <code>crc</code> crate</a>. This crate lets one configure the CRC32
4343 polynomial and the mode of computation; there are <a href="https://github.com/Michaelangel007/crc32">many variants of
4344 CRC32</a> and I wasn't fully aware of them.</p>
4345 <p>The magic incantation is this:</p>
4346 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">digest</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">crc32</span>::<span class="n">Digest</span>::<span class="n">new_custom</span><span class="p">(</span><span class="n">crc32</span>::<span class="n">IEEE</span><span class="p">,</span><span class="w"> </span><span class="o">!</span><span class="mi">0</span><span class="k">u32</span><span class="p">,</span><span class="w"> </span><span class="o">!</span><span class="mi">0</span><span class="k">u32</span><span class="p">,</span><span class="w"> </span><span class="n">crc</span>::<span class="n">CalcType</span>::<span class="n">Normal</span><span class="p">);</span><span class="w"></span>
4347 </code></pre></div>
4348
4349 <p>With that, <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/6ef10026a74af85437a4deca6957889d9e8bdbf8">the Rust
4350 test</a>
4351 produces the same values as the C code. Yay!</p>
4352 <h2>But it can't be that easy</h2>
4353 <p>Bzlib stores its internal state in the <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/60be65f9/bzlib_private.h#L182-252">EState
4354 struct</a>,
4355 defined in
4356 <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/60be65f9/bzlib_private.h">bzlib_private.h</a>.</p>
4357 <p>That struct stores several running CRC32 computations, and the state
4358 for each one of those is a single <code>UInt32</code> value. However, I cannot
4359 just replace those struct fields with something that comes from Rust,
4360 since the C code does not know the size of a <code>crc32::Digest</code> from
4361 Rust.</p>
4362 <p>The normal way to do this (say, like in librsvg) would be to turn
4363 <code>UInt32 some_crc</code> into <code>void *some_crc</code> and heap-allocate that on the
4364 Rust side, with whatever size it needs.</p>
4365 <p><strong>However!</strong></p>
4366 <p>It turns out that bzlib lets the caller <a href="https://gitlab.com/federicomenaquintero/bzip2/blob/60be65f9/bzlib.h#L62-64">define a custom
4367 allocator</a>
4368 so that bzlib doesn't have to use plain <code>malloc()</code>.</p>
4369 <p>Rust lets one define a <a href="https://doc.rust-lang.org/alloc/alloc/trait.GlobalAlloc.html">global, custom allocator</a>.
4370 However, bzlib's concept of a custom allocator includes a bit of
4371 context:</p>
4372 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
4373 <span class="c1">// ...</span>
4374
4375 <span class="kt">void</span> <span class="o">*</span><span class="p">(</span><span class="o">*</span><span class="n">bzalloc</span><span class="p">)(</span><span class="kt">void</span> <span class="o">*</span><span class="n">opaque</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">,</span> <span class="kt">int</span> <span class="n">m</span><span class="p">);</span>
4376 <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">bzfree</span><span class="p">)(</span><span class="kt">void</span> <span class="o">*</span><span class="n">opaque</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">ptr</span><span class="p">);</span>
4377 <span class="kt">void</span> <span class="o">*</span><span class="n">opaque</span><span class="p">;</span>
4378 <span class="p">}</span> <span class="n">bz_stream</span><span class="p">;</span>
4379 </code></pre></div>
4380
4381 <p>The caller sets up <code>bzalloc/bzfree</code> callbacks and an optional <code>opaque</code>
4382 context for the allocator. However, Rust's <code>GlobalAlloc</code> is set up at
4383 compilation time, and we can't pass that context in a good,
4384 thread-safe fashion to it.</p>
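<p>To make the mismatch concrete, here is a rough sketch of what allocating through the caller's callbacks looks like from Rust. Everything in it is an assumption for illustration: <code>CallerAllocator</code>, <code>alloc_crc_state</code>, <code>free_crc_state</code> and the <code>Arena</code> context are hypothetical names, and the sketch assumes the callbacks are never <code>NULL</code>. The point is that <code>opaque</code> travels with every call, which is exactly the piece a process-wide <code>GlobalAlloc</code> has no slot for.</p>

```rust
use std::mem;
use std::os::raw::{c_int, c_void};
use std::ptr;

/// Mirror of the three allocator fields of `bz_stream` (hypothetical name).
/// In C the callbacks may be NULL; this sketch assumes they are always set.
#[repr(C)]
pub struct CallerAllocator {
    pub bzalloc: unsafe extern "C" fn(opaque: *mut c_void, n: c_int, m: c_int) -> *mut c_void,
    pub bzfree: unsafe extern "C" fn(opaque: *mut c_void, ptr: *mut c_void),
    pub opaque: *mut c_void,
}

/// Allocate one `u32` of CRC state through the caller's callbacks.
/// Assumes the callback returns memory aligned for a `u32`, as malloc() would.
pub unsafe fn alloc_crc_state(a: &CallerAllocator, initial: u32) -> *mut u32 {
    let size = mem::size_of::<u32>() as c_int;
    let p = unsafe { (a.bzalloc)(a.opaque, 1, size) } as *mut u32;
    if !p.is_null() {
        unsafe { ptr::write(p, initial) };
    }
    p
}

/// Give the state back through the same callbacks and the same context.
pub unsafe fn free_crc_state(a: &CallerAllocator, p: *mut u32) {
    if !p.is_null() {
        unsafe { (a.bzfree)(a.opaque, p as *mut c_void) };
    }
}

/// Example context: a tiny bump arena that counts live allocations.
pub struct Arena {
    pub storage: [u64; 16],
    pub used_words: usize,
    pub live: usize,
}

impl Arena {
    pub fn new() -> Arena {
        Arena { storage: [0; 16], used_words: 0, live: 0 }
    }

    /// The pointer a caller would store in `opaque`.
    pub fn as_opaque(&mut self) -> *mut c_void {
        self as *mut Arena as *mut c_void
    }
}

/// Callback with the `bzalloc` shape: carve `n * m` bytes out of the arena.
pub unsafe extern "C" fn arena_alloc(opaque: *mut c_void, n: c_int, m: c_int) -> *mut c_void {
    if opaque.is_null() || n <= 0 || m <= 0 {
        return ptr::null_mut();
    }
    let arena = unsafe { &mut *(opaque as *mut Arena) };
    // Round up to whole 8-byte words (may over-allocate by one word).
    let words = (n as usize).saturating_mul(m as usize) / 8 + 1;
    if arena.used_words + words > arena.storage.len() {
        return ptr::null_mut();
    }
    let p = unsafe { arena.storage.as_mut_ptr().add(arena.used_words) };
    arena.used_words += words;
    arena.live += 1;
    p as *mut c_void
}

/// Callback with the `bzfree` shape: a bump arena only tracks the count.
pub unsafe extern "C" fn arena_free(opaque: *mut c_void, _ptr: *mut c_void) {
    if opaque.is_null() {
        return;
    }
    let arena = unsafe { &mut *(opaque as *mut Arena) };
    arena.live = arena.live.saturating_sub(1);
}
```

<p>Two streams can each carry a different <code>Arena</code> in <code>opaque</code> and never touch each other's memory, which is why the context has to be passed per call instead of being fixed when the program is compiled.</p>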
4385 <h2>Who uses the bzlib custom allocator, anyway?</h2>
4386 <p>If one sets <code>bzalloc/bzfree</code> to <code>NULL</code>, bzlib will use the system's
4387 plain <code>malloc()/free()</code> by default. Most software does this.</p>
4388 <p>I am looking in <a href="https://codesearch.debian.net/search?q=bzalloc">Debian's codesearch</a> for where <code>bzalloc</code>
4389 gets set, hoping that I can figure out if that software really needs a
4390 custom allocator, or if they are just dressing up <code>malloc()</code> with
4391 logging code or similar (ImageMagick seems to do this; Python seems to
4392 have a genuine concern about the Global Interpreter Lock). Debian's
4393 codesearch is a fantastic tool!</p>
4394 <h2>The first rustified code</h2>
4395 <p>I <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/2fb746e887d0fff705f5c82c961407b8539d96a2">cut&amp;pasted the CRC32 lookup
4396 table</a>
4397 and fixed it up for Rust's syntax, and also ported the CRC32
4398 computation functions. I gave them the same names as the original C
4399 ones, and exported them, e.g.</p>
4400 <div class="highlight"><pre><span></span><code><span class="k">const</span><span class="w"> </span><span class="n">TABLE</span>: <span class="p">[</span><span class="kt">u32</span><span class="p">;</span><span class="w"> </span><span class="mi">256</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="w"></span>
4401 <span class="w"> </span><span class="mh">0x00000000</span><span class="p">,</span><span class="w"> </span><span class="mh">0x04c11db7</span><span class="p">,</span><span class="w"> </span><span class="mh">0x09823b6e</span><span class="p">,</span><span class="w"> </span><span class="mh">0x0d4326d9</span><span class="p">,</span><span class="w"></span>
4402 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
4403 <span class="p">];</span><span class="w"></span>
4404
4405 <span class="cp">#[no_mangle]</span><span class="w"></span>
4406 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">BZ2_update_crc</span><span class="p">(</span><span class="n">crc_var</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="kt">u32</span><span class="p">,</span><span class="w"> </span><span class="n">cha</span>: <span class="kt">u8</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4407 <span class="w"> </span><span class="o">*</span><span class="n">crc_var</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="o">*</span><span class="n">crc_var</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="mi">8</span><span class="p">)</span><span class="w"> </span><span class="o">^</span><span class="w"> </span><span class="n">TABLE</span><span class="p">[((</span><span class="o">*</span><span class="n">crc_var</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">24</span><span class="p">)</span><span class="w"> </span><span class="o">^</span><span class="w"> </span><span class="kt">u32</span>::<span class="n">from</span><span class="p">(</span><span class="n">cha</span><span class="p">))</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="p">];</span><span class="w"></span>
4408 <span class="p">}</span><span class="w"></span>
4409 </code></pre></div>
4410
4411 <p>This is a straight port of the C code. Rust is very strict about
4412 integer sizes, and arrays can only be indexed with a <code>usize</code>, not any
4413 random integer — hence the explicit conversions.</p>
4414 <p>And with this, <a href="https://gitlab.com/federicomenaquintero/bzip2/commit/c4c071059f98cfc2e4bb528bde499ba2d41a8b24">and after fixing the
4415 linkage</a>,
4416 the tests pass!</p>
4417 <p>First pass at rustifying CRC32: <strong>done</strong>.</p>
4418 <h2>But that does one byte at a time</h2>
4419 <p>Indeed; the original C code to do CRC32 only handled one byte at a
4420 time. If I replace this with a SIMD-enabled Rust crate, it will want
4421 to process whole buffers at once. I hope the code in bzlib can be
4422 refactored to do that. We'll see!</p>
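<p>One possible shape for such a whole-buffer entry point, as a sketch only: the state stays a plain <code>u32</code> owned by the C side, and the Rust function walks a slice. The name <code>update_crc_buffer</code> and its signature are hypothetical, and the per-byte step is written bitwise so the example does not need the table; it is not SIMD-accelerated.</p>

```rust
use std::slice;

/// Polynomial assumed from the second entry of the lookup table.
const POLY: u32 = 0x04c1_1db7;

/// Bitwise equivalent of one table-driven step: fold `byte` into `crc`, MSB first.
fn update_byte(crc: u32, byte: u8) -> u32 {
    let mut crc = crc ^ (u32::from(byte) << 24);
    for _ in 0..8 {
        crc = if crc & 0x8000_0000 != 0 { (crc << 1) ^ POLY } else { crc << 1 };
    }
    crc
}

/// Hypothetical whole-buffer entry point with a C calling convention.
/// (A real export would also need the no_mangle attribute.)
///
/// # Safety
/// `crc_var` must point to a valid `u32`, and `buf` must point to `len`
/// readable bytes.
pub unsafe extern "C" fn update_crc_buffer(crc_var: *mut u32, buf: *const u8, len: usize) {
    if crc_var.is_null() || buf.is_null() || len == 0 {
        return;
    }
    // View the C buffer as a slice for the duration of this call.
    let bytes = unsafe { slice::from_raw_parts(buf, len) };
    let start = unsafe { *crc_var };
    let end = bytes.iter().fold(start, |crc, &b| update_byte(crc, b));
    unsafe { *crc_var = end };
}
```

<p>Because the running state is still a single <code>u32</code>, feeding a buffer in two pieces gives the same state as feeding it in one, so callers can switch from per-byte to per-buffer updates incrementally.</p>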
4423 <h2>How to use an existing Rust crate for this</h2>
4424 <p>I just found out that one does not in fact need to use a complete
4425 <code>crc32::Digest</code> to do equivalent computations; one can call
4426 <a href="https://docs.rs/crc/1.8.1/crc/crc32/fn.update.html">crc32::update()</a>
4427 by hand and maintain a single <code>u32</code> state, just like the original
4428 <code>UInt32</code> from the C code.</p>
4429 <p>So, I may not need to mess around with a custom allocator just yet.
4430 Stay tuned.</p>
4431 <p>In the meantime, I've <a href="https://github.com/srijs/rust-crc32fast/issues/9">filed a bug against
4432 crc32fast</a> to make
4433 it possible to use a custom polynomial and order and still get the
4434 benefits of SIMD.</p></content><category term="misc"></category><category term="rust"></category><category term="bzip2"></category></entry><entry><title>Containing mutability in GObjects</title><link href="https://people.gnome.org/~federico/blog/containing-mutability-in-gobjects.html" rel="alternate"></link><published>2019-04-16T17:04:33-05:00</published><updated>2019-04-16T17:04:40-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-04-16:/~federico/blog/containing-mutability-in-gobjects.html</id><summary type="html"><p>Traditionally, GObject implementations in C are mutable: you
4435 instantiate a GObject and then change its state via method calls.
4436 Sometimes this is expected and desired; a <code>GtkCheckButton</code> widget
4437 certainly can change its internal state from pressed to not pressed,
4438 for example.</p>
4439 <p>Other times, objects are mutable while they are being …</p></summary><content type="html"><p>Traditionally, GObject implementations in C are mutable: you
4440 instantiate a GObject and then change its state via method calls.
4441 Sometimes this is expected and desired; a <code>GtkCheckButton</code> widget
4442 certainly can change its internal state from pressed to not pressed,
4443 for example.</p>
4444 <p>Other times, objects are mutable while they are being "assembled" or
4445 "configured", and only yield a final immutable result later.
4446 This is the case for <code>RsvgHandle</code> from librsvg.</p>
4447 <p>Please bear with me while I write about the history of the
4448 <code>RsvgHandle</code> API and why it ended up with different ways of doing the
4449 same thing.</p>
4450 <h2>The traditional RsvgHandle API</h2>
4451 <p>The final purpose of an <code>RsvgHandle</code> is to represent an SVG document
4452 loaded in memory. Once it is loaded, the SVG document does not
4453 change, as librsvg does not support animation or creating/removing SVG
4454 elements; it is a static renderer.</p>
4455 <p>However, before an <code>RsvgHandle</code> achieves its immutable state, it has
4456 to be loaded first. Loading can be done in two ways:</p>
4457 <ul>
4458 <li>The historical/deprecated way, using the <a href="https://developer.gnome.org/rsvg/unstable/rsvg-RsvgHandle.html#rsvg-handle-write"><code>rsvg_handle_write()</code></a> and
4459 <code>rsvg_handle_close()</code> APIs. Plenty of code in GNOME used this
4460 <code>write/close</code> idiom before GLib got a good abstraction for
4461 streams; you can see another example in <a href="https://developer.gnome.org/gdk-pixbuf/unstable/GdkPixbufLoader.html"><code>GdkPixbufLoader</code></a>.
4462 The idea is that applications do this:</li>
4463 </ul>
4464 <div class="highlight"><pre><span></span><code><span class="n">file</span> <span class="o">=</span> <span class="n">open</span> <span class="n">a</span> <span class="n">file</span><span class="o">...</span><span class="p">;</span>
4465 <span class="n">handle</span> <span class="o">=</span> <span class="n">rsvg_handle_new</span> <span class="p">();</span>
4466
4467 <span class="k">while</span> <span class="p">(</span><span class="n">file</span> <span class="n">has</span> <span class="n">more</span> <span class="n">data</span><span class="p">)</span> <span class="p">{</span>
4468 <span class="n">rsvg_handle_write</span><span class="p">(</span><span class="n">handle</span><span class="p">,</span> <span class="n">a</span> <span class="n">bit</span> <span class="n">of</span> <span class="n">data</span><span class="p">);</span>
4469 <span class="p">}</span>
4470
4471 <span class="n">rsvg_handle_close</span> <span class="p">(</span><span class="n">handle</span><span class="p">);</span>
4472
4473 <span class="o">//</span> <span class="n">now</span> <span class="n">the</span> <span class="n">handle</span> <span class="k">is</span> <span class="n">fully</span> <span class="n">loaded</span> <span class="ow">and</span> <span class="n">immutable</span>
4474
4475 <span class="n">rsvg_handle_render</span> <span class="p">(</span><span class="n">handle</span><span class="p">,</span> <span class="o">...</span><span class="p">);</span>
4476 </code></pre></div>
4477
4478 <ul>
4479 <li>The streaming way, with <a href="https://developer.gnome.org/rsvg/unstable/rsvg-Using-RSVG-with-GIO.html#rsvg-handle-read-stream-sync"><code>rsvg_handle_read_stream_sync()</code></a>,
4480 which takes a <a href="https://developer.gnome.org/gio/unstable/GInputStream.html"><code>GInputStream</code></a>, or one of the convenience functions
4481 which take a <a href="https://developer.gnome.org/gio/unstable/GFile.html#GFile-struct"><code>GFile</code></a> and produce a stream from it.</li>
4482 </ul>
4483 <div class="highlight"><pre><span></span><code><span class="n">file</span> <span class="o">=</span> <span class="n">g_file_new_for_path</span> <span class="p">(</span><span class="s2">&quot;/foo/bar.svg&quot;</span><span class="p">);</span>
4484 <span class="n">stream</span> <span class="o">=</span> <span class="n">g_file_read</span> <span class="p">(</span><span class="n">file</span><span class="p">,</span> <span class="o">...</span><span class="p">);</span>
4485 <span class="n">handle</span> <span class="o">=</span> <span class="n">rsvg_handle_new</span> <span class="p">();</span>
4486
4487 <span class="n">rsvg_handle_read_stream_sync</span> <span class="p">(</span><span class="n">handle</span><span class="p">,</span> <span class="n">stream</span><span class="p">,</span> <span class="o">...</span><span class="p">);</span>
4488
4489 <span class="o">//</span> <span class="n">now</span> <span class="n">the</span> <span class="n">handle</span> <span class="k">is</span> <span class="n">fully</span> <span class="n">loaded</span> <span class="ow">and</span> <span class="n">immutable</span>
4490
4491 <span class="n">rsvg_handle_render</span> <span class="p">(</span><span class="n">handle</span><span class="p">,</span> <span class="o">...</span><span class="p">);</span>
4492 </code></pre></div>
4493
4494 <h2>A bit of history</h2>
4495 <p>Let's consider a few of <code>RsvgHandle</code>'s functions.</p>
4496 <p><strong>Constructors:</strong></p>
4497 <ul>
4498 <li><code>rsvg_handle_new()</code></li>
4499 <li><code>rsvg_handle_new_with_flags()</code></li>
4500 </ul>
4501 <p><strong>Configure the handle for loading:</strong></p>
4502 <ul>
4503 <li><code>rsvg_handle_set_base_uri()</code></li>
4504 <li><code>rsvg_handle_set_base_gfile()</code></li>
4505 </ul>
4506 <p><strong>Deprecated loading API:</strong></p>
4507 <ul>
4508 <li><code>rsvg_handle_write()</code></li>
4509 <li><code>rsvg_handle_close()</code></li>
4510 </ul>
4511 <p><strong>Streaming API:</strong></p>
4512 <ul>
4513 <li><code>rsvg_handle_read_stream_sync()</code></li>
4514 </ul>
4515 <p>When librsvg first acquired the concept of an <code>RsvgHandle</code>, it just
4516 had <code>rsvg_handle_new()</code> with no arguments. About 9 years later, it
4517 got <code>rsvg_handle_new_with_flags()</code> to allow more options, but it took
4518 another 2 years to actually add some usable flags — the first one was
4519 to configure the parsing limits in the underlying calls to libxml2.</p>
4520 <p>About 3 years after <code>RsvgHandle</code> appeared, it got
4521 <code>rsvg_handle_set_base_uri()</code> to configure the "base URI" against which
4522 relative references in the SVG document get resolved. For example, if
4523 you are reading <code>/foo/bar.svg</code> and it contains an element like <code>&lt;image
4524 xlink:href="smiley.png"/&gt;</code>, then librsvg needs to be able to produce
4525 the path <code>/foo/smiley.png</code> and that is done relative to the base URI.
4526 (The base URI is implicit when reading from a specific SVG file, but
4527 it needs to be provided when reading from an arbitrary stream that may
4528 not even come from a file.)</p>
4529 <p>Initially <code>RsvgHandle</code> had the <code>write/close</code> APIs, and 8 years later
4530 it got the streaming functions once GIO appeared. Eventually the
4531 streaming API would be the preferred one, instead of just being a
4532 convenience for those brave new apps that started using GIO.</p>
4533 <p>A summary of the history of librsvg's API may be something like:</p>
4534 <ul>
4535 <li>
4536 <p>librsvg gets written initially; it doesn't even have an
4537 <code>RsvgHandle</code>, and just provides a single function which takes a
4538 <code>FILE *</code> and renders it to a <code>GdkPixbuf</code>.</p>
4539 </li>
4540 <li>
4541 <p>That gets replaced with <code>RsvgHandle</code>, its single <code>rsvg_handle_new()</code>
4542 constructor, and the <code>write/close</code> API to feed it data
4543 progressively.</p>
4544 </li>
4545 <li>
4546 <p>GIO appears, we get the first widespread streaming APIs in GNOME,
4547 and <code>RsvgHandle</code> gets the ability to read from streams.</p>
4548 </li>
4549 <li>
4550 <p><code>RsvgHandle</code> gets <code>rsvg_handle_new_with_flags()</code> because now apps
4551 may want to configure extra stuff for libxml2.</p>
4552 </li>
4553 <li>
4554 <p>When Cairo appears and librsvg is ported to it, <code>RsvgHandle</code> gets an
4555 extra flag so that SVGs rendered to PDF can embed image data
4556 efficiently.</p>
4557 </li>
4558 </ul>
4559 <p>It's a convoluted history, but <code>git log -- rsvg.h</code> makes it accessible.</p>
4560 <h2>Where is the mutability?</h2>
4561 <p>An <code>RsvgHandle</code> gets created, with flags or without. It's empty, and
4562 doesn't know if it will be given data with the <code>write/close</code> API or
4563 with the streaming API. Also, someone may call <code>set_base_uri()</code> on
4564 it. So, the handle must remain mutable while it is being populated
4565 with data. After that, it can say, "no more changes, I'm done".</p>
4566 <p>In C, this doesn't even have a name. Everything is mutable by default
4567 all the time. This monster was the private data of <code>RsvgHandle</code>
4568 before it got ported to Rust:</p>
4569 <div class="highlight"><pre><span></span><code><span class="k">struct</span> <span class="nc">RsvgHandlePrivate</span> <span class="p">{</span>
4570 <span class="c1">// set during construction</span>
4571 <span class="n">RsvgHandleFlags</span> <span class="n">flags</span><span class="p">;</span>
4572
4573 <span class="c1">// GObject-ism</span>
4574 <span class="n">gboolean</span> <span class="n">is_disposed</span><span class="p">;</span>
4575
4576 <span class="c1">// Extra crap for a deprecated API</span>
4577 <span class="n">RsvgSizeFunc</span> <span class="n">size_func</span><span class="p">;</span>
4578 <span class="n">gpointer</span> <span class="n">user_data</span><span class="p">;</span>
4579 <span class="n">GDestroyNotify</span> <span class="n">user_data_destroy</span><span class="p">;</span>
4580
4581 <span class="c1">// Data only used while parsing an SVG</span>
4582 <span class="n">RsvgHandleState</span> <span class="n">state</span><span class="p">;</span>
4583 <span class="n">RsvgDefs</span> <span class="o">*</span><span class="n">defs</span><span class="p">;</span>
4584 <span class="n">guint</span> <span class="n">nest_level</span><span class="p">;</span>
4585 <span class="n">RsvgNode</span> <span class="o">*</span><span class="n">currentnode</span><span class="p">;</span>
4586 <span class="n">RsvgNode</span> <span class="o">*</span><span class="n">treebase</span><span class="p">;</span>
4587 <span class="n">GHashTable</span> <span class="o">*</span><span class="n">css_props</span><span class="p">;</span>
4588 <span class="n">RsvgSaxHandler</span> <span class="o">*</span><span class="n">handler</span><span class="p">;</span>
4589 <span class="kt">int</span> <span class="n">handler_nest</span><span class="p">;</span>
4590 <span class="n">GHashTable</span> <span class="o">*</span><span class="n">entities</span><span class="p">;</span>
4591 <span class="n">xmlParserCtxtPtr</span> <span class="n">ctxt</span><span class="p">;</span>
4592 <span class="n">GError</span> <span class="o">**</span><span class="n">error</span><span class="p">;</span>
4593 <span class="n">GCancellable</span> <span class="o">*</span><span class="n">cancellable</span><span class="p">;</span>
4594 <span class="n">GInputStream</span> <span class="o">*</span><span class="n">compressed_input_stream</span><span class="p">;</span>
4595
4596 <span class="c1">// Data only used while rendering</span>
4597 <span class="kt">double</span> <span class="n">dpi_x</span><span class="p">;</span>
4598 <span class="kt">double</span> <span class="n">dpi_y</span><span class="p">;</span>
4599
4600 <span class="c1">// The famous base URI, set before loading</span>
4601 <span class="n">gchar</span> <span class="o">*</span><span class="n">base_uri</span><span class="p">;</span>
4602 <span class="n">GFile</span> <span class="o">*</span><span class="n">base_gfile</span><span class="p">;</span>
4603
4604 <span class="c1">// Some internal stuff</span>
4605 <span class="n">gboolean</span> <span class="n">in_loop</span><span class="p">;</span>
4606 <span class="n">gboolean</span> <span class="n">is_testing</span><span class="p">;</span>
4607 <span class="p">};</span>
4608 </code></pre></div>
4609
4610 <p>"Single responsibility principle"? This is a horror show. That
4611 <code>RsvgHandlePrivate</code> struct has all of these:</p>
4612 <ul>
4613 <li>Data only settable during construction (flags)</li>
4614 <li>Data set after construction, but which may only be set before
4615 loading (base URI)</li>
4616 <li>Highly mutable data used only during the loading stage: state
4617 machines, XML parsers, a stack of XML elements, CSS properties...</li>
4618 <li>The DPI (dots per inch) values only used during rendering.</li>
4619 <li>Assorted fields used at various stages of the handle's life.</li>
4620 </ul>
4621 <p>It took a lot of refactoring to get the code to a point where it was
4622 clear that an <code>RsvgHandle</code> in fact has distinct stages during its
4623 lifetime, and that some of that data should only live during a
4624 particular stage. Before, everything seemed a jumble of fields, used
4625 at various unclear points in the code (for the struct listing above,
4626 I've grouped related fields together — they were somewhat shuffled in
4627 the original code!).</p>
4628 <h2>What would a better separation look like?</h2>
4629 <p>In the <a href="https://gitlab.gnome.org/GNOME/librsvg/">master branch</a>, librsvg now has this:</p>
4630 <div class="highlight"><pre><span></span><code><span class="sd">/// Contains all the interior mutability for a RsvgHandle to be called</span>
4631 <span class="sd">/// from the C API.</span>
4632 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">CHandle</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4633 <span class="w"> </span><span class="n">dpi</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="n">Dpi</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
4634 <span class="w"> </span><span class="n">load_flags</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="n">LoadFlags</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
4635
4636 <span class="w"> </span><span class="n">base_url</span>: <span class="nc">RefCell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Url</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w"></span>
4637 <span class="w"> </span><span class="c1">// needed because the C api returns *const char</span>
4638 <span class="w"> </span><span class="n">base_url_cstring</span>: <span class="nc">RefCell</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">CString</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w"></span>
4639
4640 <span class="w"> </span><span class="n">size_callback</span>: <span class="nc">RefCell</span><span class="o">&lt;</span><span class="n">SizeCallback</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
4641 <span class="w"> </span><span class="n">is_testing</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="kt">bool</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
4642 <span class="w"> </span><span class="n">load_state</span>: <span class="nc">RefCell</span><span class="o">&lt;</span><span class="n">LoadState</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
4643 <span class="p">}</span><span class="w"></span>
4644 </code></pre></div>
4645
4646 <p>Internally, that <code>CHandle</code> struct is now the private data of the
4647 public <code>RsvgHandle</code> object. Note that all of <code>CHandle</code>'s fields are a
4648 <code>Cell&lt;&gt;</code> or <code>RefCell&lt;&gt;</code>: in Rust terms, this means that those fields
4649 allow for "interior mutability" in the <code>CHandle</code> struct: they can be
4650 modified after initialization.</p>
4651 <p>The last field's cell, <code>load_state</code>, contains this type:</p>
4652 <div class="highlight"><pre><span></span><code><span class="k">enum</span> <span class="nc">LoadState</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4653 <span class="w"> </span><span class="n">Start</span><span class="p">,</span><span class="w"></span>
4654
4655 <span class="w"> </span><span class="c1">// Being loaded using the legacy write()/close() API</span>
4656 <span class="w"> </span><span class="n">Loading</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">buffer</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="w"> </span><span class="p">},</span><span class="w"></span>
4657
4658 <span class="w"> </span><span class="c1">// Fully loaded, with a Handle to an SVG document</span>
4659 <span class="w"> </span><span class="n">ClosedOk</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">handle</span>: <span class="nc">Handle</span><span class="w"> </span><span class="p">},</span><span class="w"></span>
4660
4661 <span class="w"> </span><span class="n">ClosedError</span><span class="p">,</span><span class="w"></span>
4662 <span class="p">}</span><span class="w"></span>
4663 </code></pre></div>
4664
4665 <p>A <code>CHandle</code> starts in the <code>Start</code> state, where it doesn't know if it
4666 will be loaded with a stream, or with the legacy write/close API.</p>
4667 <p>If the caller uses the write/close API, the handle moves to the
4668 <code>Loading</code> state, which has a <code>buffer</code> where it accumulates the data
4669 being fed to it.</p>
4670 <p>But if the caller uses the stream API, the handle tries to parse an
4671 SVG document from the stream, and it moves either to the <code>ClosedOk</code>
4672 state, or to the <code>ClosedError</code> state if there is a parse error.</p>
4673 <p>Correspondingly, when using the write/close API, when the caller
4674 finally calls <code>rsvg_handle_close()</code>, the handle creates a stream for
4675 the <code>buffer</code>, parses it, and also gets either into the <code>ClosedOk</code> or
4676 <code>ClosedError</code> state.</p>
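<p>The transitions described above can be sketched as a small state machine. This is only an illustration, not librsvg's actual code: the <code>Handle</code> here is a hypothetical stand-in that just records a byte count, <code>parse()</code> is a toy check instead of a real SVG parser, and treating <code>close()</code> on a handle that never received data as an error is an assumption of the sketch.</p>

```rust
// Hypothetical stand-in for a loaded document; the real Handle wraps an Rc<Svg>.
#[derive(Debug, PartialEq)]
struct Handle {
    byte_len: usize,
}

#[derive(Debug, PartialEq)]
enum LoadState {
    Start,
    Loading { buffer: Vec<u8> },
    ClosedOk { handle: Handle },
    ClosedError,
}

// Toy "parser" (assumption): accepts any input that contains "<svg".
fn parse(data: &[u8]) -> Result<Handle, ()> {
    if data.windows(4).any(|w| w == &b"<svg"[..]) {
        Ok(Handle { byte_len: data.len() })
    } else {
        Err(())
    }
}

// Legacy write(): only meaningful in the Start or Loading states.
// On misuse, the unchanged state is handed back as the error value.
fn write(state: LoadState, data: &[u8]) -> Result<LoadState, LoadState> {
    match state {
        LoadState::Start => Ok(LoadState::Loading { buffer: data.to_vec() }),
        LoadState::Loading { mut buffer } => {
            buffer.extend_from_slice(data);
            Ok(LoadState::Loading { buffer })
        }
        other => Err(other),
    }
}

// Legacy close(): parses the accumulated buffer and settles into a final state.
fn close(state: LoadState) -> LoadState {
    match state {
        LoadState::Loading { buffer } => match parse(&buffer) {
            Ok(handle) => LoadState::ClosedOk { handle },
            Err(()) => LoadState::ClosedError,
        },
        // Assumption of this sketch: closing without any data is an error.
        LoadState::Start => LoadState::ClosedError,
        // Already closed: stay in the same final state.
        other => other,
    }
}
```

<p>Because each function consumes the old state and returns the new one, there is no way to keep using a stale <code>buffer</code> after the handle has been closed.</p>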
4677 <p>If you look at the variant <code>ClosedOk { handle: Handle }</code>, it contains
4678 a fully loaded <code>Handle</code> inside, which right now is just a wrapper
4679 around a reference-counted <code>Svg</code> object:</p>
4680 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Handle</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4681 <span class="w"> </span><span class="n">svg</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">Svg</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
4682 <span class="p">}</span><span class="w"></span>
4683 </code></pre></div>
4684
4685 <p>The reason why <code>LoadState::ClosedOk</code> does not contain an <code>Rc&lt;Svg&gt;</code>
4686 directly, and instead wraps it with a <code>Handle</code>, is that this is just
4687 the first pass at refactoring. Also, <code>Handle</code> contains some
4688 API-level logic which I'm not completely sure makes sense as a
4689 lower-level <code>Svg</code> object. We'll see.</p>
4690 <h2>Couldn't you move more of <code>CHandle</code>'s fields into <code>LoadState</code>?</h2>
4691 <p>Sort of, kind of, but the public API still lets one do things like
4692 call <code>rsvg_handle_get_base_uri()</code> after the handle is fully loaded,
4693 even though its result will be of little value. So, the fields that
4694 hold the <code>base_uri</code> information are kept in the longer-lived
4695 <code>CHandle</code>, not in the individual variants of <code>LoadState</code>.</p>
4696 <h2>How does this look from the Rust API?</h2>
4697 <p><code>CHandle</code> implements the public C API of librsvg. Internally,
4698 <code>Handle</code> implements the basic "load from stream", "get the geometry of
4699 an SVG element", and "render to a Cairo context" functionality.</p>
4700 <p>This basic functionality gets exported in a cleaner way through the
4701 <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/master/librsvg_crate/src/lib.rs">Rust API</a>, discussed <a href="https://people.gnome.org/~federico/blog/a-rust-api-for-librsvg.html">previously</a>. There is no
4702 interior mutability in there at all; that API uses a builder pattern
4703 to gradually configure an SVG loader, which returns a fully loaded
4704 <code>SvgHandle</code>, out of which you can create a <code>CairoRenderer</code>.</p>
4705 <p>In fact, it may be possible to refactor all of this a bit and
4706 implement <code>CHandle</code> directly in terms of the new Rust API: in effect,
4707 use <code>CHandle</code> as the "holding space" while the SVG loader gets
4708 configured, and later turned into a fully loaded <code>SvgHandle</code>
4709 internally.</p>
4710 <h2>Conclusion</h2>
4711 <p>The C version of <code>RsvgHandle</code>'s private structure used to have a bunch
4712 of fields. Without knowing the code, it was hard to know that they
4713 belonged in groups, and each group corresponded roughly to a stage in
4714 the handle's lifetime.</p>
4715 <p>It took plenty of refactoring to get the fields split up cleanly in
4716 librsvg's internals. The process of refactoring <code>RsvgHandle</code>'s fields,
4717 and ensuring that the various states of a handle are consistent, in
4718 fact exposed a few bugs where the state was not being checked
4719 appropriately. The public C API remains the same as always, but has
4720 better internal checks now.</p>
4721 <p>GObject APIs tend to allow for a lot of mutability via methods that
4722 change the internal state of objects. For <code>RsvgHandle</code>, it was possible
4723 to change this into a single <code>CHandle</code> that maintains the mutable data
4724 in a contained fashion, and later translates it internally into an
4725 immutable <code>Handle</code> that represents a fully-loaded SVG document. This
4726 scheme ties in well with the new Rust API for librsvg, which keeps
4727 everything immutable after creation.</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category><category term="refactoring"></category></entry><entry><title>A Rust API for librsvg</title><link href="https://people.gnome.org/~federico/blog/a-rust-api-for-librsvg.html" rel="alternate"></link><published>2019-03-15T13:36:47-06:00</published><updated>2019-03-15T13:36:47-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-03-15:/~federico/blog/a-rust-api-for-librsvg.html</id><summary type="html"><p>After the librsvg team <a href="https://people.gnome.org/~federico/blog/librsvg-gobject-in-rust.html">finished the rustification</a> of
4728 librsvg's main library, I wanted to start porting the high-level test
4729 suite to Rust. This is mainly to be able to run tests in parallel,
4730 which <code>cargo test</code> does automatically in order to reduce test times.
4731 However, this meant that librsvg needed …</p></summary><content type="html"><p>After the librsvg team <a href="https://people.gnome.org/~federico/blog/librsvg-gobject-in-rust.html">finished the rustification</a> of
4732 librsvg's main library, I wanted to start porting the high-level test
4733 suite to Rust. This is mainly to be able to run tests in parallel,
4734 which <code>cargo test</code> does automatically in order to reduce test times.
4735 However, this meant that librsvg needed a Rust API that would exercise
4736 the same code paths as the C entry points.</p>
4737 <p>At the same time, I wanted the Rust API to make it impossible to
4738 misuse the library. From the viewpoint of the C API, an <code>RsvgHandle</code>
4739 has different stages:</p>
4740 <ul>
4741 <li>Just initialized</li>
4742 <li>Loading</li>
4743 <li>Loaded, or in an error state after a failed load</li>
4744 <li>Ready to render</li>
4745 </ul>
4746 <p>To ensure consistency, the public API checks that you cannot render an
4747 <code>RsvgHandle</code> that is not completely loaded yet, or one that resulted
4748 in a loading error. But wouldn't it be nice if it were impossible to
4749 call the API functions in the wrong order?</p>
4750 <p>This is exactly what <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/master/librsvg_crate/src/lib.rs">the Rust API</a> does. There is a <code>Loader</code>,
4751 to which you give a filename or a stream, and it will return a
4752 fully-loaded <code>SvgHandle</code> or an error. Then, you can only create a
4753 <code>CairoRenderer</code> if you have an <code>SvgHandle</code>.</p>
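<p>To make the idea concrete, here is a minimal sketch of how types alone can enforce that ordering. The names mirror the ones above, but every signature, the <code>read_str()</code> method, the <code>LoadingError</code> variants and the toy "contains <code>&lt;svg</code>" check are assumptions made up for illustration; the real crate's API is different.</p>

```rust
// All items here are hypothetical stand-ins, not the librsvg crate's real API.
#[derive(Debug, PartialEq)]
enum LoadingError {
    NoData,
    NotSvg,
}

// Configuration is gathered here; nothing can be rendered from a Loader.
struct Loader {
    keep_image_data: bool,
}

// Only obtainable from a successful load, so it is always fully loaded.
#[derive(Debug)]
struct SvgHandle {
    source: String,
    keep_image_data: bool,
}

// Borrows a loaded handle; it cannot exist without one.
struct CairoRenderer<'a> {
    handle: &'a SvgHandle,
}

impl Loader {
    fn new() -> Self {
        Loader { keep_image_data: false }
    }

    // Builder-style option: consumes the loader and returns the updated one.
    fn keep_image_data(mut self, keep: bool) -> Self {
        self.keep_image_data = keep;
        self
    }

    // Consumes the loader; the result is a loaded handle or a meaningful error.
    fn read_str(self, data: &str) -> Result<SvgHandle, LoadingError> {
        if data.is_empty() {
            Err(LoadingError::NoData)
        } else if !data.contains("<svg") {
            Err(LoadingError::NotSvg)
        } else {
            Ok(SvgHandle {
                source: data.to_string(),
                keep_image_data: self.keep_image_data,
            })
        }
    }
}

impl<'a> CairoRenderer<'a> {
    fn new(handle: &'a SvgHandle) -> Self {
        CairoRenderer { handle }
    }

    // Placeholder for real rendering: just reports what would be drawn.
    fn describe(&self) -> String {
        format!("would render {} bytes", self.handle.source.len())
    }
}
```

<p>With this shape, "render before loading" is not a runtime error to check for; it simply does not compile, because there is no <code>SvgHandle</code> to pass to <code>CairoRenderer::new()</code>.</p>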
4754 <p>For historical reasons, the C API in librsvg is not perfectly
4755 consistent. For example, some functions which return an error will
4756 actually return a proper <a href="https://developer.gnome.org/glib/stable/glib-Error-Reporting.html"><code>GError</code></a>, but some others will just
4757 return a <code>gboolean</code> with no further explanation of what went wrong.
4758 In contrast, all the Rust API functions that can fail will actually
4759 return a <a href="https://doc.rust-lang.org/std/result/index.html"><code>Result</code></a>, and the error case will have a meaningful
4760 error value. In the Rust API, there is no "wrong order" in which the
4761 various API functions and methods can be called; it tries to do the
4762 whole "make invalid states unrepresentable" thing.</p>
4763 <p>To implement <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/master/librsvg_crate/src/lib.rs">the Rust API</a>, I had to do some refactoring of the
4764 <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/master/rsvg_internals/src/handle.rs">internals</a> that hook to the public entry points. This made me
4765 realize that librsvg could be a lot easier to use. The C API has
4766 always forced you to call it in this fashion:</p>
4767 <ol>
4768 <li>Ask the SVG for its dimensions, or how big it is.</li>
4769 <li>Based on that, scale your Cairo context to the size you actually
4770 want.</li>
4771 <li>Render the SVG to that context's current transformation matrix.</li>
4772 </ol>
4773 <p>But first, (1) gives you inadequate information because
4774 <code>rsvg_handle_get_dimensions()</code> returns <a href="https://developer.gnome.org/rsvg/unstable/rsvg-RsvgHandle.html#RsvgDimensionData">a
4775 structure</a> with <code>int</code> fields for the width and
4776 height. The API is similar to gdk-pixbuf's in that it always wants to
4777 think in whole pixels. However, an SVG is not necessarily
4778 integer-sized.</p>
4779 <p>Then, (2) forces you to calculate some geometry in almost all cases,
4780 as most apps want to render SVG content scaled proportionally to a
4781 certain size. This is not hard to do, but it's an inconvenience.</p>
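<p>For reference, the geometry in step (2) usually boils down to something like the following sketch. The function name and the choice to center the result are my own assumptions, not part of librsvg; it only shows the arithmetic that callers end up writing by hand.</p>

```rust
/// Computes a uniform scale factor and centering offsets that fit an image of
/// size `svg_w` x `svg_h` inside a `target_w` x `target_h` area, preserving
/// the aspect ratio. Returns `None` for degenerate (non-positive) image sizes.
fn fit_proportionally(
    svg_w: f64,
    svg_h: f64,
    target_w: f64,
    target_h: f64,
) -> Option<(f64, f64, f64)> {
    if svg_w <= 0.0 || svg_h <= 0.0 {
        return None;
    }

    // Use the smaller ratio so that both dimensions fit inside the target.
    let scale = (target_w / svg_w).min(target_h / svg_h);

    // Center the scaled image within the leftover space.
    let x_offset = (target_w - svg_w * scale) / 2.0;
    let y_offset = (target_h - svg_h * scale) / 2.0;

    Some((scale, x_offset, y_offset))
}
```

<p>A caller would then translate the Cairo context by the offsets and scale it by the factor before rendering, which is exactly the boilerplate the C API forces on every application.</p>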
4782 <h2>SVG dimensions</h2>
4783 <p>Let's look at (1) again. The question, "how big is the SVG" is a bit
4784 meaningless when we consider that SVGs <strong>can be scaled to any size</strong>;
4785 that's the whole point of them!</p>
4786 <p>When you ask <code>RsvgHandle</code> how big it is, in reality it should look at
4787 you and whisper in your ear, "how big do you want it to be?".</p>
4788 <p>And that's the thing. The HTML/CSS/SVG model is that one embeds
4789 content into <strong>viewports</strong> of a given size. The software is
4790 responsible for scaling the content to fit into that viewport.</p>
4791 <p>In the end, what we want is a rendering function that takes a Cairo
4792 context and a Rectangle for a viewport, and that's it. The function
4793 should take care of fitting the SVG's contents within that viewport.</p>
4794 <p>There is now <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/448">an open bug</a> about exactly this sort of API. In
4795 the end, programs should just have to load their SVG handle, and
4796 directly ask it to render at whatever size they need, instead of doing
4797 the size computations by hand.</p>
4798 <h2>When will this be available?</h2>
4799 <p>I'm in the middle of a <a href="https://gitlab.gnome.org/federico/librsvg/commits/viewport-with-offsets">rather large refactor</a>
4800 to make this <em>viewport</em> concept really work. So far this involves:</p>
4801 <ul>
4802 <li>
4803 <p>Defining APIs that take a viewport.</p>
4804 </li>
4805 <li>
4806 <p>Refactoring all the geometry computation to support the semantics of the
4807 C API, plus the new <code>with_viewport</code> semantics.</p>
4808 </li>
4809 <li>
4810 <p>Fixing the code that kept track of an internal offset for all
4811 temporary images.</p>
4812 </li>
4813 <li>
4814 <p>Refactoring all the code that mucks around with the Cairo context's
4815 affine transformation matrix, which is a big mutable mess.</p>
4816 </li>
4817 <li>
4818 <p>Tests, examples, documentation.</p>
4819 </li>
4820 </ul>
4821 <p>I want to make the Rust API available for the 2.46 release, which is
4822 hopefully not too far off. It should be ready for the next GNOME
4823 release. In the meantime, you can check out the open bugs for the
4824 <a href="https://gitlab.gnome.org/GNOME/librsvg/milestones/20">2.46.0 milestone</a>. <strong>Help is appreciated; the deadline
4825 for the first 3.33 tarballs is approximately one month from now!</strong></p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Rust build scripts vs. Meson</title><link href="https://people.gnome.org/~federico/blog/rust-build-scripts.html" rel="alternate"></link><published>2019-02-27T12:14:12-06:00</published><updated>2019-02-27T12:14:12-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-02-27:/~federico/blog/rust-build-scripts.html</id><summary type="html"><p>One of the pain points in trying to make the <a href="https://mesonbuild.com/">Meson</a> build system work
4826 with Rust and Cargo is Cargo's use of build scripts, i.e. the
4827 <code>build.rs</code> that many Rust programs use for doing things before the
4828 main build. This post is about my exploration of what <code>build …</code></p></summary><content type="html"><p>One of the pain points in trying to make the <a href="https://mesonbuild.com/">Meson</a> build system work
4829 with Rust and Cargo is Cargo's use of build scripts, i.e. the
4830 <code>build.rs</code> that many Rust programs use for doing things before the
4831 main build. This post is about my exploration of what <code>build.rs</code>
4832 does.</p>
4833 <p>Thanks to <a href="https://nirbheek.in/">Nirbheek Chauhan</a> for his comments
4834 and additions to a draft of this article!</p>
4835 <p>TL;DR: <code>build.rs</code> is pretty ad-hoc and somewhat primitive, when
4836 compared to Meson's very nice, high-level patterns for build-time
4837 things.</p>
4838 <p>I have the intuition that giving names to the things that are
4839 usually done in <code>build.rs</code> scripts, and creating abstractions for
4840 them, can make it easier later to implement those abstractions in
4841 terms of Meson. Maybe we can eliminate <code>build.rs</code> in most cases?
4842 Maybe Cargo can acquire higher-level concepts that plug well to Meson?</p>
4843 <p>(That is... I think we can refactor our way out of this mess.)</p>
4844 <h2>What does <code>build.rs</code> do?</h2>
4845 <p>The first paragraph in the <a href="https://doc.rust-lang.org/cargo/reference/build-scripts.html">documentation for Cargo build
4846 scripts</a> tells us this:</p>
4847 <blockquote>
4848 <p>Some packages need to compile third-party non-Rust code, for example
4849 C libraries. Other packages need to link to C libraries which can
4850 either be located on the system or possibly need to be built from
4851 source. Others still need facilities for functionality such as code
4852 generation before building (think parser generators).</p>
4853 </blockquote>
4854 <p>That is,</p>
4855 <ul>
4856 <li>
4857 <p>Compiling third-party non-Rust code. For example, maybe there is a
4858 C sub-library that the Rust crate needs.</p>
4859 </li>
4860 <li>
4861 <p>Link to C libraries... located on the system... or built from
4862 source. For example, in <a href="https://gtk-rs.org">gtk-rs</a>, the <a href="https://github.com/gtk-rs/sys">sys</a> crates link to
4863 <code>libgtk-3.so</code>, <code>libcairo.so</code>, etc. and need to find a way to locate
4864 those libraries with <code>pkg-config</code>.</p>
4865 </li>
4866 <li>
4867 <p>Code generation. In the C world this could be generating a parser
4868 with <code>yacc</code>; in the Rust world there are many utilities to generate
4869 code that is later used in your actual program.</p>
4870 </li>
4871 </ul>
4872 <p>In the next sections I'll look briefly at each of these cases, but in
4873 a different order.</p>
4874 <h2>Code generation</h2>
4875 <p>Here is an example, in <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/a5c8a9ca/rsvg_internals/build.rs">how librsvg generates code</a>
4876 for a couple of things that get autogenerated before compiling the
4877 main library:</p>
4878 <ul>
4879 <li>A perfect hash function (PHF) of attributes and CSS property names.</li>
4880 <li>A pair of lookup tables for SRGB linearization and un-linearization.</li>
4881 </ul>
4882 <p>For example, this is <code>main()</code> in <code>build.rs</code>:</p>
4883 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4884 <span class="w"> </span><span class="n">generate_phf_of_svg_attributes</span><span class="p">();</span><span class="w"></span>
4885 <span class="w"> </span><span class="n">generate_srgb_tables</span><span class="p">();</span><span class="w"></span>
4886 <span class="p">}</span><span class="w"></span>
4887 </code></pre></div>
4888
4889 <p>And these are the first few lines of the first function:</p>
4890 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">generate_phf_of_svg_attributes</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4891 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Path</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">env</span>::<span class="n">var</span><span class="p">(</span><span class="s">&quot;OUT_DIR&quot;</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()).</span><span class="n">join</span><span class="p">(</span><span class="s">&quot;attributes-codegen.rs&quot;</span><span class="p">);</span><span class="w"></span>
4892 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">file</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">BufWriter</span>::<span class="n">new</span><span class="p">(</span><span class="n">File</span>::<span class="n">create</span><span class="p">(</span><span class="o">&amp;</span><span class="n">path</span><span class="p">).</span><span class="n">unwrap</span><span class="p">());</span><span class="w"></span>
4893
4894 <span class="w"> </span><span class="fm">writeln!</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">file</span><span class="p">,</span><span class="w"> </span><span class="s">&quot;#[repr(C)]&quot;</span><span class="p">).</span><span class="n">unwrap</span><span class="p">();</span><span class="w"></span>
4895
4896 <span class="w"> </span><span class="c1">// ... etc</span>
4897 <span class="p">}</span><span class="w"></span>
4898 </code></pre></div>
4899
4900 <p>Generate a path like <code>$OUT_DIR/attributes-codegen.rs</code>, create a file
4901 with that name, a <code>BufWriter</code> for the file, and start outputting code
4902 to it.</p>
4903 <p>Similarly, the second function:</p>
4904 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">generate_srgb_tables</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
4905 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">linearize_table</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">compute_table</span><span class="p">(</span><span class="n">linearize</span><span class="p">);</span><span class="w"></span>
4906 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">unlinearize_table</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">compute_table</span><span class="p">(</span><span class="n">unlinearize</span><span class="p">);</span><span class="w"></span>
4907
4908 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Path</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">env</span>::<span class="n">var</span><span class="p">(</span><span class="s">&quot;OUT_DIR&quot;</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()).</span><span class="n">join</span><span class="p">(</span><span class="s">&quot;srgb-codegen.rs&quot;</span><span class="p">);</span><span class="w"></span>
4909 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">file</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">BufWriter</span>::<span class="n">new</span><span class="p">(</span><span class="n">File</span>::<span class="n">create</span><span class="p">(</span><span class="o">&amp;</span><span class="n">path</span><span class="p">).</span><span class="n">unwrap</span><span class="p">());</span><span class="w"></span>
4910
4911 <span class="w"> </span><span class="c1">// ...</span>
4912
4913 <span class="w"> </span><span class="n">print_table</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">file</span><span class="p">,</span><span class="w"> </span><span class="s">&quot;LINEARIZE&quot;</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">linearize_table</span><span class="p">);</span><span class="w"></span>
4914 <span class="w"> </span><span class="n">print_table</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">file</span><span class="p">,</span><span class="w"> </span><span class="s">&quot;UNLINEARIZE&quot;</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">unlinearize_table</span><span class="p">);</span><span class="w"></span>
4915 <span class="p">}</span><span class="w"></span>
4916 </code></pre></div>
4917
4918 <p>Compute two lookup tables, create a file named
4919 <code>$OUT_DIR/srgb-codegen.rs</code>, and write the lookup tables to the file.</p>
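<p>The elided part is not shown here, but a plausible sketch of what "compute two lookup tables" means is below. It uses the standard sRGB transfer functions; the 256-entry <code>u8</code> table layout and the rounding are assumptions of this sketch, not necessarily what librsvg's <code>build.rs</code> does.</p>

```rust
// Standard sRGB decoding: gamma-encoded value in [0, 1] to linear light.
fn linearize(c: f64) -> f64 {
    if c <= 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

// Standard sRGB encoding: linear light in [0, 1] to gamma-encoded value.
fn unlinearize(c: f64) -> f64 {
    if c <= 0.0031308 {
        c * 12.92
    } else {
        1.055 * c.powf(1.0 / 2.4) - 0.055
    }
}

// Builds a 256-entry table mapping each 8-bit channel value through `f`.
// The table shape is an assumption made for this illustration.
fn compute_table(f: fn(f64) -> f64) -> [u8; 256] {
    let mut table = [0u8; 256];
    for (i, entry) in table.iter_mut().enumerate() {
        let value = f(i as f64 / 255.0);
        *entry = (value * 255.0).round() as u8;
    }
    table
}
```

<p>Doing this at build time means the library never calls <code>powf()</code> per pixel at run time; it just indexes into a constant array.</p>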
4920 <p>Later in the actual librsvg code, the generated files get included
4921 into the source code using the <code>include!</code> macro. For example, here is
4922 where <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/a5c8a9ca/rsvg_internals/src/attributes.rs#L6"><code>attributes-codegen.rs</code> gets included</a>:</p>
4923 <div class="highlight"><pre><span></span><code><span class="c1">// attributes.rs</span>
4924
4925 <span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">phf</span><span class="p">;</span><span class="w"> </span><span class="c1">// crate for perfect hash function</span>
4926
4927 <span class="c1">// the generated file has the declaration for enum Attribute</span>
4928 <span class="fm">include!</span><span class="p">(</span><span class="fm">concat!</span><span class="p">(</span><span class="fm">env!</span><span class="p">(</span><span class="s">&quot;OUT_DIR&quot;</span><span class="p">),</span><span class="w"> </span><span class="s">&quot;/attributes-codegen.rs&quot;</span><span class="p">));</span><span class="w"></span>
4929 </code></pre></div>
4930
4931 <p>One thing to note here is that the generated source files
4932 (<code>attributes-codegen.rs</code>, <code>srgb-codegen.rs</code>) get put in <code>$OUT_DIR</code>, a
4933 directory that Cargo creates for the compilation artifacts. The files <strong>do
4934 not</strong> get put into the original source directories with the rest of
4935 the library's code; the idea is to keep the source directories
4936 pristine.</p>
4937 <p>At least in those terms, Meson and Cargo agree that source directories
4938 should be kept clean of autogenerated files.</p>
4939 <p>The <a href="https://doc.rust-lang.org/cargo/reference/build-scripts.html#case-study-code-generation">Code Generation</a> section of Cargo's documentation agrees:</p>
4940 <blockquote>
4941 <p>In general, build scripts should not modify any files outside of
4942 OUT_DIR. It may seem fine on the first blush, but it does cause
4943 problems when you use such crate as a dependency, because there's an
4944 implicit invariant that sources in .cargo/registry should be
4945 immutable. cargo won't allow such scripts when packaging.</p>
4946 </blockquote>
4947 <p>Now, some things to note here:</p>
4948 <ul>
4949 <li>
4950 <p>Both the <code>build.rs</code> program and the actual library sources look at
4951 the <code>$OUT_DIR</code> environment variable for the location of the
4952 generated sources.</p>
4953 </li>
4954 <li>
4955 <p>The Cargo docs say that if the code generator needs input files, it
4956 can look for them based on its current directory, which will be the
4957 toplevel of your source package (i.e. your toplevel <code>Cargo.toml</code>).</p>
4958 </li>
4959 </ul>
4960 <p><strong>Meson hates this scheme of things</strong>. In particular, Meson is very
4961 systematic about where it finds input files and sources, and where
4962 things like code generators are allowed to place their output.</p>
4963 <p>The way Meson communicates these paths to code generators is via
4964 command-line arguments to <a href="https://mesonbuild.com/Reference-manual.html#custom_target">"custom targets"</a>. Here is
4965 <a href="https://github.com/mesonbuild/meson/blob/master/test%20cases/common/145%20custom%20target%20multiple%20outputs/meson.build">an example</a> that is easier to read than the
4966 documentation:</p>
4967 <div class="highlight"><pre><span></span><code><span class="n">gen</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">find_program</span><span class="p">(</span><span class="s1">&#39;generator.py&#39;</span><span class="p">)</span><span class="w"></span>
4968
4969 <span class="n">outputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">custom_target</span><span class="p">(</span><span class="s1">&#39;generated&#39;</span><span class="p">,</span><span class="w"></span>
4970 <span class="w"> </span><span class="k">output</span><span class="w"> </span><span class="err">:</span><span class="w"> </span><span class="o">[</span><span class="n">&#39;foo.h&#39;, &#39;foo.c&#39;</span><span class="o">]</span><span class="p">,</span><span class="w"></span>
4971 <span class="w"> </span><span class="n">command</span><span class="w"> </span><span class="err">:</span><span class="w"> </span><span class="o">[</span><span class="n">gen, &#39;@OUTDIR@&#39;</span><span class="o">]</span><span class="p">,</span><span class="w"></span>
4972 <span class="w"> </span><span class="p">...</span><span class="w"></span>
4973 <span class="p">)</span><span class="w"></span>
4974 </code></pre></div>
4975
4976 <p>This defines a target named <code>'generated'</code>, which will use the
4977 <code>generator.py</code> program to output two files, <code>foo.h</code> and <code>foo.c</code>. That
4978 Python program will get called with <code>@OUTDIR@</code> as a command-line
4979 argument, which Meson replaces with the actual output directory; in effect,
4980 Meson will call <code>/full/path/to/generator.py @OUTDIR@</code> explicitly, without any magic
4981 passed through environment variables.</p>
4982 <p>If this looks similar to what Cargo does above with <code>build.rs</code>, it's
4983 because it <strong>is</strong> similar. It's just that <strong>Meson gives a name</strong> to
4984 the concept of generating code at build time (Meson's name for this is
4985 a <strong>custom target</strong>), and provides a mechanism to say which program is
4986 the generator, which files it is expected to generate, and how to call
4987 the program with appropriate arguments to put files in the right
4988 place.</p>
4989 <p>In contrast, Cargo assumes that all of that information can be
4990 inferred from an environment variable.</p>
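<p>To make the contrast concrete, here is a minimal sketch of a generator written in the Meson style. It is a hypothetical stand-in, in Rust, for the <code>generator.py</code> script from the example above; the function name <code>generate</code> and the contents of the generated files are assumptions made only for illustration. The point is that the output directory arrives as an explicit command-line argument, and nothing is read from the environment.</p>

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process;

/// Writes the two generated files into `out_dir` and returns their paths.
/// The file contents are placeholders for whatever a real generator emits.
fn generate(out_dir: &Path) -> io::Result<Vec<PathBuf>> {
    fs::create_dir_all(out_dir)?;

    let header = out_dir.join("foo.h");
    let source = out_dir.join("foo.c");

    fs::write(&header, "int foo(void);\n")?;
    fs::write(&source, "#include \"foo.h\"\n\nint foo(void) { return 42; }\n")?;

    Ok(vec![header, source])
}

fn main() {
    // The build system passes the output directory explicitly as the first
    // argument; this program never looks at environment variables.
    let out_dir = match env::args_os().nth(1) {
        Some(dir) => PathBuf::from(dir),
        None => {
            eprintln!("usage: generator OUTDIR");
            process::exit(2);
        }
    };

    if let Err(e) = generate(&out_dir) {
        eprintln!("generator: {}", e);
        process::exit(1);
    }
}
```

<p>A build system that knows the generator's outputs up front, as Meson does through <code>output:</code>, can then check that both files actually appeared in the directory it asked for.</p>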
4991 <p>In addition, if the custom target takes other files as input (say, so
4992 it can call <code>yacc my-grammar.y</code>), the <code>custom_target()</code> command can
4993 take an <code>input:</code> argument. This way, Meson can add a dependency on
4994 those input files, so that the appropriate things will be rebuilt if
4995 the input files change.</p>
4996 <p>Now, Cargo could very well provide a small utility crate that build
4997 scripts could use to figure out all that information. Meson would
4998 tell Cargo to use its scheme of things, and pass it down to build
4999 scripts via that utility crate. I.e. to have</p>
5000 <div class="highlight"><pre><span></span><code><span class="c1">// build.rs</span>
5001
5002 <span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">cargo_high_level</span><span class="p">;</span><span class="w"></span>
5003
5004 <span class="kd">let</span><span class="w"> </span><span class="n">output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Path</span>::<span class="n">new</span><span class="p">(</span><span class="n">cargo_high_level</span>::<span class="n">get_output_path</span><span class="p">()).</span><span class="n">join</span><span class="p">(</span><span class="s">&quot;codegen.rs&quot;</span><span class="p">);</span><span class="w"></span>
5005 <span class="c1">// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this, instead of:</span>
5006
5007 <span class="kd">let</span><span class="w"> </span><span class="n">output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Path</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">env</span>::<span class="n">var</span><span class="p">(</span><span class="s">&quot;OUT_DIR&quot;</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()).</span><span class="n">join</span><span class="p">(</span><span class="s">&quot;codegen.rs&quot;</span><span class="p">);</span><span class="w"></span>
5008
5009 <span class="c1">// let the build system know about generated dependencies</span>
5010 <span class="n">cargo_high_level</span>::<span class="n">add_output</span><span class="p">(</span><span class="n">output</span><span class="p">);</span><span class="w"></span>
5011 </code></pre></div>
5012
5013 <p>A similar mechanism could be used for the way Meson likes to pass
5014 command-line arguments to the programs that deal with custom targets.</p>
5015 <h2>Linking to C libraries on the system</h2>
5016 <p>Some Rust crates need to link to lower-level C libraries that actually
5017 do the work. For example, in <a href="https://gtk-rs.org">gtk-rs</a>, there are high-level binding
5018 crates called <code>gtk</code>, <code>gdk</code>, <code>cairo</code>, etc. These use low-level crates
5019 called <code>gtk-sys</code>, <code>gdk-sys</code>, <code>cairo-sys</code>. Those <code>-sys</code> crates are
5020 just direct wrappers on top of the C functions of the respective
5021 system libraries: <code>gtk-sys</code> makes almost every function in
5022 <code>libgtk-3.so</code> available as a Rust-callable function.</p>
5023 <p>System libraries sometimes live in a well-known part of the filesystem
5024 (<code>/usr/lib64</code>, for example); other times, as on Windows and macOS,
5025 they could be anywhere. To find that location plus other related
5026 metadata (include paths for C header files, library version), many
5027 system libraries use <a href="https://www.freedesktop.org/wiki/Software/pkg-config/"><code>pkg-config</code></a>. At the highest
5028 level, one can run <code>pkg-config</code> on the command line, or from build
5029 scripts, to query some things about libraries. For example:</p>
5030 <div class="highlight"><pre><span></span><code># <span class="nv">what</span><span class="s1">&#39;</span><span class="s">s the system</span><span class="s1">&#39;</span><span class="nv">s</span> <span class="nv">installed</span> <span class="nv">version</span> <span class="nv">of</span> <span class="nv">GTK</span>?
5031 $ <span class="nv">pkg</span><span class="o">-</span><span class="nv">config</span> <span class="o">--</span><span class="nv">modversion</span> <span class="nv">gtk</span><span class="o">+-</span><span class="mi">3</span>.<span class="mi">0</span>
5032 <span class="mi">3</span>.<span class="mi">24</span>.<span class="mi">4</span>
5033
5034 # <span class="nv">what</span> <span class="nv">compiler</span> <span class="nv">flags</span> <span class="nv">would</span> <span class="nv">a</span> <span class="nv">C</span> <span class="nv">compiler</span> <span class="nv">need</span> <span class="k">for</span> <span class="nv">GTK</span>?
5035 $ <span class="nv">pkg</span><span class="o">-</span><span class="nv">config</span> <span class="o">--</span><span class="nv">cflags</span> <span class="nv">gtk</span><span class="o">+-</span><span class="mi">3</span>.<span class="mi">0</span>
5036 <span class="o">-</span><span class="nv">pthread</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">gtk</span><span class="o">-</span><span class="mi">3</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">at</span><span class="o">-</span><span class="nv">spi2</span><span class="o">-</span><span class="nv">atk</span><span class="o">/</span><span class="mi">2</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">at</span><span class="o">-</span><span class="nv">spi</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">dbus</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="nv">lib64</span><span class="o">/</span><span class="nv">dbus</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span><span class="o">/</span><span class="k">include</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">gtk</span><span class="o">-</span><span class="mi">3</span>.<span class="mi">0</span> <span class="o">-</span><span 
class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">gio</span><span class="o">-</span><span class="nv">unix</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span><span class="o">/</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">libxkbcommon</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">wayland</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">cairo</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">pango</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">harfbuzz</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">pango</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">fribidi</span> <span class="o">-</span><span 
class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">atk</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">cairo</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">pixman</span><span class="o">-</span><span class="mi">1</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">freetype2</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">libdrm</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">libpng16</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">gdk</span><span class="o">-</span><span class="nv">pixbuf</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">libmount</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span 
class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">blkid</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">uuid</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="k">include</span><span class="o">/</span><span class="nv">glib</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">I</span><span class="o">/</span><span class="nv">usr</span><span class="o">/</span><span class="nv">lib64</span><span class="o">/</span><span class="nv">glib</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span><span class="o">/</span><span class="k">include</span>
5037
5038 # <span class="nv">and</span> <span class="nv">which</span> <span class="nv">libraries</span>?
5039 $ <span class="nv">pkg</span><span class="o">-</span><span class="nv">config</span> <span class="o">--</span><span class="nv">libs</span> <span class="nv">gtk</span><span class="o">+-</span><span class="mi">3</span>.<span class="mi">0</span>
5040 <span class="o">-</span><span class="nv">lgtk</span><span class="o">-</span><span class="mi">3</span> <span class="o">-</span><span class="nv">lgdk</span><span class="o">-</span><span class="mi">3</span> <span class="o">-</span><span class="nv">lpangocairo</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">lpango</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">latk</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">lcairo</span><span class="o">-</span><span class="nv">gobject</span> <span class="o">-</span><span class="nv">lcairo</span> <span class="o">-</span><span class="nv">lgdk_pixbuf</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">lgio</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">lgobject</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span> <span class="o">-</span><span class="nv">lglib</span><span class="o">-</span><span class="mi">2</span>.<span class="mi">0</span>
5041 </code></pre></div>
5042
5043 <p>There is a <a href="https://docs.rs/pkg-config/0.3.14/pkg_config/"><code>pkg-config</code> crate</a> which <code>build.rs</code> can
5044 use to run <code>pkg-config</code> and communicate the results to Cargo. The
5045 example in the crate's documentation is for asking pkg-config for the
5046 <code>foo</code> package, with version at least <code>1.2.3</code>:</p>
5047 <div class="highlight"><pre><span></span><code><span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">pkg_config</span><span class="p">;</span><span class="w"></span>
5048
5049 <span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5050 <span class="w"> </span><span class="n">pkg_config</span>::<span class="n">Config</span>::<span class="n">new</span><span class="p">().</span><span class="n">atleast_version</span><span class="p">(</span><span class="s">&quot;1.2.3&quot;</span><span class="p">).</span><span class="n">probe</span><span class="p">(</span><span class="s">&quot;foo&quot;</span><span class="p">).</span><span class="n">unwrap</span><span class="p">();</span><span class="w"></span>
5051 <span class="p">}</span><span class="w"></span>
5052 </code></pre></div>
5053
5054 <p>And the documentation says,</p>
5055 <blockquote>
5056 <p>After running pkg-config all appropriate Cargo metadata will be
5057 printed on stdout if the search was successful.</p>
5058 </blockquote>
5059 <p>Wait, what?</p>
5060 <p>Indeed, printing specially formatted stuff on stdout is how <code>build.rs</code>
5061 scripts communicate back to Cargo about their findings. To quote <a href="https://doc.rust-lang.org/cargo/reference/build-scripts.html#outputs-of-the-build-script">Cargo's docs
5062 on build scripts</a>, which here are talking
5063 about the stdout of <code>build.rs</code>:</p>
5064 <blockquote>
5065 <p>Any line that starts with cargo: is interpreted directly by
5066 Cargo. This line must be of the form cargo:key=value, like the
5067 examples below:</p>
5068 </blockquote>
5069 <div class="highlight"><pre><span></span><code># <span class="nv">specially</span> <span class="nv">recognized</span> <span class="nv">by</span> <span class="nv">Cargo</span>
5070 <span class="nv">cargo</span>:<span class="nv">rustc</span><span class="o">-</span><span class="nv">link</span><span class="o">-</span><span class="nv">lib</span><span class="o">=</span><span class="nv">static</span><span class="o">=</span><span class="nv">foo</span>
5071 <span class="nv">cargo</span>:<span class="nv">rustc</span><span class="o">-</span><span class="nv">link</span><span class="o">-</span><span class="nv">search</span><span class="o">=</span><span class="nv">native</span><span class="o">=/</span><span class="nv">path</span><span class="o">/</span><span class="nv">to</span><span class="o">/</span><span class="nv">foo</span>
5072 <span class="nv">cargo</span>:<span class="nv">rustc</span><span class="o">-</span><span class="nv">cfg</span><span class="o">=</span><span class="nv">foo</span>
5073 <span class="nv">cargo</span>:<span class="nv">rustc</span><span class="o">-</span><span class="nv">env</span><span class="o">=</span><span class="nv">FOO</span><span class="o">=</span><span class="nv">bar</span>
5074 # <span class="nv">arbitrary</span> <span class="nv">user</span><span class="o">-</span><span class="nv">defined</span> <span class="nv">metadata</span>
5075 <span class="nv">cargo</span>:<span class="nv">root</span><span class="o">=/</span><span class="nv">path</span><span class="o">/</span><span class="nv">to</span><span class="o">/</span><span class="nv">foo</span>
5076 <span class="nv">cargo</span>:<span class="nv">libdir</span><span class="o">=/</span><span class="nv">path</span><span class="o">/</span><span class="nv">to</span><span class="o">/</span><span class="nv">foo</span><span class="o">/</span><span class="nv">lib</span>
5077 <span class="nv">cargo</span>:<span class="k">include</span><span class="o">=/</span><span class="nv">path</span><span class="o">/</span><span class="nv">to</span><span class="o">/</span><span class="nv">foo</span><span class="o">/</span><span class="k">include</span>
5078 </code></pre></div>
5079
5080 <p>One can use the stdout of a <code>build.rs</code> program to add additional
5081 command-line options for <code>rustc</code>, or set environment variables for it,
5082 or add library paths, or specific libraries.</p>
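<p>As a rough illustration of that stdout protocol, the sketch below turns linker flags of the kind printed by <code>pkg-config --libs</code> into <code>cargo:key=value</code> lines. This is not how the <code>pkg-config</code> crate is implemented; the function name <code>cargo_directives</code> and the sample flags are assumptions for the example, and only the <code>-l</code> and <code>-L</code> flags are handled.</p>

```rust
/// Turns linker flags, as printed by `pkg-config --libs`, into the
/// `cargo:key=value` lines that a build script prints on stdout.
fn cargo_directives(libs_output: &str) -> Vec<String> {
    let mut lines = Vec::new();

    for flag in libs_output.split_whitespace() {
        if let Some(name) = flag.strip_prefix("-l") {
            // A library to link against.
            lines.push(format!("cargo:rustc-link-lib={}", name));
        } else if let Some(dir) = flag.strip_prefix("-L") {
            // A directory in which to search for native libraries.
            lines.push(format!("cargo:rustc-link-search=native={}", dir));
        }
        // Any other flag (for example -pthread) is ignored in this sketch.
    }

    lines
}

fn main() {
    // Hypothetical sample input; a real build script would obtain this by
    // running pkg-config instead of hard-coding it.
    let sample = "-L/usr/lib64 -lgtk-3 -lgdk-3 -lglib-2.0";

    for line in cargo_directives(sample) {
        println!("{}", line);
    }
}
```

<p>Everything here is strings in and strings out: Cargo learns about the libraries only by parsing what the script happened to print.</p>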
5083 <p><strong>Meson hates this scheme of things</strong>. I suppose it would prefer to
5084 do the pkg-config calls itself, and then pass that information down to
5085 Cargo, you guessed it, via command-line options or something
5086 well-defined like that. Again, the example <code>cargo_high_level</code> crate I
5087 proposed above could be used to communicate this information from
5088 Meson to Cargo scripts. Meson also doesn't like this because it would
5089 prefer to know about <code>pkg-config</code>-based libraries in a declarative
5090 fashion, without having to run a random script like <code>build.rs</code>.</p>
5091 <h2>Building C code from Rust</h2>
5092 <p>Finally, some Rust crates build a bit of C code and then link that
5093 into the compiled Rust code. I have no experience with that, but
5094 the respective build scripts generally use the <a href="https://docs.rs/cc/1.0.29/cc/"><code>cc</code> crate</a> to
5095 call a C compiler and pass options to it conveniently. I suppose
5096 Meson would prefer to do this instead, or at least to have a
5097 high-level way of passing down information to Cargo.</p>
5098 <p>In effect, Meson has to be in charge of picking the C compiler.
5099 Having the thing-to-be-built pick on its own has caused big problems
5100 in the past: GObject-Introspection made the same mistake years ago
5101 when it decided to use distutils to detect the C compiler; gtk-doc did
5102 as well. When those tools are used, we still deal with problems with
5103 cross-compilation and when the system has more than one C compiler in
5104 it.</p>
5105 <h2>Snarky comments about the Unix philosophy</h2>
5106 <p>If part of the Unix philosophy is that shit can be glued together with
5107 environment variables and stringly-typed stdout... it's a pretty bad
5108 philosophy. All the cases above boil down to having a well-defined,
5109 more or less strongly-typed way to pass information between programs
5110 instead of shaking the proverbial tree of the filesystem and the
5111 environment and seeing if something usable falls down.</p>
5112 <h2>Would we really have to modify all <code>build.rs</code> scripts for this?</h2>
5113 <p>Probably. Why not? Meson already has a lot of very well-structured
5114 knowledge of how to deal with multi-platform compilation and
5115 installation. Re-creating this knowledge in ad-hoc ways in <code>build.rs</code>
5116 is not very pleasant or maintainable.</p>
5117 <h3>Related work</h3>
5118 <ul>
5119 <li>
5120 <p><a href="https://internals.rust-lang.org/t/external-dependencies-in-declarative-format/9372">Cargo internals
5121 thread on using a declarative format to specify external dependencies</a></p>
5122 </li>
5123 <li>
5124 <p><a href="https://github.com/joshtriplett/metadeps">Run pkg-config from declarative dependencies in Cargo.toml</a></p>
5125 </li>
5126 </ul></content><category term="misc"></category><category term="rust"></category><category term="meson"></category></entry><entry><title>Who wrote librsvg?</title><link href="https://people.gnome.org/~federico/blog/who-wrote-librsvg.html" rel="alternate"></link><published>2019-02-15T13:12:19-06:00</published><updated>2019-02-15T13:12:19-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-02-15:/~federico/blog/who-wrote-librsvg.html</id><summary type="html"><p>Authors by lines of code, each year:</p>
5127 <p><img alt="Librsvg authors by lines of code by year" src="https://people.gnome.org/~federico/blog/images/librsvg-authors-2019-02.png"></p>
5128 <p>Authors by percentage of lines of code, each year:</p>
5129 <p><img alt="Librsvg authors by percentage of lines of code by year" src="https://people.gnome.org/~federico/blog/images/librsvg-authors-normalized-2019-02.png"></p>
5130 <p>Which lines of code remain each year?</p>
5131 <p><img alt="Lines of code that remain each year" src="https://people.gnome.org/~federico/blog/images/librsvg-lines-of-code-2019-02.png"></p>
5132 <p>The shitty thing about a gradual rewrite is that a few people end up
5133 "owning" all the lines of source code. Hopefully this post is a little …</p></summary><content type="html"><p>Authors by lines of code, each year:</p>
5134 <p><img alt="Librsvg authors by lines of code by year" src="https://people.gnome.org/~federico/blog/images/librsvg-authors-2019-02.png"></p>
5135 <p>Authors by percentage of lines of code, each year:</p>
5136 <p><img alt="Librsvg authors by percentage of lines of code by year" src="https://people.gnome.org/~federico/blog/images/librsvg-authors-normalized-2019-02.png"></p>
5137 <p>Which lines of code remain each year?</p>
5138 <p><img alt="Lines of code that remain each year" src="https://people.gnome.org/~federico/blog/images/librsvg-lines-of-code-2019-02.png"></p>
5139 <p>The shitty thing about a gradual rewrite is that a few people end up
5140 "owning" all the lines of source code. Hopefully this post is a little
5141 acknowledgment of the people that made librsvg possible.</p>
5142 <p>The charts are made with the incredible tool
5143 <a href="https://github.com/erikbern/git-of-theseus">git-of-theseus</a> — thanks
5144 to <a href="https://mastodon.art/@norwin">@norwin@mastodon.art</a> for digging it
5145 up! Its README also points to a
5146 <a href="https://github.com/src-d/hercules">Hercules</a> plotter with awesome
5147 graphs. You know, for if you needed something to keep your computer
5148 busy during the weekend.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category></entry><entry><title>Librsvg's GObject boilerplate is in Rust now</title><link href="https://people.gnome.org/~federico/blog/librsvg-gobject-in-rust.html" rel="alternate"></link><published>2019-01-23T18:12:44-06:00</published><updated>2019-01-23T18:12:44-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-01-23:/~federico/blog/librsvg-gobject-in-rust.html</id><summary type="html"><p>The other day I wrote about how <a href="https://people.gnome.org/~federico/blog/librsvg-is-almost-rustified.html">most of librsvg's library code is in
5149 Rust now</a>. </p>
5150 <p>Today I finished porting the GObject boilerplate for the main
5151 <code>RsvgHandle</code> object into Rust. This means that the C code no longer
5152 calls things like <code>g_type_register_static()</code>, nor implements
5153 <code>rsvg_handle_class_init()</code> and such; all those are …</p></summary><content type="html"><p>The other day I wrote about how <a href="https://people.gnome.org/~federico/blog/librsvg-is-almost-rustified.html">most of librsvg's library code is in
5154 Rust now</a>. </p>
5155 <p>Today I finished porting the GObject boilerplate for the main
5156 <code>RsvgHandle</code> object into Rust. This means that the C code no longer
5157 calls things like <code>g_type_register_static()</code>, nor implements
5158 <code>rsvg_handle_class_init()</code> and such; all those are in Rust now. How
5159 is this done?</p>
5160 <h2>The life-changing magic of glib::subclass</h2>
5161 <p><a href="https://coaxion.net/blog/">Sebastian Dröge</a> has been working for many months on refining
5162 utilities to make it possible to subclass GObjects in Rust, with
5163 little or no unsafe code. This <a href="https://github.com/gtk-rs/glib/tree/master/src/subclass">subclass</a> module is now part of
5164 <a href="https://github.com/gtk-rs/glib">glib-rs</a>, the Rust bindings to GLib. </p>
5165 <p>Librsvg now uses the subclassing functionality in glib-rs, which takes
5166 care of some things automatically:</p>
5167 <ul>
5168 <li>Registering your GObject types at runtime.</li>
5169 <li>Creating safe traits on which you can implement <code>class_init</code>,
5170 <code>instance_init</code>, <code>set_property</code>, <code>get_property</code>, and all the usual
5171 GObject paraphernalia.</li>
5172 </ul>
5173 <p>Check this out:</p>
5174 <div class="highlight"><pre><span></span><code><span class="k">use</span><span class="w"> </span><span class="n">glib</span>::<span class="n">subclass</span>::<span class="n">prelude</span>::<span class="o">*</span><span class="p">;</span><span class="w"></span>
5175
5176 <span class="k">impl</span><span class="w"> </span><span class="n">ObjectSubclass</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Handle</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5177 <span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">NAME</span>: <span class="kp">&amp;</span><span class="o">&#39;</span><span class="nb">static</span> <span class="kt">str</span> <span class="o">=</span><span class="w"> </span><span class="s">&quot;RsvgHandle&quot;</span><span class="p">;</span><span class="w"></span>
5178
5179 <span class="w"> </span><span class="k">type</span> <span class="nc">ParentType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">glib</span>::<span class="n">Object</span><span class="p">;</span><span class="w"></span>
5180
5181 <span class="w"> </span><span class="k">type</span> <span class="nc">Instance</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">RsvgHandle</span><span class="p">;</span><span class="w"></span>
5182 <span class="w"> </span><span class="k">type</span> <span class="nc">Class</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">RsvgHandleClass</span><span class="p">;</span><span class="w"></span>
5183
5184 <span class="w"> </span><span class="n">glib_object_subclass</span><span class="o">!</span><span class="p">();</span><span class="w"></span>
5185
5186 <span class="w"> </span><span class="k">fn</span> <span class="nf">class_init</span><span class="p">(</span><span class="n">klass</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">RsvgHandleClass</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5187 <span class="w"> </span><span class="n">klass</span><span class="p">.</span><span class="n">install_properties</span><span class="p">(</span><span class="o">&amp;</span><span class="n">PROPERTIES</span><span class="p">);</span><span class="w"></span>
5188 <span class="w"> </span><span class="p">}</span><span class="w"></span>
5189
5190 <span class="w"> </span><span class="k">fn</span> <span class="nf">new</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5191 <span class="w"> </span><span class="n">Handle</span>::<span class="n">new</span><span class="p">()</span><span class="w"></span>
5192 <span class="w"> </span><span class="p">}</span><span class="w"></span>
5193 <span class="p">}</span><span class="w"></span>
5194 </code></pre></div>
5195
5196 <p>In the <code>impl</code> line, <code>Handle</code> is librsvg's internals object — what used
5197 to be <code>RsvgHandlePrivate</code> in the C code.</p>
5198 <p>The following lines say this:</p>
5199 <ul>
5200 <li>
5201 <p><code>const NAME: &amp;'static str = "RsvgHandle";</code> - the name of the type,
5202 for GType's perusal.</p>
5203 </li>
5204 <li>
5205 <p><code>type ParentType = glib::Object;</code> - Parent class.</p>
5206 </li>
5207 <li>
5208 <p><code>type Instance</code>, <code>type Class</code> - Structs with <code>#[repr(C)]</code>,
5209 equivalent to GObject's class and instance structs.</p>
5210 </li>
5211 <li>
5212 <p><code>glib_object_subclass!();</code> - All the boilerplate happens here
5213 automatically.</p>
5214 </li>
5215 <li>
5216 <p><code>fn class_init</code> - Should be familiar to anyone who implements
5217 GObjects!</p>
5218 </li>
5219 </ul>
5220 <p>And then, a couple of the property declarations:</p>
5221 <div class="highlight"><pre><span></span><code><span class="k">static</span><span class="w"> </span><span class="n">PROPERTIES</span>: <span class="p">[</span><span class="n">subclass</span>::<span class="n">Property</span><span class="p">;</span><span class="w"> </span><span class="mi">11</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="w"></span>
5222 <span class="w"> </span><span class="n">subclass</span>::<span class="n">Property</span><span class="p">(</span><span class="s">&quot;flags&quot;</span><span class="p">,</span><span class="w"> </span><span class="o">|</span><span class="n">name</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5223 <span class="w"> </span><span class="n">ParamSpec</span>::<span class="n">flags</span><span class="p">(</span><span class="w"></span>
5224 <span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"></span>
5225 <span class="w"> </span><span class="s">&quot;Flags&quot;</span><span class="p">,</span><span class="w"></span>
5226 <span class="w"> </span><span class="s">&quot;Loading flags&quot;</span><span class="p">,</span><span class="w"></span>
5227 <span class="w"> </span><span class="n">HandleFlags</span>::<span class="n">static_type</span><span class="p">(),</span><span class="w"></span>
5228 <span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
5229 <span class="w"> </span><span class="n">ParamFlags</span>::<span class="n">READWRITE</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">ParamFlags</span>::<span class="n">CONSTRUCT_ONLY</span><span class="p">,</span><span class="w"></span>
5230 <span class="w"> </span><span class="p">)</span><span class="w"></span>
5231 <span class="w"> </span><span class="p">}),</span><span class="w"></span>
5232 <span class="w"> </span><span class="n">subclass</span>::<span class="n">Property</span><span class="p">(</span><span class="s">&quot;dpi-x&quot;</span><span class="p">,</span><span class="w"> </span><span class="o">|</span><span class="n">name</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5233 <span class="w"> </span><span class="n">ParamSpec</span>::<span class="n">double</span><span class="p">(</span><span class="w"></span>
5234 <span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"></span>
5235 <span class="w"> </span><span class="s">&quot;Horizontal DPI&quot;</span><span class="p">,</span><span class="w"></span>
5236 <span class="w"> </span><span class="s">&quot;Horizontal resolution in dots per inch&quot;</span><span class="p">,</span><span class="w"></span>
5237 <span class="w"> </span><span class="mf">0.0</span><span class="p">,</span><span class="w"></span>
5238 <span class="w"> </span><span class="kt">f64</span>::<span class="n">MAX</span><span class="p">,</span><span class="w"></span>
5239 <span class="w"> </span><span class="mf">0.0</span><span class="p">,</span><span class="w"></span>
5240 <span class="w"> </span><span class="n">ParamFlags</span>::<span class="n">READWRITE</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">ParamFlags</span>::<span class="n">CONSTRUCT</span><span class="p">,</span><span class="w"></span>
5241 <span class="w"> </span><span class="p">)</span><span class="w"></span>
5242 <span class="w"> </span><span class="p">}),</span><span class="w"></span>
5243 <span class="w"> </span><span class="c1">// ... etcetera</span>
5244 <span class="p">];</span><span class="w"></span>
5245 </code></pre></div>
5246
5247 <p>This is quite similar to the way C code usually registers properties
5248 for new GObject subclasses.</p>
5249 <p>The moment at which a new GObject subclass gets registered against the
5250 GType system is in the <code>foo_get_type()</code> call. This is the C code in
5251 librsvg for that:</p>
5252 <div class="highlight"><pre><span></span><code><span class="k">extern</span> <span class="n">GType</span> <span class="nf">rsvg_handle_rust_get_type</span> <span class="p">(</span><span class="kt">void</span><span class="p">);</span>
5253
5254 <span class="n">GType</span>
5255 <span class="nf">rsvg_handle_get_type</span> <span class="p">(</span><span class="kt">void</span><span class="p">)</span>
5256 <span class="p">{</span>
5257 <span class="k">return</span> <span class="n">rsvg_handle_rust_get_type</span> <span class="p">();</span>
5258 <span class="p">}</span>
5259 </code></pre></div>
5260
5261 <p>And the Rust function that actually implements this:</p>
5262 <div class="highlight"><pre><span></span><code><span class="cp">#[no_mangle]</span><span class="w"></span>
5263 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_handle_rust_get_type</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">glib_sys</span>::<span class="n">GType</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5264 <span class="w"> </span><span class="n">Handle</span>::<span class="n">get_type</span><span class="p">().</span><span class="n">to_glib</span><span class="p">()</span><span class="w"></span>
5265 <span class="p">}</span><span class="w"></span>
5266 </code></pre></div>
5267
5268 <p>Here, <code>Handle::get_type()</code> gets implemented automatically by
5269 Sebastian's <a href="https://github.com/gtk-rs/glib/tree/master/src/subclass">subclass</a> traits. It gets things like the type name and
5270 the parent class from the <code>impl ObjectSubclass for Handle</code> we saw
5271 above, and calls <code>g_type_register_static()</code> internally.</p>
5272 <p>I can confirm now that implementing GObjects in Rust in this way, and
5273 exposing them to C, really works and is actually quite pleasant to
5274 do. <a href="https://gitlab.gnome.org/federico/librsvg/blob/subclass/rsvg_internals/src/c_api.rs">You can look at librsvg's Rust code for GObject here</a>.</p>
5275 <h2>Further work</h2>
5276 <p>There is some auto-generated C code to register librsvg's error enum
5277 and a flags type against GType; I'll move those to Rust over the next
5278 few days.</p>
5279 <p>Then, I think I'll try to actually remove all of the library's entry
5280 points from the C code and implement them in Rust. Right now each C
5281 function is really just a single call to a Rust function, so this
5282 should be trivial-ish to do.</p>
5283 <p>I'm waiting for a glib-rs release, the first one that will have the
5284 <code>glib::subclass</code> code in it, before merging all of the above into
5285 librsvg's master branch.</p>
5286 <h2>A new Rust API for librsvg?</h2>
5287 <p>Finally, this got me thinking about what to do about the Rust bindings
5288 to librsvg itself. The <a href="https://github.com/selaux/rsvg-rs">rsvg crate</a> uses the gtk-rs
5289 machinery to generate the binding: it reads the <a href="https://people.gnome.org/~federico/blog/magic-of-gobject-introspection.html">GObject
5290 Introspection</a> data from <code>Rsvg.gir</code> and generates a Rust binding
5291 for it.</p>
5292 <p>However, the resulting API is mostly identical to the C API. There is
5293 an <code>rsvg::Handle</code> with the same methods as the ones from C's
5294 <code>RsvgHandle</code>... and that API is not particularly Rusty.</p>
5295 <p>At some point I had an unfinished branch to <a href="https://gitlab.gnome.org/GNOME/librsvg/commits/import-rsvg-rs">merge rsvg-rs into
5296 librsvg</a>. The intention was that librsvg's build procedure
5297 would first build <code>librsvg.so</code> itself, then generate <code>Rsvg.gir</code> as
5298 usual, and <strong>then</strong> generate rsvg-rs from that. But I got tired of
5299 fucking with Autotools, and didn't finish integrating the projects.</p>
5300 <p>Rsvg-rs is an <em>okay</em> Rust API for using librsvg. It still works
5301 perfectly well from the <a href="https://github.com/selaux/rsvg-rs">standalone crate</a>. However, now
5302 that all the functionality of librsvg is in Rust, I would like to take
5303 this opportunity to experiment with a better API for loading and
5304 rendering SVGs from Rust. This may make it more clear how to refactor
5305 the toplevel of the library. Maybe the <code>librsvg</code> project can provide
5306 its own Rust crate for public consumption, in addition to the usual
5307 <code>librsvg.so</code> and <code>Rsvg.gir</code> which need to remain with a stable API and
5308 ABI.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category></entry><entry><title>Librsvg is almost rustified now</title><link href="https://people.gnome.org/~federico/blog/librsvg-is-almost-rustified.html" rel="alternate"></link><published>2019-01-10T12:28:11-06:00</published><updated>2019-01-10T12:28:11-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2019-01-10:/~federico/blog/librsvg-is-almost-rustified.html</id><summary type="html"><p>Since a few days ago, librsvg's library implementation is almost 100%
5309 Rust code. Paolo Borelli's and Carlos Martín Nieto's latest commits
5310 made it possible.</p>
5311 <p>What does "almost 100% Rust code" mean here?</p>
5312 <ul>
5313 <li>
5314 <p>The C code no longer has struct fields that refer to the library's
5315 real work. The only field …</p></li></ul></summary><content type="html"><p>Since a few days ago, librsvg's library implementation is almost 100%
5316 Rust code. Paolo Borelli's and Carlos Martín Nieto's latest commits
5317 made it possible.</p>
5318 <p>What does "almost 100% Rust code" mean here?</p>
5319 <ul>
5320 <li>
5321 <p>The C code no longer has struct fields that refer to the library's
5322 real work. The only field in <code>RsvgHandlePrivate</code> is an opaque
5323 pointer to a Rust-side structure. All the rest of the library's
5324 data lives in Rust structs.</p>
5325 </li>
5326 <li>
5327 <p>The public API is implemented in C, but it is just stubs that
5328 immediately call into Rust functions. For example:</p>
5329 </li>
5330 </ul>
5331 <div class="highlight"><pre><span></span><code><span class="n">gboolean</span>
5332 <span class="nf">rsvg_handle_render_cairo_sub</span> <span class="p">(</span><span class="n">RsvgHandle</span> <span class="o">*</span> <span class="n">handle</span><span class="p">,</span> <span class="n">cairo_t</span> <span class="o">*</span> <span class="n">cr</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">id</span><span class="p">)</span>
5333 <span class="p">{</span>
5334 <span class="n">g_return_val_if_fail</span> <span class="p">(</span><span class="n">RSVG_IS_HANDLE</span> <span class="p">(</span><span class="n">handle</span><span class="p">),</span> <span class="n">FALSE</span><span class="p">);</span>
5335 <span class="n">g_return_val_if_fail</span> <span class="p">(</span><span class="n">cr</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">FALSE</span><span class="p">);</span>
5336
5337 <span class="k">return</span> <span class="n">rsvg_handle_rust_render_cairo_sub</span> <span class="p">(</span><span class="n">handle</span><span class="p">,</span> <span class="n">cr</span><span class="p">,</span> <span class="n">id</span><span class="p">);</span>
5338 <span class="p">}</span>
5339 </code></pre></div>
5340
5341 <ul>
5342 <li>
5343 <p>The GObject boilerplate and supporting code is still in C:
5344 <code>rsvg_handle_class_init</code> and <code>set_property</code> and friends.</p>
5345 </li>
5346 <li>
5347 <p>All the high-level tests are still done in C.</p>
5348 </li>
5349 <li>
5350 <p>The gdk-pixbuf loader for SVG files is done in C.</p>
5351 </li>
5352 </ul>
5353 <p>Someone posted a <a href="https://www.reddit.com/r/rust/comments/ae5xwd/librsvg_oxidation/">chart on Reddit about the rustification of librsvg</a>,
5354 comparing lines of code in each language vs. time.</p>
5355 <h2>Rustifying the remaining C code</h2>
5356 <p>There is only a handful of very small functions from the public API
5357 still implemented in C, and I am converting them one by one to Rust.
5358 These are just helper functions built on top of other public API that
5359 does the real work.</p>
5360 <p>Converting the gdk-pixbuf loader to Rust seems like writing a little
5361 glue code for the loadable module; the actual loading is just a couple
5362 of calls to librsvg's API.</p>
5363 <h3>Rsvg-rs in rsvg?</h3>
5364 <p>Converting the tests to Rust... ideally this would use the <a href="https://github.com/selaux/rsvg-rs">rsvg-rs</a>
5365 bindings; for example, it is what I already use for <a href="https://gitlab.gnome.org/federico/rsvg-bench">rsvg-bench</a>, a
5366 benchmarking program for librsvg.</p>
5367 <p>I have an <a href="https://gitlab.gnome.org/federico/librsvg/commits/import-rsvg-rs">unfinished branch to merge the rsvg-rs repository</a>
5368 into librsvg's own repository. This is because...</p>
5369 <ol>
5370 <li>Librsvg builds its library, <code>librsvg.so</code></li>
5371 <li>Gobject-introspection runs on <code>librsvg.so</code> and the source code, and
5372 produces <code>librsvg.gir</code></li>
5373 <li>Rsvg-rs's build system calls <a href="https://github.com/gtk-rs/gir/">gir</a> on <code>librsvg.gir</code> to generate the
5374 Rust binding's code.</li>
5375 </ol>
5376 <p>As you can imagine, doing all of this with Autotools is... rather
5377 convoluted. It gives me a lot of anxiety to think that there is also
5378 an <a href="https://gitlab.gnome.org/GNOME/librsvg/commits/wip/meson">unfinished branch to port the build system to Meson</a>, where
5379 <em>probably</em> doing the .so→.gir→rs chain would be easier, but who
5380 knows. Help in this area is <strong>much</strong> appreciated!</p>
5381 <h3>An alternative?</h3>
5382 <p>Rustified tests could, of course, call the C API of librsvg by hand,
5383 in <code>unsafe</code> code. This may not be idiomatic, but sounds like it could
5384 be done relatively quickly.</p>
5385 <h2>Future work</h2>
5386 <p>There are two options to get rid of all the C code in the library, and
5387 just leave C header files for public consumption:</p>
5388 <ol>
5389 <li>
5390 <p>Do the GObject implementation in Rust, using Sebastian Dröge's work
5391 from GStreamer to do this easily.</p>
5392 </li>
5393 <li>
5394 <p>Work on making <a href="https://gitlab.gnome.org/federico/gnome-class">gnome-class</a> powerful enough to implement the librsvg
5395 API directly, and in an ABI-compatible fashion to what there is
5396 right now.</p>
5397 </li>
5398 </ol>
5399 <p>The second case will probably build upon the first one, since one of
5400 my plans for gnome-class is to make it generate code that uses
5401 Sebastian's, instead of generating all the GObject boilerplate by
5402 hand.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category></entry><entry><title>In support of Coraline Ada Ehmke</title><link href="https://people.gnome.org/~federico/blog/in-support-of-coraline.html" rel="alternate"></link><published>2018-12-07T14:00:06-06:00</published><updated>2018-12-07T14:00:06-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-12-07:/~federico/blog/in-support-of-coraline.html</id><summary type="html"><p>Last night, the linux.org DNS was hijacked and redirected to a page
5403 that doxed her. Coraline is doing extremely valuable work with the
5404 <a href="https://www.contributor-covenant.org/">Contributor Covenant</a> code of conduct, which many free software
5405 projects have <a href="https://www.contributor-covenant.org/adopters">adopted</a> already.</p>
5406 <p>Coraline has been working for years to make free software, and
5407 computer technology …</p></summary><content type="html"><p>Last night, the linux.org DNS was hijacked and redirected to a page
5408 that doxed her. Coraline is doing extremely valuable work with the
5409 <a href="https://www.contributor-covenant.org/">Contributor Covenant</a> code of conduct, which many free software
5410 projects have <a href="https://www.contributor-covenant.org/adopters">adopted</a> already.</p>
5411 <p>Coraline has been working for years to make free software, and
5412 computer technology circles in general, a welcoming place for
5413 underrepresented groups.</p>
5414 <p>I hope Coraline stays safe and strong. You can <a href="https://www.patreon.com/coraline">support her directly
5415 on Patreon</a>.</p></content><category term="misc"></category><category term="code-of-conduct"></category></entry><entry><title>My GUADEC 2018 presentation</title><link href="https://people.gnome.org/~federico/blog/guadec-2018-presentation.html" rel="alternate"></link><published>2018-12-04T18:57:00-06:00</published><updated>2018-12-06T15:09:43-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-12-04:/~federico/blog/guadec-2018-presentation.html</id><summary type="html"><p>I just realized that I forgot to publish my presentation from this
5416 year's GUADEC. Sorry, here it is!</p>
5417 <p><a href="https://people.gnome.org/~federico/blog/docs/fmq-refactoring-c-to-rust.pdf"><img alt="Patterns of refactoring C to Rust - link to PDF" src="https://people.gnome.org/~federico/blog/images/fmq-refactoring-c-to-rust.png"></a></p>
5418 <p>You can also get the <a href="https://people.gnome.org/~federico/blog/docs/fmq-refactoring-c-to-rust.odp">ODP file</a> for the presentation. This is
5419 released under a <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC-BY-SA license</a>.</p>
5420 <p>This is the <a href="http://videos.guadec.org/2018/GUADEC%202018%20-%20Federico%20Mena%20Quintero%20-%20Patterns%20of%20refactoring%20C%20to%20Rust-5mVMycYmoWE.mp4">video of the presentation</a>.</p>
5421 <p><strong><em>Update Dec/06:</em></strong> Keen readers spotted an <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/391">incorrect …</a></p></summary><content type="html"><p>I just realized that I forgot to publish my presentation from this
5422 year's GUADEC. Sorry, here it is!</p>
5423 <p><a href="https://people.gnome.org/~federico/blog/docs/fmq-refactoring-c-to-rust.pdf"><img alt="Patterns of refactoring C to Rust - link to PDF" src="https://people.gnome.org/~federico/blog/images/fmq-refactoring-c-to-rust.png"></a></p>
5424 <p>You can also get the <a href="https://people.gnome.org/~federico/blog/docs/fmq-refactoring-c-to-rust.odp">ODP file</a> for the presentation. This is
5425 released under a <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC-BY-SA license</a>.</p>
5426 <p>This is the <a href="http://videos.guadec.org/2018/GUADEC%202018%20-%20Federico%20Mena%20Quintero%20-%20Patterns%20of%20refactoring%20C%20to%20Rust-5mVMycYmoWE.mp4">video of the presentation</a>.</p>
5427 <p><strong><em>Update Dec/06:</em></strong> Keen readers spotted an <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/391">incorrect use of opaque
5428 pointers</a>; I've updated the example code in the presentation to
5429 match <a href="https://gitlab.gnome.org/GNOME/librsvg/merge_requests/161">Jordan's fix with the recommended usage</a>. That merge
5430 request has an interesting conversation on FFI esoterica, too.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category><category term="talks"></category></entry><entry><title>Refactoring allowed URLs in librsvg</title><link href="https://people.gnome.org/~federico/blog/refactoring-allowed-urls-in-librsvg.html" rel="alternate"></link><published>2018-11-29T11:31:37-06:00</published><updated>2018-11-29T11:31:37-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-11-29:/~federico/blog/refactoring-allowed-urls-in-librsvg.html</id><summary type="html"><p>While in the middle of converting librsvg's code that processes XML from C
5431 to Rust, I went into a digression that has to do with the way librsvg
5432 decides which files are allowed to be referenced from within an SVG.</p>
5433 <h1>Resource references in SVG</h1>
5434 <p>SVG files can reference other files …</p></summary><content type="html"><p>While in the middle of converting librsvg's code that processes XML from C
5435 to Rust, I went into a digression that has to do with the way librsvg
5436 decides which files are allowed to be referenced from within an SVG.</p>
5437 <h1>Resource references in SVG</h1>
5438 <p>SVG files can reference other files, i.e. they are not
5439 self-contained. For example, there can be an element like <code>&lt;image
5440 xlink:href="foo.png"&gt;</code>, or one can request that a sub-element of
5441 another SVG be included with <code>&lt;use xlink:href="secondary.svg#foo"&gt;</code>.
5442 Finally, there is the <code>xi:include</code> mechanism to include chunks of text
5443 or XML into another XML file.</p>
5444 <p>Since librsvg is sometimes used to render untrusted files that come from
5445 the internet, it needs to be careful not to allow those files to
5446 reference any random resource on the filesystem. We don't want
5447 something like
5448 <code>&lt;text&gt;&lt;xi:include href="/etc/passwd" parse="text"/&gt;&lt;/text&gt;</code>
5449 or something equally nefarious that would exfiltrate a random file
5450 into the rendered output.</p>
5451 <p>Also, we want to catch malicious SVGs that want to "phone home" by
5452 referencing a network resource like
5453 <code>&lt;image xlink:href="http://evil.com/pingback.jpg"&gt;</code>.</p>
5454 <p>So, librsvg is careful to have a single place where it can load
5455 secondary resources, and first it validates the resource's URL to see
5456 if it is allowed.</p>
5457 <p>The actual validation rules are not very important for this
5458 discussion; they are something like "no absolute URLs allowed" (so you
5459 can't request <code>/etc/passwd</code>), "only siblings or (grand)children of
5460 siblings allowed" (so <code>foo.svg</code> can request <code>bar.svg</code> and
5461 <code>subdir/bar.svg</code>, but not <code>../../bar.svg</code>).</p>
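<p>As a rough illustration of rules of that shape, here is a minimal sketch. It is not librsvg's actual validation code: the function name <code>is_sibling_or_descendant</code> and the crude <code>://</code> scheme check are assumptions made for this example, and it only looks at the href as a relative file path.</p>

```rust
use std::path::{Component, Path};

/// Hypothetical check (not librsvg's real one): accept only relative
/// paths that stay at or below the directory of the referencing SVG.
fn is_sibling_or_descendant(href: &str) -> bool {
    // An empty href names nothing; reject it explicitly, because an
    // empty path has no components and would pass the check below.
    if href.is_empty() {
        return false;
    }

    // Crude stand-in for real URL parsing: refuse anything that looks
    // like it carries a scheme, e.g. "http://...".
    if href.contains("://") {
        return false;
    }

    // Every component must be a plain name or "."; this rejects a
    // leading "/" (root) as well as any ".." component.
    Path::new(href)
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```

<p>With this sketch, <code>bar.svg</code> and <code>subdir/bar.svg</code> are accepted, while <code>../../bar.svg</code>, <code>/etc/passwd</code> and <code>http://evil.com/pingback.jpg</code> are rejected.</p>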
5462 <h1>The code</h1>
5463 <p>There was a central function <code>rsvg_io_acquire_stream()</code> which took a
5464 URL as a string. The code assumed that that URL had been first
5465 validated with a function called <code>allow_load(url)</code>. While the code's
5466 structure guaranteed that all the places that may acquire a stream
5467 would actually go through <code>allow_load()</code> first, Rust's type
5468 system lets us go one step further and make it impossible to
5469 acquire a disallowed URL.</p>
5470 <p>Before:</p>
5471 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">allow_load</span><span class="p">(</span><span class="n">url</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">;</span><span class="w"></span>
5472
5473 <span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">acquire_stream</span><span class="p">(</span><span class="n">url</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">gio</span>::<span class="n">InputStream</span><span class="p">,</span><span class="w"> </span><span class="n">glib</span>::<span class="n">Error</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
5474
5475 <span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_acquire_stream</span><span class="p">(</span><span class="n">url</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">gio</span>::<span class="n">InputStream</span><span class="p">,</span><span class="w"> </span><span class="n">LoadingError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5476 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">allow_load</span><span class="p">(</span><span class="n">url</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5477 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">acquire_stream</span><span class="p">(</span><span class="n">url</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.)</span><span class="o">?</span><span class="p">)</span><span class="w"></span>
5478 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5479 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">LoadingError</span>::<span class="n">NotAllowed</span><span class="p">)</span><span class="w"></span>
5480 <span class="w"> </span><span class="p">}</span><span class="w"></span>
5481 <span class="p">}</span><span class="w"></span>
5482 </code></pre></div>
5483
5484 <p>The refactored code now has an <code>AllowedUrl</code> type that encapsulates a
5485 URL, plus the promise that it <strong>has</strong> gone through these steps:</p>
5486 <ul>
5487 <li>The URL has been run through a URL well-formedness parser.</li>
5488 <li>The resource is allowed to be loaded following librsvg's rules.</li>
5489 </ul>
5490 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">AllowedUrl</span><span class="p">(</span><span class="n">Url</span><span class="p">);</span><span class="w"> </span><span class="c1">// from the Url parsing crate</span>
5491
5492 <span class="k">impl</span><span class="w"> </span><span class="n">AllowedUrl</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5493 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_href</span><span class="p">(</span><span class="n">href</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">AllowedUrl</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5494 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">parsed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Url</span>::<span class="n">parse</span><span class="p">(</span><span class="n">href</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"> </span><span class="c1">// may return LoadingError::InvalidUrl</span>
5495
5496 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">allow_load</span><span class="p">(</span><span class="o">&amp;</span><span class="n">parsed</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5497 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">AllowedUrl</span><span class="p">(</span><span class="n">parsed</span><span class="p">))</span><span class="w"></span>
5498 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5499 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">LoadingError</span>::<span class="n">NotAllowed</span><span class="p">)</span><span class="w"></span>
5500 <span class="w"> </span><span class="p">}</span><span class="w"></span>
5501 <span class="w"> </span><span class="p">}</span><span class="w"></span>
5502 <span class="p">}</span><span class="w"></span>
5503
5504 <span class="c1">// new prototype</span>
5505 <span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">acquire_stream</span><span class="p">(</span><span class="n">url</span>: <span class="kp">&amp;</span><span class="nc">AllowedUrl</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">gio</span>::<span class="n">InputStream</span><span class="p">,</span><span class="w"> </span><span class="n">glib</span>::<span class="n">Error</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
5506 </code></pre></div>
5507
5508 <p>This forces callers to validate the URLs as soon as possible, right
5509 after they get them from the SVG file. Now it is not possible to
5510 request a stream unless the URL has been validated first.</p>
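<p>The guarantee comes from the wrapped field being private, so the validating constructor is the only way to obtain the type. The following self-contained sketch shows that idea in isolation. The module name <code>url_check</code>, the <code>acquire</code> function, the use of a <code>String</code> instead of a parsed <code>Url</code>, and the toy rule inside <code>from_href</code> are all assumptions for illustration, not librsvg's code.</p>

```rust
mod url_check {
    /// Hypothetical error type for this sketch.
    #[derive(Debug, PartialEq)]
    pub enum LoadingError {
        NotAllowed,
    }

    /// The field is private: code outside this module cannot build an
    /// `AllowedUrl` except by going through `from_href`.
    #[derive(Debug)]
    pub struct AllowedUrl(String);

    impl AllowedUrl {
        pub fn from_href(href: &str) -> Result<AllowedUrl, LoadingError> {
            // Stand-in rule for the sketch, not librsvg's real policy.
            if href.is_empty() || href.starts_with('/') || href.contains("..") {
                Err(LoadingError::NotAllowed)
            } else {
                Ok(AllowedUrl(href.to_string()))
            }
        }

        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}

use url_check::{AllowedUrl, LoadingError};

/// Stand-in for `acquire_stream`: the signature only accepts a URL
/// that has already been validated.
fn acquire(url: &AllowedUrl) -> String {
    format!("loading {}", url.as_str())
}
```

<p>Outside of <code>url_check</code>, writing <code>AllowedUrl("/etc/passwd".to_string())</code> does not compile, so every call to <code>acquire</code> has necessarily gone through <code>from_href</code> first.</p>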
5511 <h1>Plain URIs vs. fragment identifiers</h1>
5512 <p>Some of the elements in SVG that reference other data require full
5513 files:</p>
5514 <div class="highlight"><pre><span></span><code><span class="o">&lt;</span><span class="n">image</span> <span class="n">xlink</span><span class="o">:</span><span class="n">href</span><span class="o">=</span><span class="s">&quot;foo.png&quot;</span> <span class="p">...</span><span class="o">&gt;</span> <span class="o">&lt;!--</span> <span class="n">no</span> <span class="n">fragments</span> <span class="n">allowed</span> <span class="o">--&gt;</span>
5515 </code></pre></div>
5516
5517 <p>And some others, that reference particular elements in secondary SVGs,
5518 require a fragment ID:</p>
5519 <div class="highlight"><pre><span></span><code><span class="o">&lt;</span><span class="n">use</span> <span class="n">xlink</span><span class="o">:</span><span class="n">href</span><span class="o">=</span><span class="s">&quot;icons.svg#app_name&quot;</span> <span class="p">...</span><span class="o">&gt;</span> <span class="o">&lt;!--</span> <span class="n">fragment</span> <span class="n">id</span> <span class="n">required</span> <span class="o">--&gt;</span>
5520 </code></pre></div>
5521
5522 <p>And finally, the <code>feImage</code> element, used to paste an image as part of
5523 a filter effects pipeline, allows either:</p>
5524 <div class="highlight"><pre><span></span><code><span class="o">&lt;!--</span> <span class="n">will</span> <span class="n">use</span> <span class="n">that</span> <span class="n">image</span> <span class="o">--&gt;</span>
5525 <span class="o">&lt;</span><span class="n">feImage</span> <span class="n">xlink</span><span class="o">:</span><span class="n">href</span><span class="o">=</span><span class="s">&quot;foo.png&quot;</span> <span class="p">...</span><span class="o">&gt;</span>
5526
5527 <span class="o">&lt;!--</span> <span class="n">will</span> <span class="n">render</span> <span class="n">just</span> <span class="n">this</span> <span class="n">element</span> <span class="n">from</span> <span class="n">an</span> <span class="n">SVG</span> <span class="kr">and</span> <span class="n">use</span> <span class="n">it</span> <span class="kr">as</span> <span class="n">an</span> <span class="n">image</span> <span class="o">--&gt;</span>
5528 <span class="o">&lt;</span><span class="n">feImage</span> <span class="n">xlink</span><span class="o">:</span><span class="n">href</span><span class="o">=</span><span class="s">&quot;foo.svg#element&quot;</span><span class="o">&gt;</span>
5529 </code></pre></div>
5530
5531 <p>So, I introduced a general <code>Href</code> parser:</p>
5532 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Href</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5533 <span class="w"> </span><span class="n">PlainUri</span><span class="p">(</span><span class="nb">String</span><span class="p">),</span><span class="w"></span>
5534 <span class="w"> </span><span class="n">WithFragment</span><span class="p">(</span><span class="n">Fragment</span><span class="p">),</span><span class="w"></span>
5535 <span class="p">}</span><span class="w"></span>
5536
5537 <span class="sd">/// Optional URI, mandatory fragment id</span>
5538 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Fragment</span><span class="p">(</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="nb">String</span><span class="p">);</span><span class="w"></span>
5539 </code></pre></div>
5540
5541 <p>The parts of the code that absolutely require a fragment id now take a
5542 <code>Fragment</code>. Parts which require a <code>PlainUri</code> can unwrap that case.</p>
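<p>As a rough sketch of how such a parser could work (this is not
librsvg's actual code; the <code>HrefError</code> type and the
split-at-the-last-<code>#</code> rule are assumptions made for illustration),
one can split the string at the fragment separator and pick the
variant from what is left:</p>

```rust
/// Optional URI, mandatory fragment id (same shape as the struct above).
#[derive(Debug, PartialEq)]
pub struct Fragment(Option<String>, String);

#[derive(Debug, PartialEq)]
pub enum Href {
    PlainUri(String),
    WithFragment(Fragment),
}

/// Hypothetical error type, invented for this example.
#[derive(Debug, PartialEq)]
pub enum HrefError {
    Empty,
    EmptyFragment,
}

impl Href {
    /// Illustrative parser: everything after the last '#' is the fragment id.
    pub fn parse(href: &str) -> Result<Href, HrefError> {
        match href.rfind('#') {
            None if href.is_empty() => Err(HrefError::Empty),
            None => Ok(Href::PlainUri(href.to_string())),
            Some(pos) => {
                // '#' is a single byte, so both slices fall on char boundaries.
                let (uri, fragment) = (&href[..pos], &href[pos + 1..]);
                if fragment.is_empty() {
                    return Err(HrefError::EmptyFragment);
                }
                let uri = if uri.is_empty() {
                    None
                } else {
                    Some(uri.to_string())
                };
                Ok(Href::WithFragment(Fragment(uri, fragment.to_string())))
            }
        }
    }
}
```

<p>With this shape, code that needs a fragment can match on
<code>Href::WithFragment</code> and reject everything else in one place.</p>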
5543 <p>The next step is making those structs contain an <code>AllowedUrl</code>
5544 directly, instead of just strings, so that for callers, obtaining a
5545 fully validated name is a one-step operation.</p>
5546 <p>In general, the code is moving towards a scheme where all file I/O is
5547 done at loading time. Right now, some of those external references
5548 get resolved at rendering time, which is somewhat awkward (for
5549 example, at rendering time the caller has no chance to use a
5550 <code>GCancellable</code> to cancel loading). This refactoring to do early
5551 validation is leaving the code in a very nice state.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category></entry><entry><title>Thessaloniki GNOME+Rust Hackfest 2018</title><link href="https://people.gnome.org/~federico/blog/thessaloniki-gnome-rust-2018.html" rel="alternate"></link><published>2018-11-27T17:37:31-06:00</published><updated>2018-11-27T17:37:31-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-11-27:/~federico/blog/thessaloniki-gnome-rust-2018.html</id><summary type="html"><p>A couple of weeks ago we had the <a href="https://wiki.gnome.org/Hackfests/Rust2018-2">fourth GNOME+Rust hackfest</a>, this time
5552 in Thessaloniki, Greece. This is the beautiful city that will host
5553 next year's GUADEC, but fortunately GUADEC will be in summertime!</p>
5554 <p>We held the hackfest at the <a href="http://coho.gr/">CoHo</a> coworking space, a small, cozy
5555 office between the …</p></summary><content type="html"><p>A couple of weeks ago we had the <a href="https://wiki.gnome.org/Hackfests/Rust2018-2">fourth GNOME+Rust hackfest</a>, this time
5556 in Thessaloniki, Greece. This is the beautiful city that will host
5557 next year's GUADEC, but fortunately GUADEC will be in summertime!</p>
5558 <p>We held the hackfest at the <a href="http://coho.gr/">CoHo</a> coworking space, a small, cozy
5559 office between the University and the sea.</p>
5560 <p>At every such hackfest I am overwhelmed by the kind hackers who work on
5561 gnome-class, the code generator for GObject implementations in
5562 Rust.</p>
5563 <p>Mredlek has been working on generalizing the code generators in
5564 gnome-class, so that we can have the following from the same run:</p>
5565 <ul>
5566 <li>
5567 <p>Rust code generation, for the GObject implementations themselves.
5568 Thanks to mredlek, this is much cleaner than it was before; now both
5569 classes and interfaces share the same code for most of the
5570 boilerplate.</p>
5571 </li>
5572 <li>
5573 <p>GObject Introspection (<code>.gir</code>) generation, so that language bindings
5574 can be generated automatically.</p>
5575 </li>
5576 <li>
5577 <p>C header files (<code>.h</code>), so the generated GObjects can be called from
5578 C code as usual.</p>
5579 </li>
5580 </ul>
5581 <p>So far, Rust and GIR work; C header files are not generated yet.</p>
5582 <p>Mredlek is a new contributor to gnome-class, but unfortunately was not
5583 able to attend the hackfest. Not only did he rewrite the gnome-class
5584 parser using the new version of <a href="https://docs.rs/syn/0.15.22/syn/">syn</a>; he also added support for
5585 passing owned types to GObject methods, such as <code>String</code> and
5586 <code>Variant</code>. But the biggest thing is probably that mredlek made it a
5587 lot easier to debug the generated Rust source; see <a href="https://federico.pages.gitlab.gnome.org/gnome-class/doc/gobject_gen/#debugging-aids-and-examining-generated-code">the documentation
5588 on debugging</a> for details.</p>
5589 <p>Speaking of which, thanks to Jordan Petridis for getting the
5590 documentation published automatically from GitLab's Continuous
5591 Integration pipelines.</p>
5592 <p>Alex Crichton kindly refactored our error propagation code, and <a href="https://federico.pages.gitlab.gnome.org/gnome-class/book/errors.html">even
5593 wrote docs on it</a>! Together with Jordan, he updated the
5594 code for the Rust 2018 edition, and generally wrangled the build
5595 process to conform with the latest Rust nightlies. Alex also made
5596 code generation a lot faster, by offloading auto-indentation to an
5597 external <code>rustfmt</code> process, instead of using it as a crate: using the
5598 <code>rustfmt</code> crate meant that the compiler had a lot more work to do.
5599 During the whole hackfest, Alex was very helpful with Rust questions
5600 in general. While my strategy to see what the compiler does is to
5601 examine the disassembly in gdb, his strategy seems to be to look at
5602 the LLVM intermediate representation instead... OMG.</p>
5603 <h1>And we can derive very simple GtkWidgets now!</h1>
5604 <p>Saving the best for last... Antoni Boucher, the author of <a href="http://relm.ml/">relm</a>, has
5605 been working on making it possible to derive from <code>gtk::Widget</code>. Once
5606 <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/40">this merge request</a> is done, we'll have an example of
5607 deriving from <code>gtk::DrawingArea</code> from Rust with very little code.</p>
5608 <p>Normally, the <a href="https://gtk-rs.org/">gtk-rs</a> bindings work as a statically-generated binding
5609 for GObject, which really is a type hierarchy defined at runtime. The
5610 static binding really wants to know what is a subclass of what: it
5611 needs to know in advance that <code>Button</code>'s hierarchy is <code>Button → Bin →
5612 Container → Widget → Object</code>, plus all the <code>GTypeInterface</code>s supported
5613 by any of those classes. Antoni has been working on making
5614 gnome-class extract that information automatically from GIR files, so
5615 that the gtk-rs macros that define new types will get all the
5616 necessary information.</p>
5617 <h1>Future work</h1>
5618 <p>There are still <a href="https://gitlab.gnome.org/GNOME/gtk/merge_requests/415">bugs</a> in the GIR pipeline that prevent us
5619 from deriving, say, from <code>gtk::Container</code>, but hopefully these will be
5620 resolved soon.</p>
5621 <p>Sebastian Dröge has been refactoring his Rust tools to create GObject
5622 subclasses with very idiomatic and refined Rust code. This is now at
5623 a state where gnome-class itself could generate that sort of code,
5624 instead of generating all the boilerplate from scratch. So, we'll
5625 start doing that, and integrating the necessary bits into gtk-rs as
5626 well.</p>
5627 <p>Finally, during the last day I took a little break from gnome-class to
5628 work on librsvg. Julian Sparber has been updating the code to use new
5629 bindings in cairo-rs, and is also adding a <a href="https://gitlab.gnome.org/GNOME/librsvg/merge_requests/152">new API</a> to
5630 fetch an SVG element's geometry precisely.</p>
5631 <h1>Thessaloniki</h1>
5632 <p>Oh, boy, I wish the weather had been warmer. The city looks
5633 delightful to walk around, especially in the narrow streets on the
5634 hills. Can't wait to see it in summer during GUADEC.</p>
5635 <h1>Thanks</h1>
5636 <p>Finally, thanks to <a href="http://coho.gr/">CoHo</a> for hosting the hackfest, and to the GNOME
5637 Foundation for sponsoring my travel and accommodation. And to
5638 <a href="https://www.centricular.com/">Centricular</a> for taking us all to dinner!</p>
5639 <p>Special thanks to Jordan Petridis for being on top of everything
5640 build-wise all the time.</p>
5641 <p><img alt="Sponsored by the GNOME Foundation" src="https://people.gnome.org/~federico/blog/images/sponsored-by-foundation.png"></p></content><category term="misc"></category><category term="gnome"></category><category term="rust"></category></entry><entry><title>Propagating Errors</title><link href="https://people.gnome.org/~federico/blog/propagating-errors.html" rel="alternate"></link><published>2018-11-21T13:58:12-06:00</published><updated>2018-11-21T13:58:12-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-11-21:/~federico/blog/propagating-errors.html</id><summary type="html"><p>Lately, I have been converting the code in librsvg that handles XML
5642 from C to Rust. For many technical reasons, the library still uses
5643 libxml2, GNOME's historic XML parsing library, but some of the
5644 callbacks to handle XML events like <code>start_element</code>, <code>end_element</code>,
5645 <code>characters</code>, are now implemented in Rust. This has …</p></summary><content type="html"><p>Lately, I have been converting the code in librsvg that handles XML
5646 from C to Rust. For many technical reasons, the library still uses
5647 libxml2, GNOME's historic XML parsing library, but some of the
5648 callbacks to handle XML events like <code>start_element</code>, <code>end_element</code>,
5649 <code>characters</code>, are now implemented in Rust. This has meant that I'm
5650 running into all the cases where the original C code in librsvg failed
5651 to handle errors properly; Rust really makes it obvious when that
5652 happens.</p>
5653 <p>In this post I want to talk a bit about propagating errors. You call
5654 a function, it returns an error, and then what?</p>
5655 <h2>What can fail?</h2>
5656 <p>It turns out that this question is highly context-dependent. Let's
5657 say a program is starting up and tries to read a configuration file.
5658 What could go wrong?</p>
5659 <ul>
5660 <li>
5661 <p>The file doesn't exist. Maybe it is the very first time the program
5662 is run, and so there <em>isn't</em> a configuration file at all? Can the
5663 program provide a default configuration in this case? Or does it
5664 absolutely need a pre-written configuration file to be somewhere?</p>
5665 </li>
5666 <li>
5667 <p>The file can't be parsed. Should the program warn the user and
5668 exit, or should it revert to a default configuration (should it
5669 overwrite the file with valid, default values)? <em>Can</em>
5670 the program warn the user, or is it a user-less program that at best
5671 can just shout into the void of a server-side log file?</p>
5672 </li>
5673 <li>
5674 <p>The file can be parsed, but the values are invalid. Same questions
5675 as the case above.</p>
5676 </li>
5677 <li>
5678 <p>Etcetera.</p>
5679 </li>
5680 </ul>
5681 <p>At each stage, the code will probably see very low-level errors ("file
5682 not found", "I/O error", "parsing failed", "value is out of range").
5683 What the code decides to do, or what it is able to do at any
5684 particular stage, depends both on the semantics you want from the
5685 program and on the code structure itself.</p>
5686 <h2>Structuring the problem</h2>
5687 <p>This is an easy, but very coarse way of handling things:</p>
5688 <div class="highlight"><pre><span></span><code><span class="n">gboolean</span>
5689 <span class="nf">read_configuration</span> <span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">config_file_name</span><span class="p">)</span>
5690 <span class="p">{</span>
5691 <span class="cm">/* open the file */</span>
5692
5693 <span class="cm">/* parse it */</span>
5694
5695 <span class="cm">/* set global variables to the configuration values */</span>
5696
5697 <span class="cm">/* return true if success, or false if failure */</span>
5698 <span class="p">}</span>
5699 </code></pre></div>
5700
5701 <p>What is bad about this? Let's see:</p>
5702 <ul>
5703 <li>
5704 <p>The calling code just gets a success/failure condition. In the case
5705 of failure, it doesn't get to know why things failed.</p>
5706 </li>
5707 <li>
5708 <p>If the function sets global variables with configuration values as
5709 they get read... and something goes wrong and the function returns
5710 an error... the caller ends up possibly in an inconsistent state,
5711 with a set of configuration variables that are only halfway-set.</p>
5712 </li>
5713 <li>
5714 <p>If the function finds parse errors, well, do you really want to call
5715 UI code from inside it? The caller might be a better place to make
5716 that decision.</p>
5717 </li>
5718 </ul>
5719 <h2>A slightly better structure</h2>
5720 <p>Let's add an enumeration to indicate the possible errors, and a
5721 structure of configuration values.</p>
5722 <div class="highlight"><pre><span></span><code><span class="k">enum</span> <span class="nc">ConfigError</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5723 <span class="w"> </span><span class="n">ConfigFileDoesntExist</span><span class="p">,</span><span class="w"></span>
5724 <span class="w"> </span><span class="n">ParseError</span><span class="p">,</span><span class="w"> </span><span class="c1">// config file has bad syntax or something</span>
5725 <span class="w"> </span><span class="n">ValueError</span><span class="p">,</span><span class="w"> </span><span class="c1">// config file has an invalid value</span>
5726 <span class="p">}</span><span class="w"></span>
5727
5728 <span class="k">struct</span> <span class="nc">ConfigValues</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5729 <span class="w"> </span><span class="c1">// a bunch of fields here with the program&#39;s configuration</span>
5730 <span class="p">}</span><span class="w"></span>
5731
5732 <span class="k">fn</span> <span class="nf">read_configuration</span><span class="p">(</span><span class="n">filename</span>: <span class="kp">&amp;</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">ConfigValues</span><span class="p">,</span><span class="w"> </span><span class="n">ConfigError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5733 <span class="w"> </span><span class="c1">// open the file, or return Err(ConfigError::ConfigFileDoesntExist)</span>
5734
5735 <span class="w"> </span><span class="c1">// parse the file; or return Err(ConfigError::ParseError)</span>
5736
5737 <span class="w"> </span><span class="c1">// validate the values, or return Err(ConfigError::ValueError)</span>
5738
5739 <span class="w"> </span><span class="c1">// if everything succeeds, return Ok(ConfigValues)</span>
5740 <span class="p">}</span><span class="w"></span>
5741 </code></pre></div>
5742
5743 <p>This is better, in that the caller decides what to do with the
5744 validated <code>ConfigValues</code>: maybe it can just copy them to the
5745 program's global variables for configuration.</p>
5746 <p>However, this scheme doesn't give the caller all the information it
5747 would like to present a really good error message. For example, the
5748 caller will get to know if there is a parse error, but it doesn't know
5749 specifically what failed during parsing. Similarly, it will just get
5750 to know if there was an invalid value, but not which one.</p>
5751 <h2>Ah, so the problem is fractal</h2>
5752 <p>We could have new structs to represent the little errors, and then
5753 make them part of the original error enum:</p>
5754 <div class="highlight"><pre><span></span><code><span class="k">struct</span> <span class="nc">ParseError</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5755 <span class="w"> </span><span class="n">line</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"></span>
5756 <span class="w"> </span><span class="n">column</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"></span>
5757 <span class="w"> </span><span class="n">error_reason</span>: <span class="nb">String</span><span class="p">,</span><span class="w"></span>
5758 <span class="p">}</span><span class="w"></span>
5759
5760 <span class="k">struct</span> <span class="nc">ValueError</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5761 <span class="w"> </span><span class="n">config_key</span>: <span class="nb">String</span><span class="p">,</span><span class="w"></span>
5762 <span class="w"> </span><span class="n">error_reason</span>: <span class="nb">String</span><span class="p">,</span><span class="w"></span>
5763 <span class="p">}</span><span class="w"></span>
5764
5765 <span class="k">enum</span> <span class="nc">ConfigError</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5766 <span class="w"> </span><span class="n">ConfigFileDoesntExist</span><span class="p">,</span><span class="w"></span>
5767 <span class="w"> </span><span class="n">ParseError</span><span class="p">(</span><span class="n">ParseError</span><span class="p">),</span><span class="w"> </span><span class="c1">// we put those structs in here</span>
5768 <span class="w"> </span><span class="n">ValueError</span><span class="p">(</span><span class="n">ValueError</span><span class="p">),</span><span class="w"></span>
5769 <span class="p">}</span><span class="w"></span>
5770 </code></pre></div>
5771
5772 <p>Is that enough? It depends.</p>
5773 <p>The <code>ParseError</code> and <code>ValueError</code> structs have individual
5774 <code>error_reason</code> fields, which are strings. Presumably, one could have
5775 a <code>ParseError</code> with <code>error_reason = "unexpected token"</code>, or a
5776 <code>ValueError</code> with <code>error_reason = "cannot be a negative number"</code>.</p>
5777 <p>One problem with this is that if the low-level errors come with error
5778 messages in English, then the caller has to know how to localize them
5779 to the user's language. Also, if they don't have a machine-readable
5780 error code, then the calling code may not have enough information to
5781 decide what to do with the error.</p>
5782 <p>Let's say we had a <code>ParseErrorKind</code> enum with variants like
5783 <code>UnexpectedToken</code>, <code>EndOfFile</code>, etc. This is fine; it lets the
5784 calling code know the <em>reason</em> for the error. Also, there can be a
5785 <code>gimme_localized_error_message()</code> method for that particular type of
5786 error.</p>
5787 <div class="highlight"><pre><span></span><code><span class="k">enum</span> <span class="nc">ParseErrorKind</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5788 <span class="w"> </span><span class="n">UnexpectedToken</span><span class="p">,</span><span class="w"></span>
5789 <span class="w"> </span><span class="n">EndOfFile</span><span class="p">,</span><span class="w"></span>
5790 <span class="w"> </span><span class="n">MissingComma</span><span class="p">,</span><span class="w"></span>
5791 <span class="w"> </span><span class="c1">// ... etc.</span>
5792 <span class="p">}</span><span class="w"></span>
5793
5794 <span class="k">struct</span> <span class="nc">ParseError</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5795 <span class="w"> </span><span class="n">line</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"></span>
5796 <span class="w"> </span><span class="n">column</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"></span>
5797 <span class="w"> </span><span class="n">kind</span>: <span class="nc">ParseErrorKind</span><span class="p">,</span><span class="w"></span>
5798 <span class="p">}</span><span class="w"></span>
5799 </code></pre></div>
5800
5801 <p>How can we expand this? Maybe the <code>ParseErrorKind::UnexpectedToken</code>
5802 variant wants to contain data that indicates <em>which</em> token it got that
5803 was wrong, so it would be <code>UnexpectedToken(String)</code> or something
5804 similar.</p>
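<p>A minimal sketch of that expansion, assuming we settle on
<code>UnexpectedToken(String)</code>. The <code>Display</code> implementation here
stands in for the message method mentioned earlier; it produces English
only, and real localization is left out:</p>

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum ParseErrorKind {
    UnexpectedToken(String), // carries the offending token
    EndOfFile,
    MissingComma,
}

#[derive(Debug, PartialEq)]
struct ParseError {
    line: usize,
    column: usize,
    kind: ParseErrorKind,
}

// Human-readable message; the caller can still match on `kind`
// when it needs a machine-readable reason.
impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}: ", self.line, self.column)?;
        match &self.kind {
            ParseErrorKind::UnexpectedToken(tok) => write!(f, "unexpected token '{}'", tok),
            ParseErrorKind::EndOfFile => write!(f, "unexpected end of file"),
            ParseErrorKind::MissingComma => write!(f, "missing comma"),
        }
    }
}
```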
5805 <p>But is <em>that</em> useful to the calling code? For our example program,
5806 which is reading a configuration file... it probably only needs to
5807 know if it could parse the file, but maybe it doesn't really need any
5808 additional details on the reason for the parse error, other than
5809 having something useful to present to the user. Whether it is
5810 appropriate to burden the user with the actual details... does the app
5811 expect to make it the user's job to fix broken configuration files?
5812 Yes for a web server, where the user is a sysadmin; probably not for a
5813 random end-user graphical app, where people shouldn't need to write
5814 configuration files by hand in the first place (should <em>those</em> have a
5815 "Details" section in the error message window? I don't know!).</p>
5816 <p>Maybe the low-level parsing/validation code <em>can</em> emit those detailed
5817 errors. But how can we propagate them to something more useful to the
5818 upper layers of the code?</p>
5819 <h2>Translation and propagation</h2>
5820 <p>Maybe our original <code>read_configuration()</code> function can translate the
5821 low-level errors into high-level ones:</p>
5822 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">read_configuration</span><span class="p">(</span><span class="n">filename</span>: <span class="kp">&amp;</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">ConfigValues</span><span class="p">,</span><span class="w"> </span><span class="n">ConfigError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5823 <span class="w"> </span><span class="c1">// open file</span>
5824
5825 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">cannot_open_file</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
5826 <span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">ConfigError</span>::<span class="n">ConfigFileDoesntExist</span><span class="p">);</span><span class="w"></span>
5827 <span class="w"> </span><span class="p">}</span><span class="w"></span>
5828
5829 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">contents</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">read_the_file</span><span class="p">().</span><span class="n">map_err</span><span class="p">(</span><span class="o">|</span><span class="n">e</span><span class="o">|</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="n">oops</span><span class="p">,</span><span class="w"> </span><span class="n">maybe</span><span class="w"> </span><span class="n">we</span><span class="w"> </span><span class="n">need</span><span class="w"> </span><span class="n">an</span><span class="w"> </span><span class="n">IoError</span><span class="w"> </span><span class="n">case</span><span class="p">,</span><span class="w"> </span><span class="n">too</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
5830
5831 <span class="w"> </span><span class="c1">// parse file</span>
5832
5833 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">parsed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse</span><span class="p">(</span><span class="n">contents</span><span class="p">).</span><span class="n">map_err</span><span class="p">(</span><span class="o">|</span><span class="n">e</span><span class="o">|</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="n">translate</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">higher</span><span class="o">-</span><span class="n">level</span><span class="w"> </span><span class="n">error</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
5834
5835 <span class="w"> </span><span class="c1">// validate</span>
5836
5837 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">validated</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">validate</span><span class="p">(</span><span class="n">parsed</span><span class="p">).</span><span class="n">map_err</span><span class="p">(</span><span class="o">|</span><span class="n">e</span><span class="o">|</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="n">translate</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">higher</span><span class="o">-</span><span class="n">level</span><span class="w"> </span><span class="n">error</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
5838
5839 <span class="w"> </span><span class="c1">// yay!</span>
5840 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">ConfigValues</span>::<span class="n">from</span><span class="p">(</span><span class="n">validated</span><span class="p">))</span><span class="w"></span>
5841 <span class="p">}</span><span class="w"></span>
5842 </code></pre></div>
5843
5844 <p>Etcetera. It is up to each part of the code to decide what to do with
5845 lower-level errors. Can it recover from them? Should it fail the
5846 whole operation and return a higher-level error? Should it warn the
5847 user right there?</p>
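<p>To make the pseudocode above concrete, here is one possible runnable
version. It is only a sketch under assumptions: the configuration
format is a toy one (a single <code>width=NUMBER</code> line), and the
<code>Io</code>, <code>Parse</code> and <code>Value</code> variants, as well as
<code>parse_and_validate()</code>, are names made up for the example. The
<code>From</code> implementation is what lets the <code>?</code> operator do the
translation from a low-level <code>io::Error</code>:</p>

```rust
use std::fs;
use std::io;
use std::path::Path;

#[derive(Debug)]
enum ConfigError {
    ConfigFileDoesntExist,
    Io(io::Error), // the extra I/O case the pseudocode hints at
    Parse { line: usize },
    Value { key: String },
}

#[derive(Debug, PartialEq)]
struct ConfigValues {
    width: u32,
}

// Translate low-level I/O errors into the high-level enum.
impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        match e.kind() {
            io::ErrorKind::NotFound => ConfigError::ConfigFileDoesntExist,
            _ => ConfigError::Io(e),
        }
    }
}

// Toy format, assumed for this example: a single "width=NUMBER" line.
fn parse_and_validate(contents: &str) -> Result<ConfigValues, ConfigError> {
    let (key, value) = contents
        .trim()
        .split_once('=')
        .ok_or(ConfigError::Parse { line: 1 })?;
    if key != "width" {
        return Err(ConfigError::Value { key: key.to_string() });
    }
    let width = value
        .parse::<u32>()
        .map_err(|_| ConfigError::Value { key: key.to_string() })?;
    Ok(ConfigValues { width })
}

fn read_configuration(filename: &Path) -> Result<ConfigValues, ConfigError> {
    // `?` converts io::Error into ConfigError through the From impl.
    let contents = fs::read_to_string(filename)?;
    parse_and_validate(&contents)
}
```

<p>The caller now gets either validated values or an error it can
match on, and none of the lower layers had to decide how to present
the failure.</p>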
5848 <h2>Language facilities</h2>
5849 <p>C makes it really easy to ignore errors, and pretty hard to present
5850 detailed errors like the above. One could mimic what Rust is actually
5851 doing with a collection of <code>union</code> and <code>struct</code> and <code>enum</code>, but this
5852 gets very awkward very fast.</p>
5853 <p>Rust provides these facilities at the language level, and the idioms
5854 around <code>Result</code> and error handling are very nice to use. There are
5855 even crates like <a href="https://boats.gitlab.io/failure/intro.html"><code>failure</code></a> that go a long way towards
5856 automating error translation, propagation, and conversion to strings
5857 for presenting to users.</p>
5858 <h2>Infinite details</h2>
5859 <p>I've been recommending <a href="http://joeduffyblog.com/2016/02/07/the-error-model/">The Error Model</a> to anyone who
5860 comes into a discussion of error handling in programming languages.
5861 It's a long, detailed, but very enlightening read on recoverable
5862 vs. unrecoverable errors, simple error codes vs. exceptions
5863 vs. monadic results, the performance/reliability/ease of use of each
5864 model... Definitely worth a read.</p></content><category term="misc"></category><category term="rust"></category></entry><entry><title>My gdk-pixbuf braindump</title><link href="https://people.gnome.org/~federico/blog/my-gdk-pixbuf-braindump.html" rel="alternate"></link><published>2018-09-05T21:35:41-05:00</published><updated>2018-09-06T07:49:38-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-09-05:/~federico/blog/my-gdk-pixbuf-braindump.html</id><summary type="html"><p>I want to write a braindump on the stuff that I remember from
5865 gdk-pixbuf's history. There is some talk about replacing it with
5866 something newer; hopefully this history will show some things that
5867 worked, some that didn't, and why.</p>
5868 <h2>The beginnings</h2>
5869 <p>Gdk-pixbuf started as a replacement for Imlib, the image …</p></summary><content type="html"><p>I want to write a braindump on the stuff that I remember from
5870 gdk-pixbuf's history. There is some talk about replacing it with
5871 something newer; hopefully this history will show some things that
5872 worked, some that didn't, and why.</p>
5873 <h2>The beginnings</h2>
5874 <p>Gdk-pixbuf started as a replacement for Imlib, the image loading and
5875 rendering library that GNOME used in its earliest versions. Imlib
5876 came from the Enlightenment project; it provided an easy API around
5877 the idiosyncratic libungif, libjpeg, libpng, etc., and it maintained
5878 decoded images in memory with a uniform representation. Imlib also
5879 worked as an image cache for the Enlightenment window manager, which
5880 made memory management very inconvenient for GNOME.</p>
5881 <p>Imlib worked well as a "just load me an image" library. It showed
5882 that a small, uniform API to load various image formats into a common
5883 representation was desirable. And in those days, hiding all the
5884 complexities of displaying images in X was very important indeed.</p>
5885 <h2>The initial API</h2>
5886 <p>Gdk-pixbuf replaced Imlib, and added two important features:
5887 reference counting for image data, and support for an alpha channel.</p>
5888 <p>Gdk-pixbuf appeared with support for RGB(A) images. And although in
5889 theory it was possible to grow the API to support other
5890 representations, <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-core.h#L132-144"><code>GdkColorspace</code></a> never acquired anything other than
5891 <code>GDK_COLORSPACE_RGB</code>, and the <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-core.h#L269-271"><code>bits_per_sample</code></a> argument to some
5892 functions <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf.c#L478">only ever supported being <code>8</code></a>. The presence or absence of an alpha
5893 channel was indicated with a <code>gboolean</code> argument in conjunction with that
5894 single <code>GDK_COLORSPACE_RGB</code> value; we didn't have something like
5895 <a href="https://gitlab.freedesktop.org/cairo/cairo/blob/201791a5/src/cairo.h#L385-424"><code>cairo_format_t</code></a> which actually specifies the pixel format in single
5896 enum values.</p>
5897 <p>While all the code in gdk-pixbuf carefully checks that those
5898 conditions are met — RGBA at 8 bits per channel —, some applications
5899 inadvertently assume that <em>that</em> is the only possible case, and would get
5900 into trouble really fast if gdk-pixbuf ever started returning pixbufs
5901 with different color spaces or depths.</p>
5902 <p>One can still see the battle between bilevel-alpha
5903 vs. continuous-alpha in <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-core.h#L108-130">this enum</a>:</p>
5904 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">enum</span>
5905 <span class="p">{</span>
5906 <span class="n">GDK_PIXBUF_ALPHA_BILEVEL</span><span class="p">,</span>
5907 <span class="n">GDK_PIXBUF_ALPHA_FULL</span>
5908 <span class="p">}</span> <span class="n">GdkPixbufAlphaMode</span><span class="p">;</span>
5909 </code></pre></div>
5910
5911 <p>Fortunately, only the "<a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/contrib/gdk-pixbuf-xlib/gdk-pixbuf-xlib.h#L62-70">render this pixbuf with alpha to an Xlib
5912 drawable</a>" functions take values of this type: before the Xrender
5913 days, it was a Big Deal to draw an image with alpha to an X window,
5914 and applications often opted to use a bitmask instead, even if the
5915 images had jagged edges as a result.</p>
5916 <h2>Pixel formats</h2>
5917 <p>The only pixel format that ever got implemented was unpremultiplied
5918 RGBA on all platforms. Back then I didn't understand <a href="https://keithp.com/~keithp/porterduff/p253-porter.pdf">premultiplied
5919 alpha</a>! Also, the GIMP followed that scheme, and copying
5920 it seemed like the easiest thing.</p>
5921 <p>After gdk-pixbuf, libart also copied that pixel format, I think.</p>
5922 <p>But later we got Cairo, Pixman, and all the Xrender stack. These
5923 prefer premultiplied ARGB. Moreover, Cairo prefers it if each pixel
5924 is actually a 32-bit value, with the ARGB values inside it in
5925 platform-endian order. So if you look at a memory dump, a Cairo pixel
5926 looks like BGRA on a little-endian box, while it looks like ARGB on a
5927 big-endian box.</p>
5928 <p>Every time we paint a <code>GdkPixbuf</code> to a <code>cairo_t</code>, there is a
5929 conversion from unpremultiplied RGBA to premultiplied, platform-endian
5930 ARGB. I talked a bit about this in <a href="https://people.gnome.org/~federico/blog/reducing-image-copies.html">Reducing the number of image
5931 copies in GNOME</a>.</p>
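<p>As a rough illustration of that conversion, here is a minimal C sketch that turns one unpremultiplied RGBA pixel (four bytes in memory) into a premultiplied, native-endian 32-bit ARGB value of the kind Cairo's <code>CAIRO_FORMAT_ARGB32</code> uses. The helper names <code>premul</code> and <code>rgba_to_premul_argb32</code> are made up for this example; they are not gdk-pixbuf or Cairo API, and the rounding shown is just one common choice.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply a color channel by alpha and round to nearest:
 * computes round(c * a / 255) using only integer arithmetic. */
uint8_t premul(uint8_t c, uint8_t a)
{
    unsigned t = (unsigned) c * a + 0x80;
    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Hypothetical helper: src points to 4 bytes in R, G, B, A order
 * (unpremultiplied).  Returns the premultiplied pixel packed as
 * 0xAARRGGBB in a native-endian 32-bit word. */
uint32_t rgba_to_premul_argb32(const uint8_t *src)
{
    assert(src != NULL);

    uint8_t a = src[3];
    uint8_t r = premul(src[0], a);
    uint8_t g = premul(src[1], a);
    uint8_t b = premul(src[2], a);

    return ((uint32_t) a << 24) | ((uint32_t) r << 16) |
           ((uint32_t) g << 8) | (uint32_t) b;
}
```

<p>Because the result is a native-endian word, storing it with a plain <code>uint32_t</code> write yields BGRA byte order on a little-endian machine and ARGB on a big-endian one, which matches the memory dumps described above.</p>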
5932 <h2>The loading API</h2>
5933 <p>The public loading API in gdk-pixbuf, and its relationship to loader
5934 plug-ins, evolved in interesting ways.</p>
5935 <p>At first the public API and loaders only implemented <code>load_from_file</code>:
5936 you gave the library a <code>FILE *</code> and it gave you back a <code>GdkPixbuf</code>.
5937 Back then we didn't have a robust MIME sniffing framework in the form
5938 of a library, so gdk-pixbuf got its own. This lives in the
5939 mostly-obsolete <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-io.h#L329-359"><code>GdkPixbufFormat</code></a> machinery; it
5940 even has its own <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-io.c#L125-175">little language</a> for sniffing file headers!
5941 Nowadays we do most MIME sniffing with GIO.</p>
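<p>To give a flavor of what header sniffing involves, here is a minimal C sketch that matches the first bytes of a buffer against a small table of well-known magic numbers. It is not gdk-pixbuf's pattern language, which is more elaborate; the <code>sniff_image_type</code> function and its table are invented for this example.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of the (hypothetical) sniffing table. */
struct magic {
    const char *mime_type;
    const unsigned char *prefix;
    size_t len;
};

static const unsigned char png_magic[]  = { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n' };
static const unsigned char jpeg_magic[] = { 0xff, 0xd8, 0xff };
static const unsigned char gif_magic[]  = { 'G', 'I', 'F', '8' };

static const struct magic table[] = {
    { "image/png",  png_magic,  sizeof png_magic },
    { "image/jpeg", jpeg_magic, sizeof jpeg_magic },
    { "image/gif",  gif_magic,  sizeof gif_magic },
};

/* Returns the MIME type whose magic prefix matches the start of buf,
 * or NULL if nothing in the table matches. */
const char *sniff_image_type(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (len >= table[i].len &&
            memcmp(buf, table[i].prefix, table[i].len) == 0)
            return table[i].mime_type;
    }
    return NULL;
}
```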
5942 <p>After the initial <code>load_from_file</code> API... I think we got progressive
5943 loading first, and animation support afterwards.</p>
5944 <h2>Progressive loading</h2>
5945 <p>This is where the calling program feeds chunks of bytes to the library,
5946 and at the end a fully-formed <code>GdkPixbuf</code> comes out, instead of having
5947 a single "read a whole file" operation.</p>
5948 <p>We conflated this with a way to get <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-loader.h#L72-77">updates on how the image area gets
5949 modified</a> as the data gets parsed. I think we wanted to support the
5950 case of a web browser, which downloads images slowly over the network,
5951 and gradually displays them as they are downloaded. In 1998, images
5952 downloading slowly over the network was a real concern!</p>
5953 <p>It took a lot of very careful work to convert the image loaders, which
5954 parsed a whole file at a time, into loaders that could maintain some
5955 state between each time that they got handed an extra bit of buffer.</p>
5956 <p>It also sounded easy to implement the progressive updating API by
5957 simply emitting a signal that said, "this rectangular area got updated
5958 from the last read". It could handle the case of reading whole
5959 scanlines, or a few pixels, or even area-based updates for progressive
5960 JPEGs and PNGs.</p>
5961 <p>The internal API for the image format loaders still keeps a
5962 distinction between the "load a whole file" API and the "load an image
5963 in chunks" API. Not all loaders got redone to use just the second
5964 one: <code>io-jpeg.c</code> <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/io-jpeg.c#L554-722">still implements loading whole files</a> by calling the
5965 corresponding libjpeg functions. I think it could remove that code
5966 and use the progressive loading functions instead.</p>
5967 <h2>Animations</h2>
5968 <p>We followed the GIF model for animations, in which each
5969 frame overlays the previous one, and there's a delay set between each
5970 frame. This is not a video file; it's a hacky flipbook.</p>
5971 <p>However, animations presented the problem that the whole gdk-pixbuf
5972 API was meant for static images, and now we needed to support
5973 multi-frame images as well.</p>
5974 <p>We defined the "correct" way to use the gdk-pixbuf library as always
5975 trying to load an animation first, and then checking whether it is a
5976 single-frame image, in which case you can just get a <code>GdkPixbuf</code> for
5977 the only frame and use it.</p>
5978 <p>Or, if you got an animation, that would be a <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-animation.h"><code>GdkPixbufAnimation</code></a>
5979 object, from which you could ask for an iterator to get each frame as
5980 a separate <code>GdkPixbuf</code>.</p>
5981 <p>However, the progressive updating API never got extended to really
5982 support animations. So, we have awkward functions like
5983 <code>gdk_pixbuf_animation_iter_on_currently_loading_frame()</code> instead.</p>
5984 <h2>Necessary accretion</h2>
5985 <p>Gdk-pixbuf got support for saving just a few formats: JPEG, PNG,
5986 TIFF, ICO, and some of the formats that are implemented with the
5987 Windows-native loaders.</p>
5988 <p>Over time gdk-pixbuf got support for preserving some metadata-ish
5989 chunks from formats that provide it: DPI, color profiles, image
5990 comments, hotspots for cursors/icons...</p>
5991 <p>While an image is being loaded with the progressive loaders, there is
5992 a clunky way to specify that one doesn't want the actual size of the
5993 image, but another size instead. Ideally the loader handles that situation
5994 itself, for example when an image format actually embeds different sizes
5995 in it. If not, the main loading code will rescale the full loaded
5996 image into the size specified by the application.</p>
5997 <h2>Historical cruft</h2>
5998 <p><a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixdata.h"><code>GdkPixdata</code></a> - a way to embed binary image data in executables, with a
5999 funky encoding. Nowadays it's just easier to directly store a PNG or
6000 JPEG or whatever in a <code>GResource</code>.</p>
6001 <p><a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/contrib/gdk-pixbuf-xlib/"><code>contrib/gdk-pixbuf-xlib</code></a> - to deal with old-style X drawables.
6002 Hopefully mostly unused now, but there's a good number of mostly old,
6003 third-party software that still uses gdk-pixbuf as an image loader and
6004 renderer to X drawables.</p>
6005 <p><a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf-transform.h"><code>gdk-pixbuf-transform.h</code></a> - Gdk-pixbuf had some very high-quality
6006 scaling functions, which the original versions of EOG used for the
6007 core of the image viewer. Nowadays Cairo is the preferred way of
6008 doing this, since it not only does scaling, but general affine
6009 transformations as well. Did you know that
6010 <code>gdk_pixbuf_composite_color</code> takes 17 arguments, and it can composite
6011 an image with alpha on top of a checkerboard? Yes, that used to be
6012 the core of EOG.</p>
6013 <h2>Debatable historical cruft</h2>
6014 <p><code>gdk_pixbuf_get_pixels()</code>. This lets the program look into the actual
6015 pixels of a loaded pixbuf, and modify them. Gdk-pixbuf just did not
6016 have a concept of immutability.</p>
6017 <p>Back in GNOME 1.x / 2.x, when it was fashionable to put icons beside
6018 menu items, or in toolbar buttons, applications would load their icon
6019 images, and modify them in various ways before setting them onto the
6020 corresponding widgets. Some things they did: load a colorful icon,
6021 desaturate it for "insensitive" command buttons or menu items, or
6022 simulate desaturation by compositing a 1x1-pixel checkerboard on the
6023 icon image. Or lighten the icon and set it as the "prelight" one onto
6024 widgets.</p>
6025 <p>The concept of "decode an image and just give me the pixels" is of
6026 course useful. Image viewers, image processing programs, and the
6027 like all need this functionality.</p>
6028 <p>However, these days GTK would prefer to have a way to decode an image,
6029 and ship it as fast as possible to the GPU, without intermediaries.
6030 There are all sorts of awkward machinery in the GTK widgets that
6031 can consume either an icon from an icon theme, or a user-supplied
6032 image, or one of the various schemes for providing icons that GTK has
6033 acquired over the years.</p>
6034 <p>It is interesting to note that <code>gdk_pixbuf_get_pixels()</code> was available
6035 pretty much since the beginning, but it was not until much later that
6036 we got <code>gdk_pixbuf_get_pixels_with_length()</code>, the "give me the <code>guchar
6037 *</code> buffer and also its length" function, so that calling code has a
6038 chance of actually checking for buffer overruns. (... and it is one
6039 of the broken "give me a length" functions that returns a <code>guint</code>
6040 rather than a <code>gsize</code>. There is a better
6041 <code>gdk_pixbuf_get_byte_length()</code> which actually returns a <code>gsize</code>,
6042 though.)</p>
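<p>With a length in hand, calling code can validate every access before touching the buffer. The following is a minimal sketch of such a check in plain C. <code>pixel_at</code> is a hypothetical helper, not part of gdk-pixbuf; it assumes the caller has already obtained the buffer, its byte length, the rowstride, and the number of channels from the pixbuf.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the pixel at (x, y), or NULL
 * if that pixel does not lie completely inside the buffer.  All sizes are
 * size_t so that large images do not overflow the arithmetic. */
unsigned char *pixel_at(unsigned char *pixels, size_t length,
                        size_t rowstride, size_t n_channels,
                        size_t x, size_t y)
{
    assert(pixels != NULL && n_channels > 0);

    /* Reject coordinates whose offsets could overflow or exceed the buffer. */
    if (rowstride > 0 && y > length / rowstride)
        return NULL;
    if (x > length / n_channels)
        return NULL;

    size_t row_off = y * rowstride;
    size_t col_off = x * n_channels;

    /* The pixel must fit inside its own row... */
    if (col_off > rowstride || rowstride - col_off < n_channels)
        return NULL;

    /* ...and inside the buffer as a whole. */
    if (row_off > length || col_off > length - row_off)
        return NULL;
    size_t offset = row_off + col_off;
    if (length - offset < n_channels)
        return NULL;

    return pixels + offset;
}
```

<p>Note that the last row of a pixbuf may be shorter than the rowstride, which is why the final check compares against the total length rather than assuming a whole number of rows.</p>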
6043 <h2>Problems with mutable pixbufs</h2>
6044 <p>The main problem is that as things are right now, we have no
6045 flexibility in changing the internal representation of image data to
6046 make it better for current idioms: GPU-specific pixel formats may not
6047 be unpremultiplied RGBA data.</p>
6048 <p>We have no API to say, "this pixbuf has been modified", akin to
6049 <code>cairo_surface_mark_dirty()</code>: once an application calls
6050 <code>gdk_pixbuf_get_pixels()</code>, gdk-pixbuf or GTK have to assume that the
6051 data <em>will</em> be changed and they have to re-run the pipeline to send
6052 the image to the GPU (format conversions? caching? creating a
6053 texture?).</p>
6054 <p>Also, ever since the beginnings of the gdk-pixbuf API, we had a way to
6055 create pixbufs from arbitrary user-supplied RGBA buffers: the
6056 <code>gdk_pixbuf_new_from_data</code> functions. One problem with this scheme is
6057 that memory management of the buffer is up to the calling application,
6058 so the resulting pixbuf isn't free to handle those resources as it
6059 pleases.</p>
6060 <p>A relatively recent addition is <code>gdk_pixbuf_new_from_bytes()</code>, which
6061 takes a <code>GBytes</code> buffer instead of a random <code>guchar *</code>. When a pixbuf
6062 is created that way, it is <em>assumed</em> to be immutable, since a <code>GBytes</code>
6063 is basically a shared reference into a byte buffer, and it's just
6064 easier to think of it as immutable. (Nothing in C actually enforces
6065 immutability, but the API indicates that convention.)</p>
6066 <p>Internally, <code>GdkPixbuf</code> actually prefers to be created from a
6067 <code>GBytes</code>. It will <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/blob/36939ecb/gdk-pixbuf/gdk-pixbuf.c#L719-743">downgrade itself</a> to a <code>guchar *</code> buffer if
6068 something calls the old <code>gdk_pixbuf_get_pixels()</code>; in the best case,
6069 that will just take ownership of the internal buffer from the
6070 <code>GBytes</code> (if the <code>GBytes</code> has a single reference count); in the worst
6071 case, it will copy the buffer from the <code>GBytes</code> and retain ownership
6072 of that copy. In either case, when the pixbuf downgrades itself to
6073 pixels, it is assumed that the calling application will modify the
6074 pixel data.</p>
6075 <h2>What would immutable pixbufs look like?</h2>
6076 <p>I mentioned this a bit in "<a href="https://people.gnome.org/~federico/blog/reducing-image-copies.html">Reducing Copies</a>". The
6077 loaders in gdk-pixbuf would create immutable pixbufs, with an internal
6078 representation that is friendly to GPUs. In the proposed scheme, that
6079 internal representation would be a Cairo image surface; it can be
6080 something else if GTK/GDK eventually prefer a different way of
6081 shipping image data into the toolkit.</p>
6082 <p>Those pixbufs would be immutable. In true C fashion we can call it
6083 undefined behavior to change the pixel data (say, an app could request
6084 <code>gimme_the_cairo_surface</code> and tweak it, but that would not be
6085 supported).</p>
6086 <p>I think we could also have a "just give me the pixels" API, and a
6087 "create a pixbuf from these pixels" one, but those would be one-time
6088 conversions at the edge of the API. Internally, the pixel data that
6089 actually lives inside a <code>GdkPixbuf</code> would remain immutable, in some
6090 preferred representation, which is not necessarily what the
6091 application sees.</p>
6092 <h2>What worked well</h2>
6093 <p>A small API to load multiple image formats, and paint the images
6094 easily to the screen, while handling most of the X awkwardness
6095 semi-automatically, was very useful!</p>
6096 <p>A way to get and modify pixel data: applications clearly like doing
6097 this. We can formalize it as an application-side thing only, and keep
6098 the internal representation immutable and in a format that can evolve
6099 according to the needs of the internal API.</p>
6100 <p>Pluggable loaders, up to a point. Gdk-pixbuf doesn't support all the
6101 image formats in the world out of the box, but it is relatively easy
6102 for third-parties to provide loaders that, once installed, are
6103 automatically usable for all applications.</p>
6104 <h2>What didn't work well</h2>
6105 <p>Having effectively two pixel formats supported, and nothing else:
6106 gdk-pixbuf does packed RGB and unpremultiplied RGBA, and that's it.
6107 This isn't completely terrible: applications which really want to
6108 know about indexed or grayscale images, or high bit-depth ones, are
6109 <em>probably</em> specialized enough that they can afford to have their own
6110 custom loaders with all the functionality they need.</p>
6111 <p>Pluggable loaders, up to a point. While it is relatively easy to
6112 create third-party loaders, installation is awkward from a system's
6113 perspective: one has to run the script to regenerate the loader cache,
6114 there are more shared libraries running around, and the loaders are
6115 not sandboxed by default.</p>
6116 <p>I'm not sure if it's worthwhile to let any application read "any"
6117 image format if gdk-pixbuf supports it. If your word processor lets
6118 you paste an image into the document... do you want it to use
6119 gdk-pixbuf's limited view of things and include a high bit-depth image
6120 with its probably inadequate conversions? Or would you rather do some
6121 processing by hand to ensure that the image looks as good as it can,
6122 in the format that your word processor actually supports? I don't
6123 know.</p>
6124 <p>The API for animations is very awkward. We don't even support
6125 APNG... but honestly I don't recall actually seeing one of those in
6126 the wild.</p>
6127 <p>The progressive loading API is awkward. The "feed some bytes into the
6128 loader" part is mostly okay; the "notify me about changes to the pixel
6129 data" is questionable nowadays. Web browsers don't use it; they
6130 implement their own loaders. Even EOG doesn't use it.</p>
6131 <p>I think most code that actually connects to <code>GdkPixbufLoader</code>'s
6132 signals only uses the <code>size-prepared</code> signal — the one that gets
6133 emitted soon after reading the image headers, when the loader gets to
6134 know the dimensions of the image. Apps sometimes use this to say,
6135 "this image is W*H pixels in size", but don't actually decode the
6136 rest of the image.</p>
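<p>The "just tell me the size" use case needs very little of the file. As a sketch of the idea, and not of how gdk-pixbuf's PNG loader is actually written, the following C function reads the width and height from the first 24 bytes of a PNG stream: the 8-byte signature followed by the start of the IHDR chunk. The function name <code>png_peek_size</code> is made up for this example.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian integer, as used throughout the PNG format. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Hypothetical helper: extracts the image dimensions from the beginning
 * of a PNG stream without decoding any pixel data.  Returns false if the
 * buffer is too short or does not start with a PNG signature and IHDR. */
bool png_peek_size(const unsigned char *buf, size_t len,
                   uint32_t *width, uint32_t *height)
{
    static const unsigned char signature[8] =
        { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n' };

    assert(width != NULL && height != NULL);

    if (len < 24 || memcmp(buf, signature, 8) != 0)
        return false;

    /* Bytes 8..11 hold the chunk length, bytes 12..15 the chunk type. */
    if (memcmp(buf + 12, "IHDR", 4) != 0)
        return false;

    *width = read_be32(buf + 16);
    *height = read_be32(buf + 20);
    return true;
}
```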
6137 <p>The gdk-pixbuf model of static images, or GIF animations, doesn't work
6138 well for multi-page TIFFs. I'm not sure if this is actually a problem.
6139 Again, applications with actual needs for multi-page TIFFs are
6140 probably specialized enough that they will want a full-featured TIFF
6141 loader of their own.</p>
6142 <h2>Awkward architectures</h2>
6143 <h3>Thumbnailers</h3>
6144 <p>The thumbnailing system has slowly been moving towards a model where
6145 we actually have thumbnailers specific to each file format, instead of
6146 just assuming that we can dump any image into a gdk-pixbuf loader.</p>
6147 <p>If we take this all the way, we would be able to remove some weird
6148 code in, for example, the JPEG pixbuf loader. Right now it supports
6149 loading images at a size that the calling code requests, not only at
6150 the "natural" size of the JPEG. The thumbnailer can say, "I want to
6151 load this JPEG at 128x128 pixels" or whatever, and <em>in theory</em> the
6152 JPEG loader will do the minimal amount of work required to do that.
6153 It's not 100% clear to me if this is actually working as intended, or
6154 if we downscale the whole image anyway.</p>
6155 <p>We had a distinction between in-process and out-of-process
6156 thumbnailers, and it had to do with the way pixbuf loaders are used;
6157 I'm not sure if they are all out-of-process and sandboxed now.</p>
6158 <h3>Non-raster data</h3>
6159 <p>There is a gdk-pixbuf loader for SVG images which uses librsvg
6160 internally, but only in a very basic way: it simply loads the SVG at
6161 its preferred size. Librsvg jumps through some hoops to compute a
6162 "preferred size" for SVGs, as not all of them actually indicate one.
6163 The SVG model would rather have the renderer say that the SVG is to be
6164 inserted into a rectangle of certain width/height, and
6165 scaled/positioned inside the rectangle according to some other
6166 parameters (i.e. like one would put it inside an HTML document, with a
6167 <code>preserveAspectRatio</code> attribute and all that). GNOME applications
6168 historically operated with a different model, one of "load me an
6169 image, I'll scale it to whatever size, and paint it".</p>
6170 <p>This gdk-pixbuf loader for SVG files gets used for the SVG
6171 thumbnailer, or more accurately, the "throw random images into a
6172 gdk-pixbuf loader" thumbnailer. It may be better/cleaner to have a
6173 specific thumbnailer for SVGs instead.</p>
6174 <p>Even EOG, our by-default image viewer, doesn't use the gdk-pixbuf
6175 loader for SVGs: it actually special-cases them and uses librsvg
6176 directly, to be able to load an SVG once and re-render it at different
6177 sizes if one changes the zoom factor, for example.</p>
6178 <p>GTK reads its SVG icons... without using librsvg... by assuming that
6179 librsvg installed its gdk-pixbuf loader, so it loads them as any
6180 normal raster image. This is kind of dirty, but I can't quite pinpoint
6181 why. I'm sure it would be convenient for icon themes to ship a single
6182 SVG with tons of icons, and some metadata on their <code>id</code>s, so that GTK
6183 could pick them out of the SVG file with <code>rsvg_render_cairo_sub()</code> or
6184 something. Right now icon theme authors are responsible for splitting
6185 out those huge SVGs into many little ones, one for each icon, and I
6186 don't think that's their favorite thing in the world to do :)</p>
6187 <h3>Exotic raster data</h3>
6188 <p>High bit-depth images... would you expect EOG to be able to load them?
6189 Certainly; maybe not with all the fancy conversions from a real RAW
6190 photo editor. But maybe this can be done as EOG-specific plugins,
6191 rather than as low in the platform as the gdk-pixbuf loaders?</p>
6192 <p>(Same thing for thumbnailing high bit-depth images: the loading code
6193 should just provide its own thumbnailer program for those.)</p>
6194 <h3>Non-image metadata</h3>
6195 <p>The <code>gdk_pixbuf_set_option</code> / <code>gdk_pixbuf_get_option</code> family of
6196 functions is so that pixbuf loaders can set key/value pairs of strings
6197 onto a pixbuf. Loaders use this for <code>comment</code> blocks, or ICC profiles
6198 for color calibration, or DPI information for images that have it, or
6199 EXIF data from photos. It is up to applications to actually use this
6200 information.</p>
6201 <p>It's a bit uncomfortable that gdk-pixbuf makes no promises about the
6202 kind of raster data it gives to the caller: right now it is raw
6203 RGB(A) data that is not gamma-corrected nor in any particular color
6204 space. It is up to the caller to see if the pixbuf has an ICC profile
6205 attached to it as an <code>option</code>. Effectively, this means that
6206 applications don't know if they are getting sRGB, or linear RGB, or
6207 what... unless they specifically care to look.</p>
6208 <p>The gdk-pixbuf API could probably make promises: if you call <em>this
6209 function</em> you will get sRGB data; if you call <em>this other function</em>,
6210 you'll get the raw RGBA data and we'll tell you its
6211 colorspace/gamma/etc.</p>
6212 <p>The various <code>set_option</code> / <code>get_option</code> pairs are also usable by the
6213 gdk-pixbuf <em>saving</em> code (up to now we have just talked about
6214 loaders). I don't know enough about how applications use the saving
6215 code in gdk-pixbuf... the thumbnailers use it to save PNGs or JPEGs,
6216 but other apps? No idea.</p>
6217 <h2>What I would like to see</h2>
6218 <p><strong>Immutable pixbufs in a useful format.</strong> I've started <a href="https://gitlab.gnome.org/GNOME/gdk-pixbuf/merge_requests/6">work on
6219 this</a> in a merge request; the internal code is now ready
6220 to take in different internal representations of pixel data. My goal
6221 is to make Cairo image surfaces the preferred, immutable, internal
6222 representation. This would give us a
6223 <code>gdk_pixbuf_get_cairo_surface()</code>, which pretty much everything that
6224 needs one reimplements by hand.</p>
6225 <p><strong>Find places that assume mutable pixbufs.</strong> To gradually deprecate
6226 mutable pixbufs, I think we would need to audit applications and
6227 libraries to find places that cause <code>GdkPixbuf</code> structures to degrade
6228 into mutable ones: basically, find callers of
6229 <code>gdk_pixbuf_get_pixels()</code> and related functions, see what they do, and
6230 reimplement them differently. Maybe they don't need to tint icons by
6231 hand anymore? Maybe they <em>don't need icons</em> anymore, given our
6232 changing UI paradigms? Maybe they are using gdk-pixbuf as an image
6233 loader only?</p>
6234 <p><strong>Reconsider the loading-updates API.</strong> Do we need the
6235 <code>GdkPixbufLoader::area-updated</code> signal at all? Does anything break
6236 if we just... not emit it, or just emit it once at the end of the
6237 loading process? (Caveat: keeping it unchanged more or less means
6238 that "immutable pixbufs" as loaded by gdk-pixbuf actually mutate while
6239 being loaded, and this mutation is exposed to applications.)</p>
6240 <p><strong>Sandboxed loaders.</strong> While these days gdk-pixbuf loaders prefer the
6241 progressive feed-it-bytes API, sandboxed loaders would maybe prefer a
6242 read-a-whole-file approach. I don't know enough about memfd or how
6243 sandboxes pass data around to know how either would work.</p>
6244 <p><strong>Move loaders to Rust.</strong> Yes, really. Loaders are
6245 security-sensitive, and while we <em>do</em> need to sandbox them, it would
6246 certainly be better to do them in a memory-safe language. There are
6247 already pure Rust-based image loaders: <a href="https://crates.io/crates/jpeg-decoder">JPEG</a>,
6248 <a href="https://crates.io/crates/png">PNG</a>, <a href="https://github.com/PistonDevelopers/image-tiff">TIFF</a>, <a href="https://github.com/PistonDevelopers/image-gif">GIF</a>, <a href="https://github.com/PistonDevelopers/image/tree/master/src/ico">ICO</a>.
6249 I have no idea how featureful they are. We can certainly try them
6250 with gdk-pixbuf's own suite of test images. We can modify them to add
6251 hooks for things like a <code>size-prepared</code> notification, if they don't
6252 already have a way to read "just the image headers".</p>
6253 <p>Rust makes it very easy to plug in <a href="https://crates.io/crates/criterion">micro-benchmarks</a>,
6254 <a href="https://crates.io/crates/afl">fuzz testing</a>, and other modern amenities. These would be
6255 perfect for improving the loaders.</p>
6256 <p>I started <a href="https://gitlab.gnome.org/federico/gdk-pixbuf/tree/rust-loader/gdk-pixbuf/rust">sketching a Rust backend for gdk-pixbuf
6257 loaders</a> some months ago, but there's nothing useful
6258 yet. One mismatch between gdk-pixbuf's model for loaders, and the
6259 existing Rust codecs, is that Rust codecs generally take something
6260 that implements the <code>Read</code> trait: a blocking API to read bytes from
6261 abstract sources; it's a pull API. The gdk-pixbuf model is a push
6262 API: the calling code creates a loader object, and then pushes bytes
6263 into it. The gdk-pixbuf convenience functions that take a
6264 <code>GInputStream</code> basically do this:</p>
6265 <div class="highlight"><pre><span></span><code><span class="n">loader</span> <span class="o">=</span> <span class="n">gdk_pixbuf_loader_new</span> <span class="p">(...);</span>
6266
6267 <span class="k">while</span> <span class="p">(</span><span class="n">more_bytes</span><span class="p">)</span> <span class="p">{</span>
6268 <span class="n">n_read</span> <span class="o">=</span> <span class="n">g_input_stream_read</span> <span class="p">(</span><span class="n">stream</span><span class="p">,</span> <span class="n">buffer</span><span class="p">,</span> <span class="p">...);</span>
6269 <span class="n">gdk_pixbuf_loader_write</span><span class="p">(</span><span class="n">loader</span><span class="p">,</span> <span class="n">buffer</span><span class="p">,</span> <span class="n">n_read</span><span class="p">,</span> <span class="p">...);</span>
6270 <span class="p">}</span>
6271
6272 <span class="n">gdk_pixbuf_loader_close</span> <span class="p">(</span><span class="n">loader</span><span class="p">);</span>
6273 </code></pre></div>
6274
6275 <p>However, this cannot be flipped around easily. We could probably use
6276 a second thread (easy, safe to do in Rust) to make the reader/decoder
6277 thread block while the main thread pushes bytes into it.</p>
6278 <p>Also, I don't know how the Rust bindings for GIO present things like
6279 <code>GInputStream</code> and friends, with our nice async cancellables and all
6280 that.</p>
6281 <p><strong>Deprecate animations?</strong> Move that code to EOG, just so one can look
6282 at memes in it? Do any "real apps" actually use GIF animations for
6283 their UI?</p>
6284 <p><strong>Formalize promises around returned color profiles, gamma, etc.</strong> As
6285 mentioned above: have an "easy API" that returns sRGB, and a "raw API"
6286 that returns the ARGB data from the image, plus info on its ICC
6287 profile, gamma, or any other info needed to turn this into a
6288 "good enough to be universal" representation. (I <em>think</em> all the
6289 Apple APIs that pass colors around do so with an ICC profile attached,
6290 which seems... pretty much necessary for correctness.)</p>
6291 <p><strong>Remove the internal MIME-sniffing machinery.</strong> And just use GIO's.</p>
6292 <p><strong>Deprecate the crufty/old APIs in gdk-pixbuf.</strong>
6293 Scaling/transformation, compositing, <code>GdkPixdata</code>,
6294 <code>gdk-pixbuf-csource</code>, all those. Pixel crunching can be done by
6295 Cairo; the others are better done with <code>GResource</code> these days.</p>
6296 <p><strong>Figure out if we want blessed codecs; fix thumbnailers.</strong> Link those
6297 loaders statically, unconditionally. Exotic formats can go in their
6298 own custom thumbnailers. Figure out if we want sandboxed loaders for
6299 everything, or just for user-side images (not ones read from the
6300 trusted system installation).</p>
6301 <p><strong>Have GTK4 communicate clearly about its drawing model.</strong> I think we
6302 are having a disconnect between the GUI chrome, which is CSS/GPU
6303 friendly, and graphical content generated by applications, which by
6304 default right now is done via Cairo. And having Cairo as a to-screen
6305 and to-printer API is certainly very convenient! You Wouldn't Print a
6306 GUI, but certainly you would print a displayed document.</p>
6307 <p>It would also be useful for GTK4 to actually define what its preferred
6308 image format is if it wants to ship it to the GPU with as little work
6309 as possible. Maybe it's a Cairo image surface? Maybe something else?</p>
6310 <h2>Conclusion</h2>
6311 <p>We seem to change imaging models every ten years or so. Xlib, then
6312 Xrender with Cairo, then GPUs and CSS-based drawing for widgets.
6313 We've gone from trusted data on your local machine, to potentially malicious data that
6314 rains from the Internet. Gdk-pixbuf has spanned all of these periods
6315 so far, and it is due for a big change.</p></content><category term="misc"></category><category term="gnome"></category><category term="gdk-pixbuf"></category></entry><entry><title>Debugging an Rc<T> reference leak in Rust</title><link href="https://people.gnome.org/~federico/blog/debugging-reference-leak-in-rust.html" rel="alternate"></link><published>2018-08-29T16:47:13-05:00</published><updated>2018-08-29T16:47:13-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-08-29:/~federico/blog/debugging-reference-leak-in-rust.html</id><summary type="html"><p>The <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/325">bug</a> that caused two brown-paper-bag releases in librsvg,
6316 because it was leaking all the SVG nodes, has been interesting.</p>
6317 <p><em>Memory leaks in Rust? Isn't it supposed to prevent that?</em></p>
6318 <p>Well, yeah, but the leaks were caused by the C side of things, and by
6319 <code>unsafe</code> code in Rust, which …</p></summary><content type="html"><p>The <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/325">bug</a> that caused two brown-paper-bag releases in librsvg,
6320 because it was leaking all the SVG nodes, has been interesting.</p>
6321 <p><em>Memory leaks in Rust? Isn't it supposed to prevent that?</em></p>
6322 <p>Well, yeah, but the leaks were caused by the C side of things, and by
6323 <code>unsafe</code> code in Rust, which does not prevent leaks.</p>
6324 <p><a href="https://gitlab.gnome.org/federico/librsvg/commit/29af8b19ea103f754c318cfbf8b03c31265f0394">The first part of the bug</a> was easy: C code started calling a
6325 function implemented in Rust, which returns a newly-acquired reference
6326 to an SVG node. The old code simply got a pointer to the node,
6327 without acquiring a reference. The new code was forgetting to
6328 <code>rsvg_node_unref()</code>. No biggie.</p>
6329 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/commit/2d3ddca130d7d023daedf77a6ab58fefec510292">The second part of the bug</a> was trickier to find. The C code
6330 was apparently calling all the functions to unref nodes as
6331 appropriate, and even calling the <code>rsvg_tree_free()</code> function in the
6332 end; this is the "free the whole SVG tree" function.</p>
6333 <p>There are these types:</p>
6334 <div class="highlight"><pre><span></span><code><span class="c1">// We take a pointer to this and expose it as an opaque pointer to C</span>
6335 <span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">RsvgTree</span><span class="w"> </span><span class="p">{}</span><span class="w"></span>
6336
6337 <span class="c1">// This is the real structure we care about</span>
6338 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Tree</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6339 <span class="w"> </span><span class="c1">// This is the Rc that was getting leaked</span>
6340 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">root</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">Node</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
6341 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
6342 <span class="p">}</span><span class="w"></span>
6343 </code></pre></div>
6344
6345 <p><code>Tree</code> is the real struct that holds the root of the SVG tree and some
6346 other data. Each node is an <code>Rc&lt;Node&gt;</code>; the root node was getting
6347 leaked (... and all the children, recursively) because its reference
6348 count never went down from 1.</p>
6349 <p><code>RsvgTree</code> is just an empty type. The code does an unsafe cast of
6350 <code>*const Tree</code> as <code>*const RsvgTree</code> in order to expose a raw pointer to
6351 the C code.</p>
6352 <p>The <code>rsvg_tree_free()</code> function, callable from C, looked like this:</p>
6353 <div class="highlight"><pre><span></span><code><span class="cp">#[no_mangle]</span><span class="w"></span>
6354 <span class="k">pub</span><span class="w"> </span><span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rsvg_tree_free</span><span class="p">(</span><span class="n">tree</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RsvgTree</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6355 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="o">!</span><span class="n">tree</span><span class="p">.</span><span class="n">is_null</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6356 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">from_raw</span><span class="p">(</span><span class="n">tree</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w"></span>
6357 <span class="w"> </span><span class="c1">// ^ this returns a Box&lt;RsvgTree&gt; which is an empty type!</span>
6358 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6359 <span class="p">}</span><span class="w"></span>
6360 </code></pre></div>
6361
6362 <p>When we call <code>Box::from_raw()</code> on a <code>*mut RsvgTree</code>, it gives us back
6363 a <code>Box&lt;RsvgTree&gt;</code>... which is a box of a zero-sized type. So, the program
6364 frees zero memory when the box gets dropped.</p>
6365 <p>The code was missing this cast:</p>
6366 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">tree</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="p">(</span><span class="n">tree</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">Tree</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w"></span>
6367 <span class="w"> </span><span class="c1">// ^ this cast to the actual type inside the Box</span>
6368 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">from_raw</span><span class="p">(</span><span class="n">tree</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w"></span>
6369 </code></pre></div>
6370
6371 <p>So, <code>tree as *mut Tree</code> gives us a value which will cause
6372 <code>Box::from_raw()</code> to return a <code>Box&lt;Tree&gt;</code>, which is what we intended.
6373 Dropping the box will drop the <code>Tree</code>, reduce the last reference count
6374 on the root node, and free all the nodes recursively.</p>
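<p>The fix can be reproduced outside librsvg with a small, self-contained sketch. It assumes a simplified <code>Node</code> and plain Rust functions named <code>tree_new()</code> and <code>tree_free()</code>; these are hypothetical stand-ins, not the real librsvg API. It uses <code>Rc::strong_count()</code> to check that the root loses its reference once the pointer is cast back to <code>*mut Tree</code> before being boxed.</p>

```rust
use std::rc::Rc;

// Opaque type handed to C; it has no fields and no size.
pub enum RsvgTree {}

// Simplified stand-in for an SVG node (hypothetical, for illustration only).
pub struct Node {
    pub name: String,
}

// The real structure that lives inside the Box.
pub struct Tree {
    pub root: Rc<Node>,
}

// Hypothetical constructor: box the Tree and hand out an opaque pointer.
pub fn tree_new(root: Rc<Node>) -> *mut RsvgTree {
    Box::into_raw(Box::new(Tree { root })) as *mut RsvgTree
}

// Free the tree. The cast back to *mut Tree is the important part:
// without it, Box::from_raw() would build a box of the empty type.
pub fn tree_free(tree: *mut RsvgTree) {
    if !tree.is_null() {
        let tree = tree as *mut Tree;
        // Dropping the Box<Tree> drops the Tree and its Rc<Node>.
        drop(unsafe { Box::from_raw(tree) });
    }
}

fn main() {
    let root = Rc::new(Node { name: String::from("svg") });
    let tree = tree_new(Rc::clone(&root));
    println!("{}: strong count while the tree is alive: {}", root.name, Rc::strong_count(&root));

    tree_free(tree);
    println!("{}: strong count after tree_free(): {}", root.name, Rc::strong_count(&root));
}
```

<p>The extra <code>Rc</code> kept in <code>main()</code> is only there so the count can still be observed after the tree is gone.</p>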
6375 <h2>Monitoring an <code>Rc&lt;T&gt;</code>'s reference count in gdb</h2>
6376 <p>So, how does one set a gdb watchpoint on the reference count?</p>
6377 <p>First I set a breakpoint on a function which I knew would get passed
6378 the <code>Rc&lt;Node&gt;</code> I care about:</p>
6379 <div class="highlight"><pre><span></span><code>(gdb) b &lt;rsvg_internals::structure::NodeSvg as rsvg_internals::node::NodeTrait&gt;::set_atts
6380 Breakpoint 3 at 0x7ffff71f3aaa: file rsvg_internals/src/structure.rs, line 131.
6381
6382 (gdb) c
6383 Continuing.
6384
6385 Thread 1 &quot;rsvg-convert&quot; hit Breakpoint 3, &lt;rsvg_internals::structure::NodeSvg as rsvg_internals::node::NodeTrait&gt;::set_atts (self=0x646c60, node=0x64c890, pbag=0x64c820) at rsvg_internals/src/structure.rs:131
6386
6387 (gdb) p node
6388 $5 = (alloc::rc::Rc&lt;rsvg_internals::node::Node&gt; *) 0x64c890
6389 </code></pre></div>
6390
6391 <p>Okay, <code>node</code> is a reference to an <code>Rc&lt;Node&gt;</code>. What's inside?</p>
6392 <div class="highlight"><pre><span></span><code>(gdb) p *node
6393 $6 = {ptr = {pointer = {__0 = 0x625800}}, phantom = {&lt;No data fields&gt;}}
6394 </code></pre></div>
6395
6396 <p>Why, a <code>pointer</code> to the actual contents of the <code>Rc</code>. Look inside
6397 again:</p>
6398 <div class="highlight"><pre><span></span><code>(gdb) p *node.ptr.pointer.__0
6399 $9 = {strong = {value = {value = 3}}, weak = {value = {value = 1}}, ... and lots of extra crap ...
6400 </code></pre></div>
6401
6402 <p>Aha! There are the <code>strong</code> and <code>weak</code> reference counts. So, set a
6403 watchpoint on the strong reference count:</p>
6404 <div class="highlight"><pre><span></span><code>(gdb) set $ptr = &amp;node.ptr.pointer.__0.strong.value.value
6405 (gdb) watch *$ptr
6406 Hardware watchpoint 4: *$ptr
6407 </code></pre></div>
6408
6409 <p>Continue running the program until the reference count changes:</p>
6410 <div class="highlight"><pre><span></span><code><span class="ss">(</span><span class="nv">gdb</span><span class="ss">)</span> <span class="k">continue</span>
6411 <span class="nv">Thread</span> <span class="mi">1</span> <span class="s2">&quot;</span><span class="s">rsvg-convert</span><span class="s2">&quot;</span> <span class="nv">hit</span> <span class="nv">Hardware</span> <span class="nv">watchpoint</span> <span class="mi">4</span>: <span class="o">*</span>$<span class="nv">ptr</span>
6412
6413 <span class="nv">Old</span> <span class="nv">value</span> <span class="o">=</span> <span class="mi">3</span>
6414 <span class="nv">New</span> <span class="nv">value</span> <span class="o">=</span> <span class="mi">2</span>
6415 </code></pre></div>
6416
6417 <p>At this point I can print a stack trace and see if it makes sense,
6418 check that the refs/unrefs are matched, etc.</p>
6419 <p>TL;DR: dig into the <code>Rc&lt;T&gt;</code> until you find the reference count, and
6420 watch it. It's wrapped in several layers of Rust-y types; <code>NonNull</code>
6421 pointers, an <code>RcBox</code> for the actual container of the refcount plus the
6422 object it's wrapping, and <code>Cell</code>s for the refcount values. Just dig
6423 until you reach the refcount values and they are there.</p>
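<p>When a debugger is not at hand, the same numbers can be read from inside the program with <code>Rc::strong_count()</code> and <code>Rc::weak_count()</code>. The sketch below is only an illustration; the helper name <code>trace_counts()</code> is made up for it. Note that the raw <code>weak</code> field shown by gdb can differ from what <code>Rc::weak_count()</code> reports, since the standard library keeps an implicit weak reference on behalf of the strong ones.</p>

```rust
use std::rc::Rc;

// Hypothetical helper: record the strong count after each step
// (creation, clone, drop of the clone).
pub fn trace_counts() -> Vec<usize> {
    let mut seen = Vec::new();

    let node = Rc::new(String::from("svg"));
    seen.push(Rc::strong_count(&node));

    let extra = Rc::clone(&node);
    seen.push(Rc::strong_count(&node));

    drop(extra);
    seen.push(Rc::strong_count(&node));

    seen
}

fn main() {
    for (step, count) in trace_counts().iter().enumerate() {
        println!("step {}: strong = {}", step, count);
    }

    // Weak references are counted separately from strong ones.
    let node = Rc::new(0u8);
    let weak = Rc::downgrade(&node);
    println!("weak = {}", Rc::weak_count(&node));
    drop(weak);
}
```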
6424 <h2>So, how did I find the missing cast?</h2>
6425 <p>Using that gdb recipe, I watched the reference count of the toplevel
6426 SVG node change until the program exited. When the program
6427 terminated, the reference count was 1 — it should have dropped to 0 if
6428 there was no memory leak.</p>
6429 <p>The last place where the toplevel node loses a reference is in
6430 <code>rsvg_tree_free()</code>. I ran the program again and checked if that
6431 function was being called; it <em>was</em> being called correctly. So I knew
6432 that the problem must lie in that function. After a little
6433 head-scratching, I found the missing cast. Other functions of the
6434 form <code>rsvg_tree_whatever()</code> had that cast, but <code>rsvg_tree_free()</code> was
6435 missing it.</p>
6436 <p>I think Rust now has better facilities to tag structs that are exposed
6437 as raw pointers to <code>extern</code> code, to avoid this kind of perilous
6438 casting. We'll see.</p>
6439 <p>In the meantime, apologies for the buggy releases!</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category><category term="librsvg"></category></entry><entry><title>Logging from Rust in librsvg</title><link href="https://people.gnome.org/~federico/blog/logging-in-librsvg.html" rel="alternate"></link><published>2018-08-03T19:29:43-05:00</published><updated>2018-08-03T19:29:43-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-08-03:/~federico/blog/logging-in-librsvg.html</id><summary type="html"><p>Over in <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/281">this issue</a> we are discussing how to add debug logging
6440 for librsvg.</p>
6441 <p>A popular way to add logging to Rust code is to use the <a href="https://crates.io/crates/log">log</a> crate.
6442 This lets you sprinkle simple messages in your code:</p>
6443 <div class="highlight"><pre><span></span><code><span class="n">error</span><span class="o">!</span><span class="p">(</span><span class="s">&quot;something bad happened: {}&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">foo</span><span class="p">);</span><span class="w"></span>
6444 <span class="n">debug</span><span class="o">!</span><span class="p">(</span><span class="s">&quot;a debug message&quot;</span><span class="p">);</span><span class="w"></span>
6445 </code></pre></div>
6446
6447 <p>However, the <a href="https://crates.io/crates/log">log …</a></p></summary><content type="html"><p>Over in <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/281">this issue</a> we are discussing how to add debug logging
6448 for librsvg.</p>
6449 <p>A popular way to add logging to Rust code is to use the <a href="https://crates.io/crates/log">log</a> crate.
6450 This lets you sprinkle simple messages in your code:</p>
6451 <div class="highlight"><pre><span></span><code><span class="n">error</span><span class="o">!</span><span class="p">(</span><span class="s">&quot;something bad happened: {}&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">foo</span><span class="p">);</span><span class="w"></span>
6452 <span class="n">debug</span><span class="o">!</span><span class="p">(</span><span class="s">&quot;a debug message&quot;</span><span class="p">);</span><span class="w"></span>
6453 </code></pre></div>
6454
6455 <p>However, the <a href="https://crates.io/crates/log">log</a> crate is just a facade, and by default the
6456 messages do not get emitted anywhere. The calling code has to set up
6457 a logger. Crates like <a href="https://crates.io/crates/env_logger">env_logger</a> let one set up a logger, during
6458 program initialization, that gets configured through an environment
6459 variable.</p>
6460 <p>And this is a problem for librsvg: we are <em>not</em> the program's
6461 initialization! Librsvg is a library; it doesn't have a <code>main()</code>
6462 function. And since most of the calling code is not Rust, we can't
6463 assume that they can call code that can initialize the logging
6464 framework.</p>
6465 <h2>Why not use glib's logging stuff?</h2>
6466 <p>Currently this is a bit clunky to use from Rust, since glib's
6467 structured logging functions are not bound yet in <a href="http://gtk-rs.org/docs/glib/">glib-rs</a>. Maybe it
6468 would be good to bind them and get this over with.</p>
6469 <h2>What user experience do we want?</h2>
6470 <p>In the past, what has worked well for me to do logging from libraries
6471 is to allow the user to set an environment variable to control the
6472 logging, or to drop a log configuration file in their $HOME. The
6473 former works well when the user is in control of running the program
6474 that will print the logs; the latter is useful when the user is not
6475 directly in control, like for gnome-shell, which gets launched through
6476 a lot of magic during session startup.</p>
6477 <p>For librsvg, it's probably enough to just use an environment
6478 variable. Set <code>RSVG_LOG=parse_errors</code>, run your program, and get
6479 useful output. <a href="https://makezine.com/2016/06/10/push-button-receive-bacon/">Push button, receive bacon</a>.</p>
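<p>A minimal sketch of that user experience, using only the standard library. The <code>RSVG_LOG</code> variable comes from the text above; everything else is an assumption made for the example: the comma-separated list of categories, and the function names <code>category_enabled()</code> and <code>rsvg_log()</code>.</p>

```rust
use std::env;

// Returns true if `category` appears in the comma-separated `config` string.
// The comma-separated format is an assumption, not something librsvg defines.
pub fn category_enabled(config: Option<&str>, category: &str) -> bool {
    match config {
        Some(list) => list.split(',').any(|item| item.trim() == category),
        None => false,
    }
}

// Hypothetical logging entry point: print to stderr only if the
// category was requested through the RSVG_LOG environment variable.
pub fn rsvg_log(category: &str, message: &str) {
    let config = env::var("RSVG_LOG").ok();
    if category_enabled(config.as_deref(), category) {
        eprintln!("librsvg [{}]: {}", category, message);
    }
}

fn main() {
    // Run with RSVG_LOG=parse_errors to see this message.
    rsvg_log("parse_errors", "unexpected end of file");
}
```

<p>Reading the variable on every call keeps the sketch short; a library would more likely read it once and cache the result.</p>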
6480 <h2>Other options in Rust?</h2>
6481 <p>There is a <a href="https://crates.io/crates/slog">slog</a> crate which looks promising. Instead of using
6482 context-less macros which depend on a single global logger, it
6483 provides logging macros to which you pass a logger object.</p>
6484 <p>For librsvg, this means that the basic <code>RsvgHandle</code> could create its
6485 own logger, based on an environment variable or whatever, and pass it
6486 around to all its child functions for when they need to log something.</p>
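<p>The shape of that idea can be sketched without any crates. The <code>Logger</code> and <code>Handle</code> types below are plain-Rust stand-ins invented for this example; they are not slog's API and not librsvg's <code>RsvgHandle</code>. The point is only that the logger is an object owned by the handle and passed explicitly to the functions that need it.</p>

```rust
// Hypothetical logger object; a stand-in for illustration, not slog's API.
pub struct Logger {
    enabled: bool,
    lines: Vec<String>,
}

impl Logger {
    pub fn new(enabled: bool) -> Logger {
        Logger { enabled, lines: Vec::new() }
    }

    // Record a message only if logging was turned on for this logger.
    pub fn log(&mut self, message: &str) {
        if self.enabled {
            self.lines.push(message.to_string());
        }
    }

    pub fn lines(&self) -> &[String] {
        &self.lines
    }
}

// A child function receives the logger explicitly instead of
// reaching for a global one.
fn parse_document(logger: &mut Logger, data: &str) -> usize {
    if data.is_empty() {
        logger.log("parse_document: empty input");
    }
    data.len()
}

// Hypothetical handle that owns its own logger.
pub struct Handle {
    logger: Logger,
}

impl Handle {
    pub fn new(logging_enabled: bool) -> Handle {
        Handle { logger: Logger::new(logging_enabled) }
    }

    pub fn load(&mut self, data: &str) -> usize {
        parse_document(&mut self.logger, data)
    }

    pub fn log_lines(&self) -> &[String] {
        self.logger.lines()
    }
}

fn main() {
    let mut handle = Handle::new(true);
    handle.load("");
    for line in handle.log_lines() {
        println!("{}", line);
    }
}
```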
6487 <p>Slog supports structured logging, and seems to have some fancy output
6488 modes. We'll see.</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Three big things happening in librsvg</title><link href="https://people.gnome.org/~federico/blog/three-big-things-happening-in-librsvg.html" rel="alternate"></link><published>2018-05-21T19:21:08-05:00</published><updated>2018-05-21T19:21:08-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-05-21:/~federico/blog/three-big-things-happening-in-librsvg.html</id><summary type="html"><p>I am incredibly happy because of three <strong>big</strong> things that are going
6489 on in librsvg right now:</p>
6490 <ol>
6491 <li>
6492 <p>Paolo Borelli finished porting all the CSS properties to Rust.
6493 What was once a gigantic <code>RsvgState</code> struct in C is totally gone,
6494 along with all the janky C code to parse individual properties …</p></li></ol></summary><content type="html"><p>I am incredibly happy because of three <strong>big</strong> things that are going
6495 on in librsvg right now:</p>
6496 <ol>
6497 <li>
6498 <p>Paolo Borelli finished porting all the CSS properties to Rust.
6499 What was once a gigantic <code>RsvgState</code> struct in C is totally gone,
6500 along with all the janky C code to parse individual properties.
6501 The process of porting <code>RsvgState</code> to Rust has been going on <a href="https://gitlab.gnome.org/GNOME/librsvg/merge_requests/39">since
6502 about two months ago</a>, and has involved many multi-commit
6503 merge requests and refactorings. This is a tremendous amount of
6504 really good work! The result is all in Rust now in a <code>State</code>
6505 struct, which is opaque from C's viewpoint. The only places in C
6506 that still require accessors to the <code>State</code> are in the filter
6507 effects code. Which brings me to...</p>
6508 </li>
6509 <li>
6510 <p>Ivan Molodetskikh, my Summer of Code student, submitted his <a href="https://gitlab.gnome.org/GNOME/librsvg/merge_requests/64">first
6511 merge request</a> and it's merged to master now. This ports
6512 the bookkeeping infrastructure for SVG filters to Rust, and also
6513 the <code>feOffset</code> filter is ported now. Right now the code doesn't do
6514 anything fancy to iterate over the pixels of Cairo image surfaces;
6515 that will come later. I am very happy that filters, which were a
6516 huge barrier, are now starting to get chipped away into nicer code.</p>
6517 </li>
6518 <li>
6519 <p>I have started to move librsvg's old representation of CSS
6520 properties into something that can really represent properties that
6521 are not specified, or explicitly set to <code>inherit</code> from an SVG
6522 element's parent, or set to a normal value. Librsvg never had a
6523 representation of property values that actually matched the SVG/CSS
6524 specs; it just knew whether a property was specified or not for an
6525 element. This worked fine for properties which the spec mandates
6526 should inherit automatically, but those that <em>don't</em>
6527 were handled through special hacks. The new code makes this a lot
6528 cleaner. It should also make it easier to copy Servo's idioms for
6529 property inheritance.</p>
6530 </li>
6531 </ol></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Reducing the number of image copies in GNOME</title><link href="https://people.gnome.org/~federico/blog/reducing-image-copies.html" rel="alternate"></link><published>2018-05-14T07:45:11-05:00</published><updated>2018-05-14T07:45:11-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-05-14:/~federico/blog/reducing-image-copies.html</id><summary type="html"><p><a href="https://magcius.github.io/xplain/article/">Our graphics stack</a> that deals with images has evolved a lot
6532 over the years.</p>
6533 <h1>In ye olden days</h1>
6534 <p>In the context of GIMP/GNOME, the only thing that knew how to draw RGB
6535 images to X11 windows (doing palette mapping for 256-color graphics
6536 cards and dithering if necessary) was the …</p></summary><content type="html"><p><a href="https://magcius.github.io/xplain/article/">Our graphics stack</a> that deals with images has evolved a lot
6537 over the years.</p>
6538 <h1>In ye olden days</h1>
6539 <p>In the context of GIMP/GNOME, the only thing that knew how to draw RGB
6540 images to X11 windows (doing palette mapping for 256-color graphics
6541 cards and dithering if necessary) was the GIMP. Later, when GTK+ was
6542 written, it exported a <code>GtkPreview</code> widget, which could take an RGB
6543 image buffer supplied by the application and render it to an X window
6544 — this was what GIMP plug-ins could use in their user interface to
6545 show, well, previews of what they were about to do with the user's
6546 images. Later we got some obscure magic in a <code>GdkColorContext</code>
6547 object, which helped allocate X11 colors for the X drawing primitives.
6548 In turn, <code>GdkColorContext</code> came from the port that Miguel and I did of
6549 XmHTML's color context object (and for those that remember, XmHTML
6550 became the first version of GtkHtml; later it was rewritten as a port
6551 of KDE's HTML widget). Thankfully all that stuff is gone now; we can
6552 now assume that video cards are 24-bit RGB or better everywhere, and
6553 there is no need to worry about limited color palettes and color
6554 allocation.</p>
6555 <p>Later, we started using the Imlib library, from the Enlightenment
6556 project, as an easy API to load images — the APIs from libungif,
6557 libjpeg, libpng, etc. were not something one really wanted to use
6558 directly — and also to keep images in memory with a uniform
6559 representation. Unfortunately, Imlib's memory management was
6560 peculiar, as it was tied to Enlightenment's model for caching and
6561 rendering loaded/scaled images.</p>
6562 <p>A bunch of people worked to write GdkPixbuf: it kept Imlib's concepts
6563 of a unified representation for image data, and an easy API to load
6564 various image formats. It added support for an alpha channel (we only
6565 had 1-bit masks before), and it put memory management in the hands of
6566 the calling application, in the form of reference counting. GdkPixbuf
6567 obtained some high-quality scaling functions, mainly for use by Eye Of
6568 Gnome (our image viewer) and by applications that just needed scaling
6569 instead of arbitrary transformations.</p>
6570 <p>Later, we got libart, the first library in GNOME to do antialiased
6571 vector rendering and affine transformations. Libart was more or less
6572 compatible with GdkPixbuf: they both had the same internal
6573 representation for pixel data, but one had to pass the
6574 pixels/width/height/rowstride around by hand.</p>
6575 <h1>Mea culpa</h1>
6576 <p>Back then I didn't understand <a href="https://keithp.com/~keithp/porterduff/p253-porter.pdf">premultiplied alpha</a>,
6577 which is now ubiquitous. The GIMP made the decision to use
6578 non-premultiplied alpha when it introduced layers with transparency,
6579 probably to "avoid losing data" from transparent pixels. GdkPixbuf
6580 follows the same scheme.</p>
6581 <p>(Now that the GIMP uses GEGL for its internal representation of
6582 images... I have no idea what it does with respect to alpha.)</p>
6583 <h1>Cairo and afterwards</h1>
6584 <p>Some time after the libart days, we got Cairo and pixman. Cairo had a
6585 different representation of images than GdkPixbuf's, and it supported
6586 more pixel formats and color models.</p>
6587 <p>GTK2 got patched to use Cairo in the toplevel API. We still had a
6588 dichotomy between Cairo's image surfaces, which are ARGB premultiplied
6589 data in memory, and GdkPixbufs, which are RGBA non-premultiplied.
6590 There are utilities in GTK+ to do these translations, but they are
6591 inconvenient: every time a program loads an image with GdkPixbuf's
6592 easy API, a translation has to happen from non-premul RGBA to premul
6593 ARGB.</p>
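<p>To make that translation concrete, here is a sketch of the per-pixel work, assuming 8 bits per channel and round-to-nearest when multiplying by alpha. The function names are made up for this example, and the actual conversion code in GTK+ or librsvg may round differently.</p>

```rust
// Multiply one color channel by alpha, both in the 0..=255 range,
// rounding to the nearest integer.
fn premultiply(channel: u8, alpha: u8) -> u8 {
    ((channel as u32 * alpha as u32 + 127) / 255) as u8
}

// Convert one non-premultiplied RGBA pixel (bytes in R, G, B, A order)
// into a premultiplied ARGB value packed as 0xaarrggbb.
pub fn rgba_to_premul_argb32(rgba: [u8; 4]) -> u32 {
    let [r, g, b, a] = rgba;
    ((a as u32) << 24)
        | ((premultiply(r, a) as u32) << 16)
        | ((premultiply(g, a) as u32) << 8)
        | (premultiply(b, a) as u32)
}

fn main() {
    // Half-transparent pure red.
    let pixel = rgba_to_premul_argb32([255, 0, 0, 128]);
    println!("packed value: {:#010x}", pixel);

    // The same 32-bit value lands in memory in a different byte order
    // depending on the platform's endianness.
    println!("little-endian bytes: {:?}", pixel.to_le_bytes());
    println!("big-endian bytes:    {:?}", pixel.to_be_bytes());
}
```

<p>Doing this for every pixel of every loaded image, and sometimes undoing it again, is the work the rest of this post is about.</p>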
6594 <p>Having two formats means that we inevitably do translations back and
6595 forth of practically the same data. For example, when one embeds a
6596 JPEG inside an SVG, librsvg will read that JPEG using GdkPixbuf,
6597 translate it to Cairo's representation, composite it with Cairo onto
6598 the final result, and finally translate the whole thing back to a
6599 GdkPixbuf... if someone uses librsvg's legacy APIs to output pixbufs
6600 instead of rendering directly to a Cairo surface.</p>
6601 <p>Who uses that legacy API? GTK+, of course! GTK+ loads scalable SVG
6602 icons with GdkPixbuf's loader API, which dynamically links librsvg at
6603 runtime: in effect, GTK+ doesn't use librsvg directly. And the SVG
6604 pixbuf loader uses the "gimme a pixbuf" API in librsvg.</p>
6605 <h1>GPUs</h1>
6606 <p>Then, we got GPUs everywhere. Each GPU has its own preferred pixel
6607 format. Image data has to be copied to the GPU at some point.
6608 Cairo's ARGB needs to be translated to the GPU's preferred format and
6609 alignment.</p>
6610 <h1>Summary so far</h1>
6611 <ul>
6612 <li>
6613 <p>Libraries that load images from standard formats have different
6614 output formats. Generally they can be coaxed into spitting ARGB or
6615 RGBA, but we don't expect them to support any random representation
6616 that a GPU may want.</p>
6617 </li>
6618 <li>
6619 <p>GdkPixbuf uses non-premultiplied RGBA data, always in that order.</p>
6620 </li>
6621 <li>
6622 <p>Cairo uses premultiplied ARGB in platform-endian 32-bit chunks: if
6623 each pixel is 0xaarrggbb, then the bytes are shuffled around
6624 depending on whether the platform is little-endian or big-endian.</p>
6625 </li>
6626 <li>
6627 <p>Cairo internally uses a subset of the formats supported by pixman.</p>
6628 </li>
6629 <li>
6630 <p>GPUs use whatever they damn well please.</p>
6631 </li>
6632 <li>
6633 <p>Hilarity ensues.</p>
6634 </li>
6635 </ul>
6636 <h1>What would we like to do?</h1>
6637 <p>We would like to reduce the number of translations between image
6638 formats along the loading-processing-display pipeline. Here is a
6639 plan:</p>
6640 <ul>
6641 <li>
6642 <p>Make sure Cairo/pixman support the image formats that GPUs generally
6643 prefer. Have them do the necessary conversions if the rest of the
6644 program passes an unsupported format. Ensure that a Cairo image
6645 surface can be created with the GPU's preferred format.</p>
6646 </li>
6647 <li>
6648 <p>Make GdkPixbuf just be a wrapper around a Cairo image surface.
6649 <code>GdkPixbuf</code> is already an opaque structure, and it already knows how
6650 to copy pixel data in case the calling code requests it, or wants to
6651 turn a pixbuf from immutable to mutable.</p>
6652 </li>
6653 <li>
6654 <p>Provide GdkPixbuf APIs that deal with Cairo image surfaces. For
6655 example, deprecate <code>gdk_pixbuf_new()</code> and
6656 <code>gdk_pixbuf_new_from_data()</code>, in favor of a new
6657 <code>gdk_pixbuf_new_from_cairo_image_surface()</code>. Instead of
6658 <code>gdk_pixbuf_get_pixels()</code> and related functions, have
6659 <code>gdk_pixbuf_get_cairo_image_surface()</code>. Mark the "give me the pixel
6660 data" functions as highly discouraged, and only for use really by
6661 applications that want to use GdkPixbuf as an image loader and
6662 little else.</p>
6663 </li>
6664 <li>
6665 <p>Remove calls in GTK+ that cause image conversions; make them use
6666 Cairo image surfaces directly, from GdkTexture up.</p>
6667 </li>
6668 <li>
6669 <p>Audit applications to remove calls that cause image conversions.
6670 Generally, look for where they use GdkPixbuf's deprecated APIs and
6671 update them.</p>
6672 </li>
6673 </ul>
6674 <h1>Is this really a performance problem?</h1>
6675 <p>This is in the "<a href="https://people.gnome.org/~federico/docs/2005-GNOME-Summit/html/img13.html">excess work</a>" category of performance
6676 issues. All those conversions are not really slow (they don't make up
6677 the biggest part of profiles), but they are nevertheless things
6678 that we could avoid doing. We may get some speedups, but it's
6679 probably more interesting to look at things like power consumption.</p>
6680 <p>Right now I'm seeing this as a cool, minor optimization, but more as
6681 <strong>a way to gradually modernize our image API</strong>.</p>
6682 <p>We seem to change imaging models every N years (X11 -&gt; libart
6683 -&gt; Cairo -&gt; render trees in GPUs -&gt; ???). It is very hard to change
6684 applications to use different APIs. In the meantime, we can provide a
6685 more linear path for image data, instead of doing unnecessary
6686 conversions everywhere.</p>
6687 <h1>Code</h1>
6688 <p>I have a <a href="https://gitlab.gnome.org/federico/gdk-pixbuf/tree/use-cairo-surface-internally"><code>use-cairo-surface-internally</code> branch in
6689 gdk-pixbuf</a>,
6690 which I'll be working on this week. Meanwhile, you may be interested
6691 in the ongoing <a href="https://wiki.gnome.org/Hackfests/Performance2018">Performance Hackfest in Cambridge</a>!</p></content><category term="misc"></category><category term="performance"></category><category term="gdk-pixbuf"></category><category term="gnome"></category></entry><entry><title>Madrid GNOME+Rust Hackfest, part 3 (conclusion)</title><link href="https://people.gnome.org/~federico/blog/madrid-gnome-rust-3.html" rel="alternate"></link><published>2018-04-23T15:04:32-05:00</published><updated>2018-04-23T15:04:32-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-04-23:/~federico/blog/madrid-gnome-rust-3.html</id><summary type="html"><p>The last code I wrote during the hackfest was the start of code
6692 generation for GObject interfaces. This is so that you can do</p>
6693 <div class="highlight"><pre><span></span><code><span class="n">gobject_gen</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6694 <span class="w"> </span><span class="n">interface</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6695 <span class="w"> </span><span class="kr">virtual</span><span class="w"> </span><span class="k">fn</span> <span class="nf">frob</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w"></span>
6696 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6697 <span class="p">}</span><span class="w"></span>
6698 </code></pre></div>
6699
6700 <p>and it will generate the appropriate <code>FooIface</code> like one would expect
6701 with the C versions of interfaces.</p>
6702 <p>It turns …</p></summary><content type="html"><p>The last code I wrote during the hackfest was the start of code
6703 generation for GObject interfaces. This is so that you can do</p>
6704 <div class="highlight"><pre><span></span><code><span class="n">gobject_gen</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6705 <span class="w"> </span><span class="n">interface</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6706 <span class="w"> </span><span class="kr">virtual</span><span class="w"> </span><span class="k">fn</span> <span class="nf">frob</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w"></span>
6707 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6708 <span class="p">}</span><span class="w"></span>
6709 </code></pre></div>
6710
6711 <p>and it will generate the appropriate <code>FooIface</code> like one would expect
6712 with the C versions of interfaces.</p>
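<p>For readers who have not seen the C side, an interface struct is essentially a table of function pointers. The sketch below shows that shape in plain Rust. It is not the output of <code>gobject_gen!</code>: the <code>Foo</code> instance struct, the <code>frobbed</code> field and the helper functions are hypothetical, and the <code>GTypeInterface</code> field that a real GObject interface struct starts with is left out so the example needs no external crates.</p>

```rust
// Hypothetical instance struct; a real GObject instance has more in it.
#[repr(C)]
pub struct Foo {
    pub frobbed: bool,
}

// Hypothetical interface vtable: one function pointer per virtual method.
// A real GObject interface struct would begin with a GTypeInterface field.
#[repr(C)]
pub struct FooIface {
    pub frob: Option<unsafe extern "C" fn(this: *mut Foo)>,
}

// One implementation of the virtual method.
unsafe extern "C" fn foo_frob_impl(this: *mut Foo) {
    unsafe {
        (*this).frobbed = true;
    }
}

// Build a vtable that points at the implementation above.
pub fn default_iface() -> FooIface {
    FooIface { frob: Some(foo_frob_impl) }
}

// Dispatch through the table; do nothing if the slot is empty.
pub fn foo_frob(iface: &FooIface, this: &mut Foo) {
    if let Some(frob) = iface.frob {
        unsafe { frob(this as *mut Foo) };
    }
}

fn main() {
    let iface = default_iface();
    let mut foo = Foo { frobbed: false };
    foo_frob(&iface, &mut foo);
    println!("frobbed = {}", foo.frobbed);
}
```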
6713 <p>It turns out that this can share a lot of code from the existing code
6714 generator for classes: both classes and interfaces are "just virtual
6715 method tables", plus signals and properties, and classes can actually
6716 have per-instance fields and such. I started refactoring the code
6717 generator to allow this.</p>
6718 <p>I also took a second look at how to present good error messages when
6719 the <code>syn</code> crate encounters a parse error. I need to sit down at home
6720 and experiment with this carefully.</p>
6721 <h2>Back home</h2>
6722 <p>I'm back home now, jetlagged but very happy that gnome-class is in a much
6723 more advanced state than it was before the hackfest. I'm <strong>very
6724 thankful</strong> that practically everyone worked on it!</p>
6725 <p>Also, <strong>thanks</strong> to Alberto and Natalia for hosting me at their
6726 apartment and showing me around Madrid, all while wrangling their
6727 adorable baby Mario. We had a lovely time on Saturday, and ate
6728 excellent food downtown.</p>
6729 <p><img alt="Sponsored by the GNOME Foundation" src="https://people.gnome.org/~federico/blog/images/sponsored-badge-shadow.png"></p>
6730 <p><img alt="Hosted by OpenShine" src="https://www.openshine.com/wp-content/uploads/2016/03/openshine.png"></p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category><category term="hackfests"></category></entry><entry><title>Madrid GNOME+Rust Hackfest, part 2</title><link href="https://people.gnome.org/~federico/blog/madrid-gnome-rust-2.html" rel="alternate"></link><published>2018-04-20T08:59:23+02:00</published><updated>2018-04-20T08:59:23+02:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-04-20:/~federico/blog/madrid-gnome-rust-2.html</id><summary type="html"><p>Hacking on <a href="https://gitlab.gnome.org/federico/gnome-class">gnome-class</a> continues apace!</p>
6731 <p>Philippe <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/8">updated our dependencies</a>.</p>
6732 <p>Alberto made the syntax for per-instance <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/10">private structs</a>
6733 more ergonomic, and then made that code <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/12">nice and compact</a>.</p>
6734 <p>Martin improved our <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/11">conversion</a> from <code>CamelCase</code> to
6735 <code>snake_case</code> for code generation.</p>
6736 <p>Daniel added initial support for <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/9">GObject properties</a>.
6737 This is not finished yet …</p></summary><content type="html"><p>Hacking on <a href="https://gitlab.gnome.org/federico/gnome-class">gnome-class</a> continues apace!</p>
6738 <p>Philippe <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/8">updated our dependencies</a>.</p>
6739 <p>Alberto made the syntax for per-instance <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/10">private structs</a>
6740 more ergonomic, and then made that code <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/12">nice and compact</a>.</p>
6741 <p>Martin improved our <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/11">conversion</a> from <code>CamelCase</code> to
6742 <code>snake_case</code> for code generation.</p>
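<p>As a rough illustration of what such a conversion involves, here is a
minimal sketch. The function name <code>camel_to_snake</code> is hypothetical and
this is not the code from the merge request; it assumes identifiers without
acronyms, so a run of capitals such as <code>HTTPServer</code> would get an
underscore before every capital letter.</p>

```rust
/// Convert a `CamelCase` identifier to `snake_case`.
/// Hypothetical helper for illustration; not gnome-class's implementation.
fn camel_to_snake(name: &str) -> String {
    let mut out = String::with_capacity(name.len() + 4);
    for (i, ch) in name.chars().enumerate() {
        if ch.is_uppercase() {
            // Every capital after the first character starts a new word.
            if i > 0 {
                out.push('_');
            }
            out.extend(ch.to_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    println!("{}", camel_to_snake("GtkContainer"));
    println!("{}", camel_to_snake("StrokeLinejoin"));
}
```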
6743 <p>Daniel added initial support for <a href="https://gitlab.gnome.org/federico/gnome-class/merge_requests/9">GObject properties</a>.
6744 This is not finished yet, but the initial parser and code generation
6745 is done.</p>
6746 <p>Guillaume turned <a href="https://github.com/gtk-rs/gir">gir</a>, the binding generator in
6747 <a href="http://gtk-rs.org/">gtk-rs</a>, from a binary into a library crate. This will let
6748 us have all the GObject Introspection information for parent classes
6749 at compilation time.</p>
6750 <p>Antoni has been working on a tricky problem. <a href="https://gitlab.gnome.org/GNOME/gtk/blob/ecc612b1a2a0ec9791913cd22b4f94066f0448ae/gtk/gtkcontainer.h#L110">GTK+ structs that have
6751 bitfields</a> do not get reconstructed correctly from the
6752 GObject Introspection information — <a href="https://github.com/rust-lang/rfcs/issues/314">Rust does not handle C
6753 bitfields yet</a>. This has two implications.
6754 First, we lose some of the original struct fields in the generated
6755 bindings. Second, the sizes of the generated structs are not the
6756 same as the original C structs, so <code>g_type_register_static()</code>
6757 complains that one is trying to register an invalid class.</p>
6758 <p>Yesterday we got as far as reading the <a href="http://refspecs.linuxfoundation.org/elf/x86_64-abi-0.95.pdf">amd64</a> and ARM
6759 ABI manuals to see what the hell C compilers are supposed to do for
6760 laying out structs with bitfields. Most likely, we will have a
6761 temporary fix in <a href="https://github.com/gtk-rs/gir">gir</a>'s code generator so that it generates
6762 structs with the same layout as the C ones, with padding in place of
6763 the space for bitfields. Later we can remove this when rustc gets
6764 support for C bitfields.</p>
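<p>To make the padding idea concrete, here is a small sketch. The C struct in
the comment and the Rust names <code>Widget</code> and
<code>_bitfield_padding</code> are made up for illustration; this is not what
gir generates, and it assumes an ABI where the two bitfields share a single
<code>unsigned int</code> storage unit.</p>

```rust
use std::ffi::c_void;
use std::mem::{align_of, size_of};

// Hypothetical C struct, for illustration only:
//
//     struct Widget {
//         void        *parent;
//         unsigned int need_resize : 1;
//         unsigned int resize_mode : 2;
//         int          border_width;
//     };

// Rust mirror: the bitfields are not exposed as fields, but a padding
// member of the same size keeps the following fields at the same offsets
// and keeps the total size equal to the C struct's.
#[repr(C)]
pub struct Widget {
    pub parent: *mut c_void,
    _bitfield_padding: u32, // stands in for need_resize + resize_mode
    pub border_width: i32,
}

fn main() {
    println!(
        "size = {}, align = {}",
        size_of::<Widget>(),
        align_of::<Widget>()
    );
}
```

<p>With the sizes matching, a class struct like this would no longer trip the
size check in <code>g_type_register_static()</code>, although the bitfield
values themselves remain inaccessible from Rust.</p>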
6765 <p>I've been working on support for GObject interfaces. The basic
6766 parsing is done; I'm about to refactor the code generation so I can
6767 reuse the parts that fill vtables from classes.</p>
6768 <p>Yesterday we went to the Madrid Rust Meetup, a regular meeting of
6769 rustaceans here. Martin talked about WebRender; I talked about
6770 refactoring C to port it to Rust, and then Alex talked about Rust's
6771 plans for 2018. Fun times.</p>
6772 <p><img alt="Sponsored by the GNOME Foundation" src="https://people.gnome.org/~federico/blog/images/sponsored-badge-shadow.png"></p>
6773 <p><img alt="Hosted by OpenShine" src="https://www.openshine.com/wp-content/uploads/2016/03/openshine.png"></p>
6774 </content><category term="misc"></category><category term="rust"></category><category term="gnome"></category><category term="hackfests"></category><category term="librsvg"></category></entry><entry><title>Madrid GNOME+Rust Hackfest, part 1</title><link href="https://people.gnome.org/~federico/blog/madrid-gnome-rust-1.html" rel="alternate"></link><published>2018-04-18T02:55:12-05:00</published><updated>2018-04-18T02:55:12-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-04-18:/~federico/blog/madrid-gnome-rust-1.html</id><summary type="html"><p>I've been in Madrid since Monday, at the <a href="https://wiki.gnome.org/Hackfests/Rust2018">third GNOME+Rust hackfest</a>!
6775 The <a href="https://www.openshine.com/">OpenShine</a> folks are kindly letting us use their offices, on the
6776 seventh floor of a building by the <a href="https://www.openstreetmap.org/node/5303271991">Cuatro Caminos
6777 roundabout</a>.</p>
6778 <p>I am very, very thankful that this time everyone seems to be working
6779 on developing <a href="https://gitlab.gnome.org/federico/gnome-class">gnome-class</a>. It's …</p></summary><content type="html"><p>I've been in Madrid since Monday, at the <a href="https://wiki.gnome.org/Hackfests/Rust2018">third GNOME+Rust hackfest</a>!
6780 The <a href="https://www.openshine.com/">OpenShine</a> folks are kindly letting us use their offices, on the
6781 seventh floor of a building by the <a href="https://www.openstreetmap.org/node/5303271991">Cuatro Caminos
6782 roundabout</a>.</p>
6783 <p>I am very, very thankful that this time everyone seems to be working
6784 on developing <a href="https://gitlab.gnome.org/federico/gnome-class">gnome-class</a>. It's a difficult project for me, and
6785 more brainpower is definitely welcome — all the indirection, type
6786 conversion, GObject obscurity, and procedural macro shenanigans
6787 take their toll on me.</p>
6788 <h1>Gnome-class internals</h1>
6789 <p><img alt="Gnome-class internals on the whiteboard" src="https://people.gnome.org/~federico/blog/images/madrid-whiteboard.jpg"></p>
6790 <p>I explained how gnome-class works to the rest of the hackfest
6791 attendees. I've been writing a document on <a href="https://federico.pages.gitlab.gnome.org/gnome-class/">gnome-class's
6792 internals</a>, so the whiteboard was a whirlwind tour through
6793 it.</p>
6794 <h1>Error messages from the compiler</h1>
6795 <p>Antoni Boucher, the author of <a href="http://relm.ml/">relm</a> (a Rust crate to write GTK+
6796 asynchronous widgets with an Elm-like model), explained to me how relm
6797 manages to present good error messages from the Rust compiler, when
6798 the user's code has mistakes. Right now this is in a very bad state
6799 in gnome-class: user errors within the invocation of the procedural
6800 macro get shown by the compiler as errors <em>at</em> the macro call, so you
6801 don't get line number information that is meaningful.</p>
6802 <p>For a large part of the day we tried to refactor bits of gnome-class
6803 to do something similar. It is very slightly better now, but this
6804 really requires me to sit down calmly, at home, and to fully
6805 understand how relm does it and what changes are needed in the <a href="https://github.com/dtolnay/syn">syn</a>
6806 parser crate to make it easy to present good errors.</p>
6807 <p>I think I'll continue this work at home, as there is a lot of source
6808 code to understand: the combinator parsers in <a href="https://github.com/dtolnay/syn">syn</a>, the error
6809 handling scheme in <a href="http://relm.ml/">relm</a>, and the peculiarities of gnome-class.</p>
6810 <h1>Further work during the hackfest</h1>
6811 <p>Other people working on gnome-class are adding support for GObject
6812 properties, inheritance from non-Rust classes, and improving the
6813 ergonomics of class-private structures.</p>
6814 <p>I think I'll stop working on error messages for now, and focus instead
6815 on either supporting GTypeInterfaces, or completing support for type
6816 conversions for methods and signals.</p>
6817 <h1>Other happenings in Rust</h1>
6818 <p>Paolo Borelli has been <a href="https://gitlab.gnome.org/GNOME/librsvg/merge_requests/50">porting RsvgState to Rust</a> in librsvg.
6819 This is the big structure that holds all the CSS state for SVG
6820 elements. This is very meticulous work, and I'm thankful that Paolo
6821 is paying good attention to it. Soon we will have all the style
6822 machinery for librsvg in Rust, which will make it easier to use the
6823 <a href="https://crates.io/crates/selectors">selectors crate from Servo</a> instead of libcroco, as the
6824 latter is unmaintained.</p>
6825 <h1>Food</h1>
6826 <p><img alt="Food in Madrid" src="https://people.gnome.org/~federico/blog/images/madrid-food.jpg"></p>
6827 <p>Ah, Spanish food. We have been enjoying cheese, jamón, tortilla,
6828 pimientos, oxtail stews, natillas, café con leche...</p>
6829 <h1>Thanks</h1>
6830 <p>Thanks to <a href="https://www.openshine.com/">OpenShine</a> for hosting the hackfest, and to the GNOME
6831 Foundation for sponsoring my travel. And thanks to Alberto Ruiz for
6832 putting me up in his house!</p>
6833 <p><img alt="Sponsored by the GNOME Foundation" src="https://people.gnome.org/~federico/blog/images/sponsored-badge-shadow.png"></p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category><category term="hackfests"></category><category term="librsvg"></category></entry><entry><title>Refactoring some repetitive code to a Rust macro</title><link href="https://people.gnome.org/~federico/blog/refactoring-some-repetitive-code-to-a-macro.html" rel="alternate"></link><published>2018-03-23T11:01:30-06:00</published><updated>2018-03-23T11:01:30-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-03-23:/~federico/blog/refactoring-some-repetitive-code-to-a-macro.html</id><summary type="html"><p>I have started porting the code in librsvg that parses SVG's CSS
6834 properties from C to Rust. Many properties have symbolic values:</p>
6835 <div class="highlight"><pre><span></span><code>stroke-linejoin: miter | round | bevel | inherit
6836
6837 stroke-linecap: butt | round | square | inherit
6838
6839 fill-rule: nonzero | evenodd | inherit
6840 </code></pre></div>
6841
6842 <p><code>StrokeLinejoin</code> is the first property that I ported. First I had to
6843 write a …</p></summary><content type="html"><p>I have started porting the code in librsvg that parses SVG's CSS
6844 properties from C to Rust. Many properties have symbolic values:</p>
6845 <div class="highlight"><pre><span></span><code>stroke-linejoin: miter | round | bevel | inherit
6846
6847 stroke-linecap: butt | round | square | inherit
6848
6849 fill-rule: nonzero | evenodd | inherit
6850 </code></pre></div>
6851
6852 <p><code>StrokeLinejoin</code> is the first property that I ported. First I had to
6853 write a little bunch of machinery to allow CSS properties to be kept
6854 in Rust-space instead of the main C structure that holds them
6855 (upcoming blog post about that). But for now, I just want to show how
6856 this boiled down to a macro after refactoring.</p>
6857 <h1>First cut at the code</h1>
6858 <p>The <code>stroke-linejoin</code> property can have the values <code>miter</code>, <code>round</code>,
6859 <code>bevel</code>, or <code>inherit</code>. Here is an enum definition for those values,
6860 and the conventional machinery which librsvg uses to parse property values:</p>
6861 <div class="highlight"><pre><span></span><code><span class="cp">#[derive(Debug, Copy, Clone)]</span><span class="w"></span>
6862 <span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">StrokeLinejoin</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6863 <span class="w"> </span><span class="n">Miter</span><span class="p">,</span><span class="w"></span>
6864 <span class="w"> </span><span class="n">Round</span><span class="p">,</span><span class="w"></span>
6865 <span class="w"> </span><span class="n">Bevel</span><span class="p">,</span><span class="w"></span>
6866 <span class="w"> </span><span class="n">Inherit</span><span class="p">,</span><span class="w"></span>
6867 <span class="p">}</span><span class="w"></span>
6868
6869 <span class="k">impl</span><span class="w"> </span><span class="n">Parse</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">StrokeLinejoin</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6870 <span class="w"> </span><span class="k">type</span> <span class="nc">Data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w"></span>
6871 <span class="w"> </span><span class="k">type</span> <span class="nb">Err</span> <span class="o">=</span><span class="w"> </span><span class="n">AttributeError</span><span class="p">;</span><span class="w"></span>
6872
6873 <span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">_</span>: <span class="nc">Self</span>::<span class="n">Data</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">StrokeLinejoin</span><span class="p">,</span><span class="w"> </span><span class="n">AttributeError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6874 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">trim</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6875 <span class="w"> </span><span class="s">&quot;miter&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinejoin</span>::<span class="n">Miter</span><span class="p">),</span><span class="w"></span>
6876 <span class="w"> </span><span class="s">&quot;round&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinejoin</span>::<span class="n">Round</span><span class="p">),</span><span class="w"></span>
6877 <span class="w"> </span><span class="s">&quot;bevel&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinejoin</span>::<span class="n">Bevel</span><span class="p">),</span><span class="w"></span>
6878 <span class="w"> </span><span class="s">&quot;inherit&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinejoin</span>::<span class="n">Inherit</span><span class="p">),</span><span class="w"></span>
6879 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">AttributeError</span>::<span class="n">from</span><span class="p">(</span><span class="n">ParseError</span>::<span class="n">new</span><span class="p">(</span><span class="s">&quot;invalid value&quot;</span><span class="p">))),</span><span class="w"></span>
6880 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6881 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6882 <span class="p">}</span><span class="w"></span>
6883 </code></pre></div>
6884
6885 <p>We <code>match</code> the allowed string values and map them to enum values. No
6886 big deal, right?</p>
6887 <p>Properties also have a default value. For example, the SVG spec says
6888 that if a shape doesn't have a <code>stroke-linejoin</code> property specified,
6889 it will use <code>miter</code> by default. Let's implement that:</p>
6890 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="nb">Default</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">StrokeLinejoin</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6891 <span class="w"> </span><span class="k">fn</span> <span class="nf">default</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">StrokeLinejoin</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6892 <span class="w"> </span><span class="n">StrokeLinejoin</span>::<span class="n">Miter</span><span class="w"></span>
6893 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6894 <span class="p">}</span><span class="w"></span>
6895 </code></pre></div>
6896
6897 <p>So far, we have three things:</p>
6898 <ul>
6899 <li>An enum definition for the property's possible values.</li>
6900 <li><code>impl Parse</code> so we can parse the property from a string.</li>
6901 <li><code>impl Default</code> so the property knows its default value.</li>
6902 </ul>
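<p>For readers who want to try these three pieces outside of librsvg, here is a
self-contained sketch. The <code>Parse</code> trait definition is an assumption
inferred from the <code>impl</code> blocks above, and <code>AttributeError</code>
is a stand-in type; the real librsvg definitions may differ.</p>

```rust
// Assumed shape of the `Parse` trait, inferred from the impls in this post.
pub trait Parse: Sized {
    type Data;
    type Err;

    fn parse(s: &str, data: Self::Data) -> Result<Self, Self::Err>;
}

// Stand-in error type (hypothetical); librsvg has its own AttributeError.
#[derive(Debug, PartialEq)]
pub struct AttributeError(&'static str);

#[derive(Debug, Copy, Clone, PartialEq)]
pub enum StrokeLinejoin {
    Miter,
    Round,
    Bevel,
    Inherit,
}

impl Parse for StrokeLinejoin {
    type Data = ();
    type Err = AttributeError;

    fn parse(s: &str, _: Self::Data) -> Result<StrokeLinejoin, AttributeError> {
        match s.trim() {
            "miter" => Ok(StrokeLinejoin::Miter),
            "round" => Ok(StrokeLinejoin::Round),
            "bevel" => Ok(StrokeLinejoin::Bevel),
            "inherit" => Ok(StrokeLinejoin::Inherit),
            _ => Err(AttributeError("invalid value")),
        }
    }
}

impl Default for StrokeLinejoin {
    fn default() -> StrokeLinejoin {
        StrokeLinejoin::Miter
    }
}

fn main() {
    // Surrounding whitespace is trimmed before matching.
    println!("{:?}", StrokeLinejoin::parse(" round ", ()));

    // An unparseable value can fall back to the property's default.
    let join = StrokeLinejoin::parse("bogus", ()).unwrap_or_default();
    println!("{:?}", join);
}
```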
6903 <h1>Where things got repetitive</h1>
6904 <p>The next property I ported was <code>stroke-linecap</code>, which can take the
6905 following values:</p>
6906 <div class="highlight"><pre><span></span><code><span class="cp">#[derive(Debug, Copy, Clone)]</span><span class="w"></span>
6907 <span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">StrokeLinecap</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6908 <span class="w"> </span><span class="n">Butt</span><span class="p">,</span><span class="w"></span>
6909 <span class="w"> </span><span class="n">Round</span><span class="p">,</span><span class="w"></span>
6910 <span class="w"> </span><span class="n">Square</span><span class="p">,</span><span class="w"></span>
6911 <span class="w"> </span><span class="n">Inherit</span><span class="p">,</span><span class="w"></span>
6912 <span class="p">}</span><span class="w"></span>
6913 </code></pre></div>
6914
6915 <p>This is similar in shape to the <code>StrokeLinejoin</code> enum above;
6916 only the names differ.</p>
6917 <p>The parsing has exactly the same shape, and just different values:</p>
6918 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">Parse</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">StrokeLinecap</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6919 <span class="w"> </span><span class="k">type</span> <span class="nc">Data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w"></span>
6920 <span class="w"> </span><span class="k">type</span> <span class="nb">Err</span> <span class="o">=</span><span class="w"> </span><span class="n">AttributeError</span><span class="p">;</span><span class="w"></span>
6921
6922 <span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">_</span>: <span class="nc">Self</span>::<span class="n">Data</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">StrokeLinecap</span><span class="p">,</span><span class="w"> </span><span class="n">AttributeError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6923 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">trim</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6924 <span class="w"> </span><span class="s">&quot;butt&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinecap</span>::<span class="n">Butt</span><span class="p">),</span><span class="w"></span>
6925 <span class="w"> </span><span class="s">&quot;round&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinecap</span>::<span class="n">Round</span><span class="p">),</span><span class="w"></span>
6926 <span class="w"> </span><span class="s">&quot;square&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinecap</span>::<span class="n">Square</span><span class="p">),</span><span class="w"></span>
6927 <span class="w"> </span><span class="s">&quot;inherit&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">StrokeLinecap</span>::<span class="n">Inherit</span><span class="p">),</span><span class="w"></span>
6928
6929 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">AttributeError</span>::<span class="n">from</span><span class="p">(</span><span class="n">ParseError</span>::<span class="n">new</span><span class="p">(</span><span class="s">&quot;invalid value&quot;</span><span class="p">))),</span><span class="w"></span>
6930 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6931 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6932 <span class="p">}</span><span class="w"></span>
6933 </code></pre></div>
6934
6935 <p>Same thing with the default:</p>
6936 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="nb">Default</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">StrokeLinecap</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6937 <span class="w"> </span><span class="k">fn</span> <span class="nf">default</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">StrokeLinecap</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6938 <span class="w"> </span><span class="n">StrokeLinecap</span>::<span class="n">Butt</span><span class="w"></span>
6939 <span class="w"> </span><span class="p">}</span><span class="w"></span>
6940 <span class="p">}</span><span class="w"></span>
6941 </code></pre></div>
6942
6943 <p>Yes, the SVG spec has</p>
6944 <div class="highlight"><pre><span></span><code><span class="k">default</span><span class="o">:</span> <span class="n">butt</span>
6945 </code></pre></div>
6946
6947 <p>somewhere in it, much to the delight of the 12-year-old in me.</p>
6948 <h1>Refactoring to a macro</h1>
6949 <p>Here I wanted to define a <code>make_ident_property!()</code> macro that would
6950 get invoked like this:</p>
6951 <div class="highlight"><pre><span></span><code><span class="n">make_ident_property</span><span class="o">!</span><span class="p">(</span><span class="w"></span>
6952 <span class="w"> </span><span class="n">StrokeLinejoin</span><span class="p">,</span><span class="w"></span>
6953 <span class="w"> </span><span class="n">default</span>: <span class="nc">Miter</span><span class="p">,</span><span class="w"></span>
6954
6955 <span class="w"> </span><span class="s">&quot;miter&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Miter</span><span class="p">,</span><span class="w"></span>
6956 <span class="w"> </span><span class="s">&quot;round&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Round</span><span class="p">,</span><span class="w"></span>
6957 <span class="w"> </span><span class="s">&quot;bevel&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Bevel</span><span class="p">,</span><span class="w"></span>
6958 <span class="w"> </span><span class="s">&quot;inherit&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Inherit</span><span class="p">,</span><span class="w"></span>
6959 <span class="p">);</span><span class="w"></span>
6960 </code></pre></div>
6961
6962 <p>It's called <code>make_ident_property</code> because it makes a property
6963 definition from simple string identifiers. It has the name of the
6964 property (<code>StrokeLinejoin</code>), a <code>default</code> value, and a few repeating
6965 elements, one for each possible value.</p>
6966 <p>In Rust-speak, the macro's basic pattern is like this:</p>
6967 <div class="highlight"><pre><span></span><code><span class="fm">macro_rules!</span><span class="w"> </span><span class="n">make_ident_property</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6968 <span class="w"> </span><span class="p">(</span><span class="cp">$name</span>: <span class="nc">ident</span><span class="p">,</span><span class="w"></span>
6969 <span class="w"> </span><span class="n">default</span>: <span class="cp">$default</span>: <span class="nc">ident</span><span class="p">,</span><span class="w"></span>
6970 <span class="w"> </span><span class="cp">$($str_prop</span>: <span class="nc">expr</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="cp">$variant</span>: <span class="nc">ident</span><span class="p">,)</span><span class="o">+</span><span class="w"></span>
6971 <span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6972 <span class="w"> </span><span class="c1">// ... macro body will go here ...</span>
6973 <span class="w"> </span><span class="p">};</span><span class="w"></span>
6974 <span class="p">}</span><span class="w"></span>
6975 </code></pre></div>
6976
6977 <p>Let's dissect that pattern:</p>
6978 <div class="highlight"><pre><span></span><code><span class="fm">macro_rules!</span><span class="w"> </span><span class="n">make_ident_property</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6979 <span class="w"> </span><span class="p">(</span><span class="cp">$name</span>: <span class="nc">ident</span><span class="p">,</span><span class="w"></span>
6980 <span class="c1">// ^^^^^^^^^^^^ will match an identifier and put it in $name</span>
6981
6982 <span class="w"> </span><span class="n">default</span>: <span class="cp">$default</span>: <span class="nc">ident</span><span class="p">,</span><span class="w"></span>
6983 <span class="c1">// ^^^^^^^^^^^^^^^ will match an identifier and put it in $default</span>
6984 <span class="c1">// ^^^^^^^^ arbitrary text</span>
6985
6986 <span class="w"> </span><span class="cp">$($str_prop</span>: <span class="nc">expr</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="cp">$variant</span>: <span class="nc">ident</span><span class="p">,)</span><span class="o">+</span><span class="w"></span>
6987 <span class="c1">// ^^ arbitrary text</span>
6988 <span class="c1">// ^^ start of repetition ^^ end of repetition, repeats one or more times</span>
6989
6990 <span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
6991 <span class="w"> </span><span class="c1">// ...</span>
6992 <span class="w"> </span><span class="p">};</span><span class="w"></span>
6993 <span class="p">}</span><span class="w"></span>
6994 </code></pre></div>
6995
6996 <p>For example, saying "<code>$foo: ident</code>" in a macro's pattern means that the
6997 compiler will expect an identifier, and bind it to <code>$foo</code> within the
6998 macro's definition.</p>
6999 <p>Similarly, an <code>expr</code> means that the compiler will
7000 look for an expression — in this case, we want one of the string
7001 values.</p>
7002 <p>In a macro pattern, anything that is not a binding is just arbitrary
7003 text which must appear in the macro's invocation. This is how we can
7004 create a little syntax of our own within the macro: the "<code>default:</code>"
7005 part, and the "<code>=&gt;</code>" inside each string/symbol pair.</p>
7006 <p>Finally, macro patterns allow repetition. Anything within <code>$(...)</code>
7007 indicates repetition. Here, <code>$(...)+</code> indicates that the
7008 compiler must match one or more of the repeating elements.</p>
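<p>Here is a tiny standalone macro that exercises the same pattern features:
an <code>ident</code> binding, a piece of literal text, <code>expr</code>
bindings, and a <code>$(...)+</code> repetition. The names
<code>make_lookup</code>, <code>fallback</code>, and <code>digit_value</code>
are invented for this sketch and have nothing to do with librsvg.</p>

```rust
// Hypothetical macro for illustration; not part of librsvg.
macro_rules! make_lookup {
    ($fn_name: ident,
     fallback: $fallback: expr,
     $($key: expr => $value: expr,)+
    ) => {
        fn $fn_name(s: &str) -> i32 {
            // The repetition expands to one `if` per `key => value` pair.
            $(
                if s == $key {
                    return $value;
                }
            )+
            $fallback
        }
    };
}

make_lookup!(
    digit_value,
    fallback: -1,

    "one" => 1,
    "two" => 2,
    "three" => 3,
);

fn main() {
    println!("{}", digit_value("two"));
    println!("{}", digit_value("ten"));
}
```

<p>Leaving out the <code>fallback:</code> text or one of the <code>=&gt;</code>
tokens in the invocation makes the compiler reject it, since the invocation no
longer matches the pattern.</p>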
7009 <p>I pasted the duplicated code, and substituted the actual symbol names
7010 for the macro's bindings:</p>
7011 <div class="highlight"><pre><span></span><code><span class="fm">macro_rules!</span><span class="w"> </span><span class="n">make_ident_property</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7012 <span class="w"> </span><span class="p">(</span><span class="cp">$name</span>: <span class="nc">ident</span><span class="p">,</span><span class="w"></span>
7013 <span class="w"> </span><span class="n">default</span>: <span class="cp">$default</span>: <span class="nc">ident</span><span class="p">,</span><span class="w"></span>
7014 <span class="w"> </span><span class="cp">$($str_prop</span>: <span class="nc">expr</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="cp">$variant</span>: <span class="nc">ident</span><span class="p">,)</span><span class="o">+</span><span class="w"></span>
7015 <span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7016 <span class="w"> </span><span class="cp">#[derive(Debug, Copy, Clone)]</span><span class="w"></span>
7017 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="cp">$name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7018 <span class="w"> </span><span class="cp">$($variant</span><span class="p">),</span><span class="o">+</span><span class="w"></span>
7019 <span class="c1">// ^^^^^^^^^^^^^ this is how we invoke a repeated element</span>
7020
7021 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7022
7023 <span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="nb">Default</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="cp">$name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7024 <span class="w"> </span><span class="k">fn</span> <span class="nf">default</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="cp">$name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7025 <span class="w"> </span><span class="cp">$name</span>::<span class="cp">$default</span><span class="w"></span>
7026 <span class="c1">// ^^^^^^^^^^^^^^^ construct an enum::variant</span>
7027
7028 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7029 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7030
7031 <span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">Parse</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="cp">$name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7032 <span class="w"> </span><span class="k">type</span> <span class="nc">Data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w"></span>
7033 <span class="w"> </span><span class="k">type</span> <span class="nb">Err</span> <span class="o">=</span><span class="w"> </span><span class="n">AttributeError</span><span class="p">;</span><span class="w"></span>
7034
7035 <span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">_</span>: <span class="nc">Self</span>::<span class="n">Data</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="cp">$name</span><span class="p">,</span><span class="w"> </span><span class="n">AttributeError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7036 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">trim</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7037 <span class="w"> </span><span class="cp">$($str_prop</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="cp">$name</span>::<span class="cp">$variant</span><span class="p">),)</span><span class="o">+</span><span class="w"></span>
7038 <span class="c1">// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expand repeated elements</span>
7039
7040 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">AttributeError</span>::<span class="n">from</span><span class="p">(</span><span class="n">ParseError</span>::<span class="n">new</span><span class="p">(</span><span class="s">&quot;invalid value&quot;</span><span class="p">))),</span><span class="w"></span>
7041 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7042 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7043 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7044 <span class="w"> </span><span class="p">};</span><span class="w"></span>
7045 <span class="p">}</span><span class="w"></span>
7046 </code></pre></div>
7047
7048 <h1>Getting rid of duplicated code</h1>
7049 <p>Now we have a macro that we can call to define new properties.
7050 Librsvg now has this, which is much more readable than all the code
7051 written by hand:</p>
7052 <div class="highlight"><pre><span></span><code><span class="n">make_ident_property</span><span class="o">!</span><span class="p">(</span><span class="w"></span>
7053 <span class="w"> </span><span class="n">StrokeLinejoin</span><span class="p">,</span><span class="w"></span>
7054 <span class="w"> </span><span class="n">default</span>: <span class="nc">Miter</span><span class="p">,</span><span class="w"></span>
7055
7056 <span class="w"> </span><span class="s">&quot;miter&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Miter</span><span class="p">,</span><span class="w"></span>
7057 <span class="w"> </span><span class="s">&quot;round&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Round</span><span class="p">,</span><span class="w"></span>
7058 <span class="w"> </span><span class="s">&quot;bevel&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Bevel</span><span class="p">,</span><span class="w"></span>
7059 <span class="w"> </span><span class="s">&quot;inherit&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Inherit</span><span class="p">,</span><span class="w"></span>
7060 <span class="p">);</span><span class="w"></span>
7061
7062 <span class="n">make_ident_property</span><span class="o">!</span><span class="p">(</span><span class="w"></span>
7063 <span class="w"> </span><span class="n">StrokeLinecap</span><span class="p">,</span><span class="w"></span>
7064 <span class="w"> </span><span class="n">default</span>: <span class="nc">Butt</span><span class="p">,</span><span class="w"> </span><span class="c1">// :)</span>
7065
7066 <span class="w"> </span><span class="s">&quot;butt&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Butt</span><span class="p">,</span><span class="w"></span>
7067 <span class="w"> </span><span class="s">&quot;round&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Round</span><span class="p">,</span><span class="w"></span>
7068 <span class="w"> </span><span class="s">&quot;square&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Square</span><span class="p">,</span><span class="w"></span>
7069 <span class="w"> </span><span class="s">&quot;inherit&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Inherit</span><span class="p">,</span><span class="w"></span>
7070 <span class="p">);</span><span class="w"></span>
7071
7072 <span class="n">make_ident_property</span><span class="o">!</span><span class="p">(</span><span class="w"></span>
7073 <span class="w"> </span><span class="n">FillRule</span><span class="p">,</span><span class="w"></span>
7074 <span class="w"> </span><span class="n">default</span>: <span class="nc">NonZero</span><span class="p">,</span><span class="w"></span>
7075
7076 <span class="w"> </span><span class="s">&quot;nonzero&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">NonZero</span><span class="p">,</span><span class="w"></span>
7077 <span class="w"> </span><span class="s">&quot;evenodd&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">EvenOdd</span><span class="p">,</span><span class="w"></span>
7078 <span class="w"> </span><span class="s">&quot;inherit&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">Inherit</span><span class="p">,</span><span class="w"></span>
7079 <span class="p">);</span><span class="w"></span>
7080 </code></pre></div>
7081
7082 <p>Etcetera. It's now easy to port similar symbol-based properties from
7083 C to Rust.</p>
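<p>For readers who want to try the pattern in isolation, here is a minimal, self-contained sketch of the same macro. The <code>Parse</code> trait and <code>AttributeError</code> below are simplified stand-ins written for this example, not librsvg's actual definitions. The sketch also matches the string keys with the <code>literal</code> fragment specifier rather than <code>expr</code>, which is a deliberate departure from the macro shown above.</p>

```rust
// Simplified stand-ins for librsvg's trait and error type (hypothetical).
pub trait Parse: Sized {
    type Data;
    type Err;
    fn parse(s: &str, data: Self::Data) -> Result<Self, Self::Err>;
}

#[derive(Debug, PartialEq)]
pub struct AttributeError(pub String);

macro_rules! make_ident_property {
    ($name: ident,
     default: $default: ident,
     // `literal` instead of the post's `expr`: the keys are string literals
     // that end up in pattern position inside the `match` below.
     $($str_prop: literal => $variant: ident,)+
    ) => {
        #[derive(Debug, Copy, Clone, PartialEq)]
        pub enum $name {
            $($variant),+
        }

        impl Default for $name {
            fn default() -> $name {
                $name::$default
            }
        }

        impl Parse for $name {
            type Data = ();
            type Err = AttributeError;

            fn parse(s: &str, _: Self::Data) -> Result<$name, AttributeError> {
                match s.trim() {
                    // One match arm is generated per "string" => Variant pair.
                    $($str_prop => Ok($name::$variant),)+
                    _ => Err(AttributeError(format!("invalid value: {}", s))),
                }
            }
        }
    };
}

// One invocation generates the enum, its Default impl and its parser.
make_ident_property!(
    FillRule,
    default: NonZero,

    "nonzero" => NonZero,
    "evenodd" => EvenOdd,
    "inherit" => Inherit,
);
```

<p>With this in place, <code>FillRule::default()</code> yields <code>FillRule::NonZero</code>, and <code>FillRule::parse("evenodd", ())</code> returns the matching variant or an error for unknown strings.</p>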
7084 <p>Eventually I'll need to refactor all the crap that deals with
7085 inheritable properties, but that's for another time.</p>
7086 <h1>Conclusion and references</h1>
7087 <p>Rust macros are a powerful tool for refactoring repetitive code like this.</p>
7088 <p><a href="https://doc.rust-lang.org/book/second-edition/appendix-04-macros.html">The Rust book</a>
7089 has an introductory appendix to macros, and <a href="https://danielkeep.github.io/tlborm/book/index.html">The Little Book of Rust
7090 Macros</a> is a
7091 fantastic resource that really dives into what you can do.</p></content><category term="misc"></category><category term="rust"></category><category term="librsvg"></category></entry><entry><title>Making sure the repository doesn't break, automatically</title><link href="https://people.gnome.org/~federico/blog/making-sure-the-repository-doesnt-break.html" rel="alternate"></link><published>2018-03-20T19:37:02-06:00</published><updated>2018-03-20T20:33:46-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-03-20:/~federico/blog/making-sure-the-repository-doesnt-break.html</id><summary type="html"><p>Gitlab has a fairly conventional Continuous Integration system: you
7092 push some commits, the CI pipelines build the code and presumably run
7093 the test suite, and later you can know if this succeeded or failed.</p>
7094 <p>But by the time something fails, the broken code is already in the
7095 public repository.</p>
7096 <p>The …</p></summary><content type="html"><p>Gitlab has a fairly conventional Continuous Integration system: you
7097 push some commits, the CI pipelines build the code and presumably run
7098 the test suite, and later you can know if this succeeded or failed.</p>
7099 <p>But by the time something fails, the broken code is already in the
7100 public repository.</p>
7101 <p>The Rust community uses Bors, a bot that prevents this from happening:</p>
7102 <ul>
7103 <li>
7104 <p>You push some commits and submit a merge request.</p>
7105 </li>
7106 <li>
7107 <p>A human looks at your merge request; they may tell you to make
7108 changes, or they may tell Bors that your request is approved for
7109 merging.</p>
7110 </li>
7111 <li>
7112 <p>Bors looks for approved merge requests. It merges each into a
7113 <em>temporary branch</em> and waits for the CI pipeline to run there. If
7114 CI passes, Bors automatically merges to master. If CI fails, Bors
7115 annotates the merge request with the failure, <strong>and the main
7116 repository stays working</strong>.</p>
7117 </li>
7118 </ul>
7119 <p>Bors also tells you if the mainline has moved forward and there's a
7120 merge conflict. In that case you need to do a rebase yourself; the
7121 repository stays working in the meantime.</p>
7122 <p>This leads to a very fair, very transparent process for contributors
7123 and for maintainers. For all the details, watch <a href="https://www.youtube.com/watch?v=dIageYT0Vgg">Emily Dunham's
7124 presentation on Rust's community
7125 automation</a>
7126 (<a href="http://edunham.net/2016/09/27/rust_s_community_automation.html">transcript</a>).</p>
7127 <p>For a description of where Bors came from, read <a href="https://graydon.livejournal.com/186550.html">Graydon Hoare's
7128 blog</a>.</p>
7129 <p><a href="https://github.com/graydon/bors">Bors</a> evolved into
7130 <a href="https://github.com/servo/homu">Homu</a> and it is what Rust and Servo
7131 use currently. However, Homu depends on Github.</p>
7132 <p>I just found out that there is a <a href="https://github.com/coldnight/homu-gitlab">port of Homu for
7133 Gitlab</a>. Would anyone care
7134 to set it up?</p>
7135 <p><strong>Update:</strong> <a href="https://a.weirder.earth/@bb010g/99719506883461036">Two</a>
7136 <a href="https://octodon.social/@graydon/99719514193737493">people</a> have
7137 suggested porting <a href="https://bors.tech/">Bors-ng</a> to Gitlab instead,
7138 <a href="https://a.weirder.earth/@bb010g/99719537971696863">for scalability
7139 reasons</a>.</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category><category term="librsvg"></category><category term="cairo"></category></entry><entry><title>Librsvg and Gnome-class accepting interns</title><link href="https://people.gnome.org/~federico/blog/interns-summer-2018.html" rel="alternate"></link><published>2018-03-12T19:00:08-06:00</published><updated>2018-03-13T10:04:33-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-03-12:/~federico/blog/interns-summer-2018.html</id><summary type="html"><p>I would like to mentor people for <a href="https://gitlab.gnome.org/GNOME/librsvg">librsvg</a> and <a href="https://gitlab.gnome.org/federico/gnome-class">gnome-class</a> this
7140 Summer, both for <a href="https://www.outreachy.org/">Outreachy</a> and <a href="https://wiki.gnome.org/Outreach/SummerOfCode">Summer of Code</a>.</p>
7141 <h1>Librsvg projects</h1>
7142 <p><strong><em>Project:</em></strong> <a href="https://www.outreachy.org/2018-may-august/communities/gnome/">port filter effects from C to Rust</a></p>
7143 <p>Currently librsvg implements SVG filter effects in C. These are basic
7144 image processing filters like Gaussian blur, matrix convolution,
7145 Porter-Duff alpha …</p></summary><content type="html"><p>I would like to mentor people for <a href="https://gitlab.gnome.org/GNOME/librsvg">librsvg</a> and <a href="https://gitlab.gnome.org/federico/gnome-class">gnome-class</a> this
7146 Summer, both for <a href="https://www.outreachy.org/">Outreachy</a> and <a href="https://wiki.gnome.org/Outreach/SummerOfCode">Summer of Code</a>.</p>
7147 <h1>Librsvg projects</h1>
7148 <p><strong><em>Project:</em></strong> <a href="https://www.outreachy.org/2018-may-august/communities/gnome/">port filter effects from C to Rust</a></p>
7149 <p>Currently librsvg implements SVG filter effects in C. These are basic
7150 image processing filters like Gaussian blur, matrix convolution,
7151 Porter-Duff alpha compositing, etc.</p>
7152 <p>There are <a href="https://gitlab.gnome.org/GNOME/librsvg/milestones/4">some things</a> that need to be done:</p>
7153 <ul>
7154 <li>
7155 <p>Split the single <code>rsvg-filter.c</code> into multiple source files, so it's
7156 easier to port each one individually.</p>
7157 </li>
7158 <li>
7159 <p>Figure out the common infrastructure: <code>RsvgFilter</code>,
7160 <code>RsvgFilterPrimitive</code>. All the filters use these to store
7161 intermediate results when processing SVG elements.</p>
7162 </li>
7163 <li>
7164 <p>Experiment with the correct Rust abstractions to process images
7165 pixel-by-pixel. We would like to omit per-pixel bounds checks on
7166 array accesses. The <a href="https://crates.io/crates/image">image crate</a> has some nice iterator
7167 traits for pixels. WebKit's implementation of SVG filters also has
7168 interesting abstractions for things like the need for a sliding
7169 window with edge handling for Gaussian blurs.</p>
7170 </li>
7171 <li>
7172 <p>Ensure that our current filters code is actually working. Not all
7173 of the official SVG test suite's tests are in place right now for
7174 the filter effects; it is likely that some of our implementation is
7175 broken.</p>
7176 </li>
7177 </ul>
7178 <p>For this project, it will be especially helpful to have a little
7179 background in image processing. You don't need to be an expert; just
7180 to have done some pixel crunching at some point. You need to be able
7181 to read C and write Rust.</p>
7182 <p><strong><em>Project:</em></strong> <a href="https://gitlab.gnome.org/GNOME/librsvg/milestones/6">CSS styling with rust-selectors</a></p>
7183 <p>Librsvg uses a very simplistic algorithm for CSS cascading. It uses
7184 libcroco to parse CSS style data; libcroco is unmaintained and rather
7185 prone to exploits. I want to use Servo's selectors crate to do the
7186 cascading; we already use the rust-cssparser crate as a tokenizer for
7187 basic CSS properties.</p>
7188 <ul>
7189 <li>
7190 <p>For each node in its DOM tree, librsvg's <code>Node</code> structure keeps a
7191 <code>Vec&lt;&gt;</code> of children. <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/220">We need to move this to store the next
7192 sibling and the first/last children instead</a>. This is the
7193 data structure that rust-selectors prefers. The Kuchiki crate has
7194 an example implementation; borrowing some patterns from there could
7195 also help us simplify our reference counting for nodes.</p>
7196 </li>
7197 <li>
7198 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/issues/223">Our styling machinery needs porting to Rust</a>. We have a
7199 big <code>RsvgState</code> struct which holds the CSS state for each node. It
7200 is easy to port this to Rust; it's more interesting to gradually
7201 move it to a scheme like Servo's, with a distinction between
7202 specified/computed/used values for each CSS property.</p>
7203 </li>
7204 </ul>
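<p>As a rough illustration of the first item, a node that stores links to its first child, last child, and next sibling could look like the sketch below. The type and method names are hypothetical and are not taken from librsvg, Kuchiki, or rust-selectors. The parent and last-child links are weak references so that the tree contains no reference cycles.</p>

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical tree node that links to siblings instead of owning a Vec of children.
pub struct Node {
    pub name: String,
    parent: RefCell<Weak<Node>>,
    first_child: RefCell<Option<Rc<Node>>>,
    last_child: RefCell<Weak<Node>>,
    next_sibling: RefCell<Option<Rc<Node>>>,
}

impl Node {
    pub fn new(name: &str) -> Rc<Node> {
        Rc::new(Node {
            name: name.to_string(),
            parent: RefCell::new(Weak::new()),
            first_child: RefCell::new(None),
            last_child: RefCell::new(Weak::new()),
            next_sibling: RefCell::new(None),
        })
    }

    /// Appends `child` as the last child of `parent`, without walking the sibling chain.
    pub fn append(parent: &Rc<Node>, child: Rc<Node>) {
        *child.parent.borrow_mut() = Rc::downgrade(parent);

        let old_last = parent.last_child.borrow().upgrade();
        match old_last {
            // There was already a last child: chain the new node after it.
            Some(last) => {
                *last.next_sibling.borrow_mut() = Some(child.clone());
            }
            // No children yet: the new node becomes the first child.
            None => {
                *parent.first_child.borrow_mut() = Some(child.clone());
            }
        }
        *parent.last_child.borrow_mut() = Rc::downgrade(&child);
    }

    /// Returns the parent node, if it is still alive.
    pub fn parent(&self) -> Option<Rc<Node>> {
        self.parent.borrow().upgrade()
    }

    /// Walks first_child / next_sibling links and collects the child names in order.
    pub fn child_names(&self) -> Vec<String> {
        let mut names = Vec::new();
        let mut cur = self.first_child.borrow().clone();
        while let Some(node) = cur {
            names.push(node.name.clone());
            cur = node.next_sibling.borrow().clone();
        }
        names
    }
}
```

<p>Appending is constant-time thanks to the last-child link, and iterating over children follows the sibling chain rather than indexing into a vector.</p>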
7205 <p>For this project, it will be helpful to know a bit of how CSS works.
7206 Definitely be comfortable with Rust concepts like ownership and
7207 borrowing. You don't need to be an expert, but if you are going
7208 through the "fighting the borrow checker" stage, you'll have a harder
7209 time with this. Or it may be what lets you grow out of it! You need
7210 to be able to read C and write Rust.</p>
7211 <p><strong><em>Bugs for newcomers:</em></strong> We have a number of <a href="https://gitlab.gnome.org/GNOME/librsvg/issues?scope=all&amp;utf8=%E2%9C%93&amp;state=opened&amp;label_name[]=4.%20Newcomers">easy bugs for newcomers
7212 to librsvg</a>. Some of these are in the Rust part, some
7213 in the C part, some in both &mdash; take your pick!</p>
7214 <h1>Projects for gnome-class</h1>
7215 <p><a href="https://gitlab.gnome.org/federico/gnome-class">Gnome-class</a> is the code generator that lets you write
7216 GObject implementations in Rust. Or at least that's the intention
7217 &mdash; the project is in early development. The code is so new that
7218 <a href="https://gitlab.gnome.org/federico/gnome-class/issues">practically all of our bugs</a> are of an exploratory
7219 nature.</p>
7220 <p>Gnome-class works like a little compiler. This is from one of the
7221 examples; note the call to <code>gobject_gen!</code> in there:</p>
7222 <div class="highlight"><pre><span></span><code><span class="k">struct</span> <span class="nc">SignalerPrivate</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7223 <span class="w"> </span><span class="n">val</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"></span>
7224 <span class="p">}</span><span class="w"></span>
7225
7226 <span class="k">impl</span><span class="w"> </span><span class="nb">Default</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SignalerPrivate</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7227 <span class="w"> </span><span class="k">fn</span> <span class="nf">default</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7228 <span class="w"> </span><span class="n">SignalerPrivate</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7229 <span class="w"> </span><span class="n">val</span>: <span class="nc">Cell</span>::<span class="n">new</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="w"></span>
7230 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7231 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7232 <span class="p">}</span><span class="w"></span>
7233
7234 <span class="n">gobject_gen</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7235 <span class="w"> </span><span class="n">class</span><span class="w"> </span><span class="n">Signaler</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7236 <span class="w"> </span><span class="k">type</span> <span class="nc">InstancePrivate</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">SignalerPrivate</span><span class="p">;</span><span class="w"></span>
7237 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7238
7239 <span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">Signaler</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7240 <span class="w"> </span><span class="n">signal</span><span class="w"> </span><span class="k">fn</span> <span class="nf">value_changed</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w"></span>
7241
7242 <span class="w"> </span><span class="k">fn</span> <span class="nf">set_value</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">v</span>: <span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7243 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">private</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">get_priv</span><span class="p">();</span><span class="w"></span>
7244 <span class="w"> </span><span class="n">private</span><span class="p">.</span><span class="n">val</span><span class="p">.</span><span class="n">set</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w"></span>
7245 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">emit_value_changed</span><span class="p">();</span><span class="w"></span>
7246 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7247 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7248 <span class="p">}</span><span class="w"></span>
7249 </code></pre></div>
7250
7251 <p>Gnome-class implements this <code>gobject_gen!</code> macro as follows:</p>
7252 <ol>
7253 <li>
7254 <p>First we parse the code inside the macro using the <code>syn</code> crate.
7255 This is a crate that lets you parse Rust source code from the
7256 <code>TokenStream</code> that the compiler hands to implementations of procedural
7257 macros. You give a <code>TokenStream</code> to <code>syn</code>, and it gives you back
7258 structs that represent function definitions, <code>impl</code> blocks,
7259 expressions, etc. From this parsing stage we build an Abstract Syntax
7260 Tree (AST) that closely matches the structure of the code that the
7261 user wrote.</p>
7262 </li>
7263 <li>
7264 <p>Second, we take the AST and convert it to higher-level concepts,
7265 while verifying that the code is semantically valid. For example, we
7266 build up a <code>Class</code> structure for each defined GObject class, and
7267 annotate it with the methods and signals that the user defined for it.
7268 This stage is the High-level Internal Representation (HIR).</p>
7269 </li>
7270 <li>
7271 <p>Third, we generate Rust code from the validated HIR. For each
7272 class, we write out the boilerplate needed to register it against the
7273 GObject type system. For each virtual method we write a trampoline to
7274 let the C code call into the Rust implementation, and then write out
7275 the actual Rust impl that the user wrote. For each signal, we
7276 register it against the GObjectClass, and write the appropriate
7277 trampolines both to invoke the signal's default handler and any Rust
7278 callbacks for signal handlers.</p>
7279 </li>
7280 </ol>
7281 <p>For this project, you definitely need to have written GObject code in
7282 C in the past. You don't need to know the GObject internals; just
7283 know that there are things like type registration, signal creation,
7284 argument marshalling, etc.</p>
7285 <p>You don't need to know about compiler internals.</p>
7286 <p>You don't need to have written Rust procedural macros; you can learn
7287 as you go. The code has enough infrastructure right now that you can
7288 cut&amp;paste useful bits to get started with new features. You should
7289 definitely be comfortable with the Rust borrow checker and simple
7290 lifetimes &mdash; again, you can cut&amp;paste useful code already, and
7291 I'm happy to help with those.</p>
7292 <p>This project demands a little patience. Working on the implementation
7293 of procedural macros is not the smoothest experience right now (one
7294 needs to examine generated code carefully, and play some tricks with
7295 the compiler to debug things), but it's getting better very fast.</p>
7296 <h1>How to apply as an intern</h1>
7297 <p><a href="https://www.outreachy.org/apply/">Details for Outreachy</a></p>
7298 <p><a href="https://wiki.gnome.org/Outreach/SummerOfCode/Students">Details for Summer of Code</a></p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome-class"></category><category term="rust"></category><category term="gnome"></category><category term="mentoring"></category></entry><entry><title>Helping Cairo</title><link href="https://people.gnome.org/~federico/blog/helping-cairo.html" rel="alternate"></link><published>2018-03-06T18:22:52-06:00</published><updated>2018-03-06T18:22:52-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-03-06:/~federico/blog/helping-cairo.html</id><summary type="html"><p><a href="https://www.cairographics.org/">Cairo</a> needs help. It is the main 2D rendering library we use
7299 in GNOME, and in particular, it's what librsvg uses to render all
7300 SVGs.</p>
7301 <p>My immediate problem with Cairo is that it explodes when called with
7302 floating-point coordinates that fall outside the range that its
7303 internal fixed-point numbers can …</p></summary><content type="html"><p><a href="https://www.cairographics.org/">Cairo</a> needs help. It is the main 2D rendering library we use
7304 in GNOME, and in particular, it's what librsvg uses to render all
7305 SVGs.</p>
7306 <p>My immediate problem with Cairo is that it explodes when called with
7307 floating-point coordinates that fall outside the range that its
7308 internal fixed-point numbers can represent. There is no validation of
7309 incoming data, so the polygon intersector ends up with data that makes
7310 no sense, and it crashes.</p>
7311 <p>I've been studying how Cairo converts from floating-point to its
7312 fixed-point representation, and it's a nifty little algorithm. So I
7313 thought, no problem, I'll add validation, see how to represent the
7314 error state internally in Cairo, and see if clients are happy with
7315 getting back a <code>cairo_t</code> in an error state.</p>
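<p>A validating conversion could be sketched roughly as follows, in Rust for brevity. This assumes a 32-bit fixed-point value with 8 fractional bits. The function and error names are made up for the example, and this is not how Cairo's own conversion routine is written.</p>

```rust
/// Number of fractional bits; assumed here to be 8 (a 24.8 fixed-point format).
const FRAC_BITS: u32 = 8;

#[derive(Debug, PartialEq)]
pub enum FixedError {
    /// The input was NaN or infinite.
    NotFinite,
    /// The scaled value does not fit in a 32-bit signed integer.
    OutOfRange,
}

/// Converts a floating-point coordinate to fixed point, rejecting values
/// that the fixed-point type cannot represent instead of silently wrapping.
pub fn checked_to_fixed(x: f64) -> Result<i32, FixedError> {
    if !x.is_finite() {
        return Err(FixedError::NotFinite);
    }
    // Scale by 2^FRAC_BITS and round to the nearest integer.
    let scaled = (x * f64::from(1u32 << FRAC_BITS)).round();
    if scaled < f64::from(i32::MIN) || scaled > f64::from(i32::MAX) {
        return Err(FixedError::OutOfRange);
    }
    Ok(scaled as i32)
}
```

<p>A caller would then put the drawing context into an error state when the conversion fails, rather than passing nonsensical coordinates on to the polygon intersector.</p>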
7316 <p>Cairo has a very thorough test suite... <strong><em>that doesn't pass</em></strong>. It
7317 is documented to be very hard to pass fully for all rendering
7318 backends. This is understandable, as there may be bugs in X servers
7319 or OpenGL implementations and such. But for the basic, software-only,
7320 in-memory image backend, Cairo should 100% pass its test suite all the
7321 time. This is not the case right now; in my tree, for all the tests
7322 of the image backend I get</p>
7323 <div class="highlight"><pre><span></span><code><span class="mf">497</span> <span class="n">Passed</span><span class="p">,</span> <span class="mf">54</span> <span class="n">Failed</span> <span class="err">[</span><span class="mf">0</span> <span class="n">crashed</span><span class="p">,</span> <span class="mf">14</span> <span class="nb">exp</span><span class="n">ected</span><span class="err">]</span><span class="p">,</span> <span class="mf">27</span> <span class="n">Skipped</span>
7324 </code></pre></div>
7325
7326 <p>I have been looking at test failures to see what needs fixing. Some
7327 reference images just need to be regenerated: there have been minor
7328 changes in font rendering that broke the reference tests. Some
7329 others have small differences in rendering gradients - not noticeable
7330 by eye, just by diff tools.</p>
7331 <p>But some tests, I have no idea what changed that made them break.</p>
7332 <p>Cairo's git repository is accessible through cgit.freedesktop.org.
7333 As far as I know there is no continuous integration infrastructure to
7334 ensure that tests keep passing.</p>
7335 <h1>Adding minimal continuous testing</h1>
7336 <p>I've set up a <a href="https://gitlab.com/federicomenaquintero/cairo/tree/105084-ft-font-face-init">Cairo repository at gitlab.com</a>. That branch
7337 already has a <a href="https://bugs.freedesktop.org/show_bug.cgi?id=105084">fix for an uninitialized-memory bug which leads to an
7338 invalid <code>free()</code></a>, and some regenerated test files.</p>
7339 <p>The repository <a href="https://gitlab.com/federicomenaquintero/cairo/blob/105084-ft-font-face-init/.gitlab-ci.yml">is configured to run a continuous integration
7340 pipeline</a> on every commit. The test artifacts can then be
7341 downloaded when the test suite fails. Right now it is only testing
7342 the image backend, for in-memory software rendering.</p>
7343 <h1>Initial bugs</h1>
7344 <p>I've started reporting <a href="https://gitlab.com/federicomenaquintero/cairo/issues">a few bugs</a> against that repository for
7345 tests that fail. These should really be in Cairo's Bugzilla, but for
7346 now Gitlab makes it much easier to include test images directly in the
7347 bug descriptions, so that they are easier to browse. Read on.</p>
7348 <h1>Would you like to help?</h1>
7349 <p>A lot of projects use Cairo. We owe it to ourselves to have a library
7350 with a test suite that doesn't break. Getting to that point requires
7351 several things:</p>
7352 <ul>
7353 <li>Fixing current failures in the image backend.</li>
7354 <li>Setting up the CI infrastructure to be able to test other backends.</li>
7355 <li>Fixing failures in the other backends.</li>
7356 </ul>
7357 <p>If you have experience with Cairo, please take a look at the <a href="https://gitlab.com/federicomenaquintero/cairo/issues">bugs</a>.
7358 The <a href="https://gitlab.com/federicomenaquintero/cairo/blob/105084-ft-font-face-init/.gitlab-ci.yml">CI</a> configuration shows how to run the test
7359 suite in the same fashion on your machine.</p>
7360 <p>I think we can make use of modern infrastructure like gitlab and
7361 continuous integration to improve Cairo quickly. Currently it suffers
7362 from lack of attention and hostile tools. Help us out if you can!</p></content><category term="misc"></category><category term="gnome"></category><category term="librsvg"></category><category term="cairo"></category></entry><entry><title>Quick and dirty checklist to update syn 0.11.x to syn 0.12</title><link href="https://people.gnome.org/~federico/blog/syn-012.html" rel="alternate"></link><published>2018-02-26T19:20:42-06:00</published><updated>2018-02-26T19:20:42-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-02-26:/~federico/blog/syn-012.html</id><summary type="html"><p>Today I ported <a href="https://github.com/federicomenaquintero/gnome-class">gnome-class</a> from version 0.11 of the <a href="https://github.com/dtolnay/syn/">syn</a> crate to
7363 version 0.12. <code>syn</code> is a somewhat esoteric crate that you use to
7364 parse Rust code... from a stream of tokens... from within the
7365 implementation of a procedural macro. Gnome-class implements a
7366 mini-language inside your own Rust …</p></summary><content type="html"><p>Today I ported <a href="https://github.com/federicomenaquintero/gnome-class">gnome-class</a> from version 0.11 of the <a href="https://github.com/dtolnay/syn/">syn</a> crate to
7367 version 0.12. <code>syn</code> is a somewhat esoteric crate that you use to
7368 parse Rust code... from a stream of tokens... from within the
7369 implementation of a procedural macro. Gnome-class implements a
7370 mini-language inside your own Rust code, and so it needs to parse
7371 Rust!</p>
7372 <p>The API of <code>syn</code> has changed <em>a lot</em>, which is kind of a pain in the
7373 ass — but the new API seems on the road to stabilization, and is nicer
7374 indeed.</p>
7375 <p>Here is a quick list of things I had to change in gnome-class to
7376 upgrade its version of <code>syn</code>.</p>
7377 <p>There is no <code>extern crate synom</code> anymore. You can use <code>syn::synom</code> now.</p>
7378 <div class="highlight"><pre><span></span><code><span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">synom</span><span class="p">;</span><span class="w"> </span>-&gt; <span class="nc">use</span><span class="w"> </span><span class="n">syn</span>::<span class="n">synom</span><span class="p">;</span><span class="w"></span>
7379 </code></pre></div>
7380
7381 <p><code>SynomBuffer</code> is now <code>TokenBuffer</code>:</p>
7382 <div class="highlight"><pre><span></span><code><span class="n">synom</span>::<span class="n">SynomBuffer</span><span class="w"> </span>-&gt; <span class="nc">syn</span>::<span class="n">buffer</span>::<span class="nc">TokenBuffer</span><span class="w"></span>
7383 </code></pre></div>
7384
7385 <p><code>PResult</code>, the result of <code>Synom::parse()</code>, now has the tuple's
7386 elements reversed:</p>
7387 <div class="highlight"><pre><span></span><code><span class="o">-</span><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">type</span> <span class="nc">PResult</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">(</span><span class="n">Cursor</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">O</span><span class="p">),</span><span class="w"> </span><span class="n">ParseError</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
7388 <span class="o">+</span><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">type</span> <span class="nc">PResult</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">(</span><span class="n">O</span><span class="p">,</span><span class="w"> </span><span class="n">Cursor</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="p">),</span><span class="w"> </span><span class="n">ParseError</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
7389
7390 <span class="c1">// therefore:</span>
7391
7392 <span class="k">impl</span><span class="w"> </span><span class="n">Synom</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyThing</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
7393
7394 <span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MyThing</span>::<span class="n">parse</span><span class="p">(</span><span class="o">..</span><span class="p">.).</span><span class="n">unwrap</span><span class="p">().</span><span class="mi">1</span><span class="p">;</span><span class="w"> </span>-&gt; <span class="nc">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MyThing</span>::<span class="n">parse</span><span class="p">(</span><span class="o">..</span><span class="p">.).</span><span class="n">unwrap</span><span class="p">().</span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
7395 </code></pre></div>
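<p>If you would rather not depend on tuple positions at all, one option is to destructure the result by pattern. The following is a minimal sketch with stand-in types: <code>Cursor</code>, <code>MyThing</code>, <code>ParseError</code> and <code>parse_my_thing()</code> here are hypothetical placeholders, not <code>syn</code>'s real definitions; only the shape of the new <code>PResult</code> is taken from the diff above.</p>

```rust
// Stand-in types for illustration only; these are NOT syn's real definitions.
#[derive(Debug, PartialEq)]
struct Cursor(usize);

#[derive(Debug, PartialEq)]
struct MyThing(u32);

#[derive(Debug, PartialEq)]
struct ParseError;

// Same shape as the new PResult: parsed value first, remaining input second.
type PResult<O> = Result<(O, Cursor), ParseError>;

// Hypothetical parser that always succeeds.
fn parse_my_thing() -> PResult<MyThing> {
    Ok((MyThing(42), Cursor(3)))
}

// Destructuring by pattern names both halves instead of indexing with .0 / .1.
fn parsed_value() -> MyThing {
    let (thing, _rest) = parse_my_thing().unwrap();
    thing
}
```

<p>Naming both halves documents which one is the parsed value. Since the two elements have different types here, a swapped order shows up as a type error at compile time.</p>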
7396
7397 <p>The language tokens like <code>synom::tokens::Amp</code>, and keywords like
7398 <code>synom::tokens::Type</code>, are easier to use now. There is a <code>Token!</code>
7399 macro which you can use in type definitions, instead of having to
7400 remember the particular name of each token type:</p>
7401 <div class="highlight"><pre><span></span><code><span class="n">synom</span>::<span class="n">tokens</span>::<span class="n">Amp</span><span class="w"> </span>-&gt; <span class="nc">Token</span><span class="o">!</span><span class="p">(</span><span class="o">&amp;</span><span class="p">)</span><span class="w"></span>
7402
7403 <span class="n">synom</span>::<span class="n">tokens</span>::<span class="n">For</span><span class="w"> </span>-&gt; <span class="nc">Token</span><span class="o">!</span><span class="p">(</span><span class="k">for</span><span class="p">)</span><span class="w"></span>
7404 </code></pre></div>
7405
7406 <p>And for the corresponding values when matching:</p>
7407 <div class="highlight"><pre><span></span><code><span class="n">syn</span><span class="o">!</span><span class="p">(</span><span class="n">tokens</span>::<span class="n">Colon</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">punct</span><span class="o">!</span><span class="p">(</span>:<span class="p">)</span><span class="w"></span>
7408
7409 <span class="n">syn</span><span class="o">!</span><span class="p">(</span><span class="n">tokens</span>::<span class="n">Type</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">keyword</span><span class="o">!</span><span class="p">(</span><span class="k">type</span><span class="p">)</span><span class="w"></span>
7410 </code></pre></div>
7411
7412 <p>And to instantiate them for quoting/spanning:</p>
7413 <div class="highlight"><pre><span></span><code><span class="o">-</span><span class="w"> </span><span class="n">tokens</span>::<span class="n">Comma</span>::<span class="n">default</span><span class="p">().</span><span class="n">to_tokens</span><span class="p">(</span><span class="n">tokens</span><span class="p">);</span><span class="w"></span>
7414 <span class="o">+</span><span class="w"> </span><span class="n">Token</span><span class="o">!</span><span class="p">(,)([</span><span class="n">Span</span>::<span class="n">def_site</span><span class="p">()]).</span><span class="n">to_tokens</span><span class="p">(</span><span class="n">tokens</span><span class="p">);</span><span class="w"></span>
7415 </code></pre></div>
7416
7417 <p>(OK, that one wasn't nicer after all.)</p>
7418 <p>To get the string for an <code>Ident</code>:</p>
7419 <div class="highlight"><pre><span></span><code><span class="n">ident</span><span class="p">.</span><span class="n">sym</span><span class="p">.</span><span class="n">as_str</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">ident</span><span class="p">.</span><span class="n">as_ref</span><span class="p">()</span><span class="w"></span>
7420 </code></pre></div>
7421
7422 <p>There is no <code>Delimited</code> anymore; instead there is a <code>Punctuated</code>
7423 struct. My diff has this:</p>
7424 <div class="highlight"><pre><span></span><code>- inputs: parens!(call!(Delimited::&lt;MyThing, tokens::Comma&gt;::parse_terminated)) &gt;&gt;
7425 + inputs: parens!(syn!(Punctuated&lt;MyThing, Token!(,)&gt;)) &gt;&gt;
7426 </code></pre></div>
7427
7428 <p>There is no <code>syn::Mutability</code> anymore; now it's an <code>Option&lt;token&gt;</code>, so
7429 basically</p>
7430 <div class="highlight"><pre><span></span><code><span class="n">syn</span>::<span class="n">Mutability</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Token</span><span class="o">!</span><span class="p">[</span><span class="k">mut</span><span class="p">]</span><span class="o">&gt;</span><span class="w"></span>
7431 </code></pre></div>
7432
7433 <p>which I guess lets you refer to the span of the original <code>mut</code> token
7434 if you need.</p>
7435 <p>Some things changed names:</p>
7436 <div class="highlight"><pre><span></span><code><span class="n">TypeTup</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">tys</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w"> </span>-&gt; <span class="nc">TypeTuple</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">elems</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
7437
7438 <span class="n">PatIdent</span><span class="w"> </span><span class="p">{</span><span class="w"> </span>-&gt; <span class="nc">PatIdent</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7439 <span class="w"> </span><span class="n">mode</span>: <span class="nc">BindingMode</span><span class="p">(</span><span class="n">Mutability</span><span class="p">)</span><span class="w"> </span><span class="n">by_ref</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Token</span><span class="o">!</span><span class="p">(</span><span class="k">ref</span><span class="p">)</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
7440 <span class="w"> </span><span class="n">mutability</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Token</span><span class="o">!</span><span class="p">[</span><span class="k">mut</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
7441 <span class="w"> </span><span class="n">ident</span>: <span class="nc">Ident</span><span class="p">,</span><span class="w"> </span><span class="n">ident</span>: <span class="nc">Ident</span><span class="p">,</span><span class="w"></span>
7442 <span class="w"> </span><span class="n">subpat</span>: <span class="o">..</span><span class="p">.,</span><span class="w"> </span><span class="n">subpat</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="p">(</span><span class="n">Token</span><span class="o">!</span><span class="p">[</span><span class="o">@</span><span class="p">],</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">Pat</span><span class="o">&gt;</span><span class="p">)</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
7443 <span class="w"> </span><span class="n">at_token</span>: <span class="o">..</span><span class="p">.,</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
7444 <span class="p">}</span><span class="w"></span>
7445
7446 <span class="n">TypeParen</span><span class="p">.</span><span class="n">ty</span><span class="w"> </span>-&gt; <span class="nc">TypeParen</span><span class="p">.</span><span class="n">elem</span><span class="w"> </span><span class="p">(</span><span class="n">and</span><span class="w"> </span><span class="n">others</span><span class="w"> </span><span class="n">like</span><span class="w"> </span><span class="n">this</span><span class="p">,</span><span class="w"> </span><span class="n">too</span><span class="p">)</span><span class="w"></span>
7447 </code></pre></div>
7448
7449 <p>(I don't know everything that changed names; gnome-class doesn't use
7450 all the syn types yet; these are just the ones I've run into.)</p>
7451 <p>This new <code>syn</code> is much better at acknowledging the fine points of
7452 macro hygiene. The <a href="https://github.com/dtolnay/syn/tree/master/examples">examples directory</a> is particularly instructive;
7453 it shows how to properly span generated code vs. original code, so
7454 compiler error messages are nice. I <a href="https://github.com/rust-lang-nursery/rustc-guide/issues/15">need to write something about
7455 macro hygiene</a> at some point.</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Librsvg's continuous integration pipeline</title><link href="https://people.gnome.org/~federico/blog/librsvg-ci-pipeline.html" rel="alternate"></link><published>2018-02-23T16:25:13-06:00</published><updated>2018-02-23T16:25:13-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-02-23:/~federico/blog/librsvg-ci-pipeline.html</id><summary type="html"><p><a href="https://gitlab.gnome.org/alatiera">Jordan Petridis</a> has been kicking ass by overhauling
7456 librsvg's continuous integration (CI) pipeline. Take a look at this
7457 beauty:</p>
7458 <p><img alt="Continuous integration pipeline" src="https://people.gnome.org/~federico/blog/images/librsvg-pipeline.png"></p>
7459 <p>On every push, we run the <strong>Test</strong> stage. This is a quick compilation
7460 on a Fedora container that runs "<code>make check</code>" and ensures that the
7461 test suite passes.</p>
7462 <p>We have a …</p></summary><content type="html"><p><a href="https://gitlab.gnome.org/alatiera">Jordan Petridis</a> has been kicking ass by overhauling
7463 librsvg's continuous integration (CI) pipeline. Take a look at this
7464 beauty:</p>
7465 <p><img alt="Continuous integration pipeline" src="https://people.gnome.org/~federico/blog/images/librsvg-pipeline.png"></p>
7466 <p>On every push, we run the <strong>Test</strong> stage. This is a quick compilation
7467 on a Fedora container that runs "<code>make check</code>" and ensures that the
7468 test suite passes.</p>
7469 <p>We have a <strong>Lint</strong> stage which can be run manually. This runs <code>cargo
7470 clippy</code> to get Rust lints (check the style of Rust idioms), and <code>cargo
7471 fmt</code> to check indentation and code style and such.</p>
7472 <p>We have a <strong>Distro_test</strong> stage which I think will be scheduled
7473 weekly, using GitLab's <em>Schedules</em> feature, to check that the tests
7474 pass on three major Linux distros. Recently we had trouble with
7475 different rendering due to differences in Freetype versions, which
7476 broke the tests (<em>ahem, likely because </em><em>I</em><em> hadn't updated my
7477 Freetype in a while and distros were already using a newer one</em>); these
7478 distro tests are intended to catch that.</p>
7479 <p>Finally, we have a <strong>Rustc_test</strong> stage. The various crates that
7480 librsvg depends on have different minimum versions for the Rust
7481 compiler. These tests are intended to show when updating a dependency
7482 changes the minimum Rust version on which librsvg would compile. We
7483 don't have a policy yet for "how far from $newest" we should always
7484 work on, and it would be good to get input from distros on this. I
7485 think these Rust tests will be scheduled weekly as well.</p>
7486 <p>Jordan has been experimenting with the pipeline's stages and the
7487 distro-specific idiosyncrasies for each build. This pipeline depends
7488 on some <a href="https://gitlab.com/alatiera/librsvg-oci-images">custom-built container images</a> that already have
7489 librsvg's dependencies installed. These images are built weekly in
7490 <code>gitlab.com</code>, so every week <code>gitlab.gnome.org</code> gets fresh images for
7491 librsvg's CI pipelines. Once image registries are enabled in
7492 <code>gitlab.gnome.org</code>, we should be able to regenerate the container
7493 images locally without depending on an external service.</p>
7494 <p>With the pre-built images, and caching of Rust artifacts, Jordan was
7495 able to <strong>reduce the time for the "test on every commit" builds</strong> from
7496 around 20 minutes to a little under 4 minutes in the current
7497 iteration. This will get even faster if the builds start using ccache
7498 and parallel builds from GNU make.</p>
7499 <p>Currently we have a problem in that <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/178">tests are failing on 32-bit
7500 builds</a>, and we haven't had a chance to investigate the root
7501 cause. Hopefully we can add 32-bit jobs to the CI pipeline to catch
7502 this breakage as soon as possible.</p>
7503 <p>Having all these container images built for the CI infrastructure also
7504 means that it will be easy for people to <strong>set up a development
7505 environment</strong> for librsvg, even though we have <a href="https://gitlab.gnome.org/GNOME/librsvg/blob/b75c20fb3137af9610ff48d0d31ab45e008893ff/COMPILING.md#installing-dependencies-for-building">better instructions
7506 now</a> thanks to Jordan. I haven't investigated setting up a
7507 Flatpak-based environment; this would be nice to have as well.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category></entry><entry><title>RFC: Integrating rsvg-rs into librsvg</title><link href="https://people.gnome.org/~federico/blog/rfc-integrating-rsvg-rs-into-librsvg.html" rel="alternate"></link><published>2018-02-22T09:57:52-06:00</published><updated>2018-02-22T09:57:52-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-02-22:/~federico/blog/rfc-integrating-rsvg-rs-into-librsvg.html</id><summary type="html"><p>I have started an <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/207">RFC to integrate rsvg-rs into librsvg</a>.
7508 <code>rsvg-rs</code> is the Rust binding to librsvg. Like the <a href="http://gtk-rs.org/">gtk-rs</a> bindings,
7509 it gets generated from a pre-built <a href="https://people.gnome.org/~federico/blog/magic-of-gobject-introspection.html">GIR</a> file.</p>
7510 <p>It would be nice for librsvg to provide the Rust binding by itself, so
7511 that librsvg's own internal tools can be …</p></summary><content type="html"><p>I have started an <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/207">RFC to integrate rsvg-rs into librsvg</a>.
7512 <code>rsvg-rs</code> is the Rust binding to librsvg. Like the <a href="http://gtk-rs.org/">gtk-rs</a> bindings,
7513 it gets generated from a pre-built <a href="https://people.gnome.org/~federico/blog/magic-of-gobject-introspection.html">GIR</a> file.</p>
7514 <p>It would be nice for librsvg to provide the Rust binding by itself, so
7515 that librsvg's own internal tools can be implemented in Rust —
7516 currently all the tests are done in C, as are the <code>rsvg-convert(1)</code> and
7517 <code>rsvg-view-3(1)</code> programs.</p>
7518 <p>There are some implications for how <code>rsvg-rs</code> would get built then.
7519 For librsvg's internal consumption, the binding can be built from the
7520 <code>Rsvg-2.0.gir</code> file that gets built out of the main <code>librsvg.so</code>. But
7521 for public consumption of <code>rsvg-rs</code>, when it is being used as a normal
7522 crate and built by Cargo, that <code>Rsvg-2.0.gir</code> needs to be already
7523 built and available: it wouldn't be appropriate for Cargo to build
7524 librsvg and the <code>.gir</code> file itself.</p>
7525 <p>If this sort of thing interests you, <a href="https://gitlab.gnome.org/GNOME/librsvg/issues/207">take a look at the RFC</a>!</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Rust things I miss in C</title><link href="https://people.gnome.org/~federico/blog/rust-things-i-miss-in-c.html" rel="alternate"></link><published>2018-02-18T21:26:04-06:00</published><updated>2018-02-23T18:28:23-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-02-18:/~federico/blog/rust-things-i-miss-in-c.html</id><summary type="html"><p>Librsvg feels like it is reaching a tipping point, where suddenly it
7526 seems like it would be easier to just port some major parts from C to
7527 Rust than to just add accessors for them. Also, more and more of the
7528 meat of the library is in Rust now.</p>
7529 <p>I'm …</p></summary><content type="html"><p>Librsvg feels like it is reaching a tipping point, where suddenly it
7530 seems like it would be easier to just port some major parts from C to
7531 Rust than to just add accessors for them. Also, more and more of the
7532 meat of the library is in Rust now.</p>
7533 <p>I'm switching back and forth a lot between C and Rust these days, and
7534 C feels very, very primitive by comparison.</p>
7535 <h1>A sort of elegy to C</h1>
7536 <p>I fell in love with the C language about 24 years ago. I learned the
7537 basics of it by reading a Spanish translation of <a href="https://en.wikipedia.org/wiki/The_C_Programming_Language">The C Programming
7538 Language by K&amp;R</a> second edition. I had been using Turbo Pascal
7539 before in a reasonably low-level fashion, with pointers and manual
7540 memory allocation, and C felt refreshing and empowering.</p>
7541 <p>K&amp;R is a great book for its <em>style of writing</em> and its conciseness of
7542 programming. This little book even taught you how to implement a
7543 simple <code>malloc()</code>/<code>free()</code>, which was completely enlightening. Even
7544 low-level constructs that seemed part of the language could be
7545 implemented in the language itself!</p>
7546 <p>I got good at C over the following years. It is a small language,
7547 with a small standard library. It was probably the perfect language
7548 to implement Unix kernels in 20,000 lines of code or so.</p>
7549 <p>The GIMP and GTK+ taught me how to do fancy object orientation in C.
7550 GNOME taught me how to maintain large-scale software in C. 20,000
7551 lines of C code started to seem like a project one could more or less
7552 fully understand in a few weeks.</p>
7553 <p>But our code bases are not that small anymore. Our software now has
7554 <em>huge</em> expectations on the features that are available in the
7555 language's standard library.</p>
7556 <h2>Some good experiences with C</h2>
7557 <p>Reading the POV-Ray source code for the first time and learning
7558 how to do object orientation and inheritance in C.</p>
7559 <p>Reading the GTK+ source code and learning a C style that was legible,
7560 maintainable, and clean.</p>
7561 <p>Reading SIOD's source code, then the early Guile sources, and seeing
7562 how a Scheme interpreter can be written in C.</p>
7563 <p>Writing the initial versions of Eye of Gnome and fine-tuning the
7564 microtile rendering.</p>
7565 <h2>Some bad experiences with C</h2>
7566 <p>In the Evolution team, when everything was crashing. We had to buy a
7567 Solaris machine just to be able to buy Purify; there was no Valgrind
7568 back then.</p>
7569 <p>Debugging gnome-vfs threading deadlocks.</p>
7570 <p>Debugging Mesa and getting nowhere.</p>
7571 <p>Taking over the initial versions of Nautilus-share and seeing that it
7572 never <code>free()</code>d anything.</p>
7573 <p>Trying to refactor code where I had no idea about the memory
7574 management strategy.</p>
7575 <p>Trying to turn code into a library when it is full of global variables
7576 and no functions are <code>static</code>.</p>
7577 <p>But anyway — let's get on with things in Rust I miss in C.</p>
7578 <h1>Automatic resource management</h1>
7579 <p>One of the first blog posts I read about Rust was "<a href="http://blog.skylight.io/rust-means-never-having-to-close-a-socket/">Rust means never
7580 having to close a socket</a>". Rust borrows C++'s ideas about
7581 <a href="http://wiki.c2.com/?ResourceAcquisitionIsInitialization">Resource Acquisition Is Initialization (RAII)</a> and smart pointers,
7582 adds in the single-ownership principle for values, and gives you
7583 automatic, deterministic resource management in a very neat package.</p>
7584 <ul>
7585 <li>
7586 <p>Automatic: you don't <code>free()</code> by hand. Memory gets deallocated,
7587 files get closed, mutexes get unlocked when they go out of scope.
7588 If you are wrapping an external resource, you just implement the
7589 <a href="https://doc.rust-lang.org/book/second-edition/ch15-03-drop.html">Drop</a> trait and that's basically it. The wrapped resource feels
7590 like part of the language since you don't have to babysit its
7591 lifetime by hand.</p>
7592 </li>
7593 <li>
7594 <p>Deterministic: resources get created (memory allocated, initialized,
7595 files opened, etc.), and they get destroyed when they go out of
7596 scope. There is no garbage collection: things really get terminated
7597 when you close a brace. You start to see your program's data
7598 lifetimes as a tree of function calls.</p>
7599 </li>
7600 </ul>
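<p>As a rough illustration of both points, here is a sketch of a wrapper type that implements <code>Drop</code>. The <code>Resource</code> type, its shared <code>log</code>, and <code>scoped_resources()</code> are made up for this example; the log exists only so that the order of events can be observed.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical wrapper around some external resource. The shared log is an
// assumption of this sketch; it just records what happens and when.
struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Resource {
    // Runs automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

// Returns the recorded events in the order they happened.
fn scoped_resources() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Resource { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Resource { name: "inner", log: Rc::clone(&log) };
            log.borrow_mut().push("using both".to_string());
        } // `_inner` is released here
        log.borrow_mut().push("using outer".to_string());
    } // `_outer` is released here
    let events = log.borrow().clone();
    events
}
```

<p>Calling <code>scoped_resources()</code> yields "using both", "release inner", "using outer", "release outer": each value is released at the closing brace of its own scope, with no explicit cleanup call anywhere.</p>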
7601 <p>After forgetting to free/close/destroy C objects all the time, or
7602 worse, figuring out where code that I didn't write forgot to do those
7603 things (or did them <em>twice</em>, incorrectly)... I don't want to do it
7604 again.</p>
7605 <h1>Generics</h1>
7606 <p><code>Vec&lt;T&gt;</code> really is a vector whose elements are the size of <code>T</code>.
7607 It's not an array of pointers to individually allocated objects. It
7608 gets compiled <em>specifically</em> to code that can only handle objects of
7609 type <code>T</code>.</p>
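<p>One way to see this for yourself is to measure how far apart consecutive elements sit in memory. The helper names below (<code>stride_in_bytes</code>, <code>elements_are_inline</code>, <code>largest</code>) are invented for this sketch; only <code>std::mem::size_of</code> comes from the standard library.</p>

```rust
use std::mem::size_of;

// Distance in bytes between the first two elements of a slice.
// Panics if the slice has fewer than two elements.
fn stride_in_bytes<T>(items: &[T]) -> usize {
    let first = &items[0] as *const T as usize;
    let second = &items[1] as *const T as usize;
    second - first
}

// True when consecutive elements sit exactly `size_of::<T>()` bytes apart,
// i.e. the values themselves are stored inline rather than pointers to them.
fn elements_are_inline<T>(items: &[T]) -> bool {
    stride_in_bytes(items) == size_of::<T>()
}

// A generic function: the compiler emits a separate copy for each concrete
// `T` it is called with. Returns `None` for an empty slice.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, item| if item > best { item } else { best }))
}
```

<p>For a <code>Vec</code> of 24-byte arrays the stride is 24 bytes, and for a <code>Vec</code> of <code>u64</code> it is 8; the same <code>largest()</code> source works for integers and floats without any macros.</p>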
7610 <p>After writing many janky macros in C to do similar things... I don't
7611 want to do it again.</p>
7612 <h1>Traits are not just interfaces</h1>
7613 <p><a href="https://doc.rust-lang.org/book/second-edition/ch17-00-oop.html">Rust is not a Java-like object-oriented language</a>. Instead it
7614 has traits, which at first seem like Java interfaces — an easy way to
7615 do dynamic dispatch, so that if an object implements <code>Drawable</code> then
7616 you can assume it has a <code>draw()</code> method.</p>
7617 <p>However, traits are more powerful than that.</p>
7618 <h2>Associated types</h2>
7619 <p><a href="https://doc.rust-lang.org/book/second-edition/ch19-03-advanced-traits.html">Traits can have associated types</a>. As an example, Rust
7620 provides the <code>Iterator</code> trait which you can implement:</p>
7621 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7622 <span class="w"> </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w"></span>
7623 <span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
7624 <span class="p">}</span><span class="w"></span>
7625 </code></pre></div>
7626
7627 <p>This means that whenever you implement <code>Iterator</code> for some iterable
7628 object, you also have to specify an <code>Item</code> type for the things that
7629 will be produced. If you call <code>next()</code> and there are more elements,
7630 you'll get back a <code>Some(YourElementType)</code>. When your iterator runs
7631 out of items, it will return <code>None</code>.</p>
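<p>To make that concrete, here is a small sketch of an implementation. <code>Countdown</code> is a hypothetical type made up for this example; it produces the numbers from its starting value down to 1.</p>

```rust
// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The associated type: what each call to `next()` produces.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None // out of items
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}
```

<p>Because <code>Countdown</code> implements <code>Iterator</code>, it also gets the trait's provided methods for free, so something like <code>Countdown { remaining: 3 }.collect::&lt;Vec&lt;u32&gt;&gt;()</code> works without further code.</p>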
7632 <p>Associated types can refer to <em>other</em> traits.</p>
7633 <p>For example, in Rust, you can use <code>for</code> loops on anything that
7634 implements the <code>IntoIterator</code> trait:</p>
7635 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="nb">IntoIterator</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7636 <span class="w"> </span><span class="sd">/// The type of the elements being iterated over.</span>
7637 <span class="w"> </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w"></span>
7638
7639 <span class="w"> </span><span class="sd">/// Which kind of iterator are we turning this into?</span>
7640 <span class="w"> </span><span class="k">type</span> <span class="nc">IntoIter</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w"></span>
7641
7642 <span class="w"> </span><span class="k">fn</span> <span class="nf">into_iter</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">IntoIter</span><span class="p">;</span><span class="w"></span>
7643 <span class="p">}</span><span class="w"></span>
7644 </code></pre></div>
7645
7646 <p>When implementing this trait, you must provide both the type of the
7647 <code>Item</code> which your iterator will produce, and <code>IntoIter</code>, the actual
7648 type that implements <code>Iterator</code> and that holds your iterator's state.</p>
7649 <p>This way you can build webs of types that refer to each other. You
7650 can have a trait that says, "I can do foo and bar, but only if you
7651 give me a type that can do this and that".</p>
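<p>A sketch of both ideas together, under made-up names: <code>Playlist</code> is a hypothetical collection that implements <code>IntoIterator</code> by reusing the iterator of its inner <code>Vec</code>, and <code>join_all()</code> is a hypothetical function that accepts anything iterable, but only if its items can be viewed as strings.</p>

```rust
// Hypothetical collection type, used only for illustration.
struct Playlist {
    songs: Vec<String>,
}

impl IntoIterator for Playlist {
    type Item = String;
    // Reuse the iterator type that `Vec<String>` already provides.
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        self.songs.into_iter()
    }
}

// "I can join things, but only if you give me a type that can be iterated
// and whose items can be borrowed as `str`."
fn join_all<C>(collection: C) -> String
where
    C: IntoIterator,
    C::Item: AsRef<str>,
{
    let mut out = String::new();
    for (index, item) in collection.into_iter().enumerate() {
        if index > 0 {
            out.push_str(", ");
        }
        out.push_str(item.as_ref());
    }
    out
}
```

<p>The same <code>join_all()</code> accepts a <code>Playlist</code>, a <code>Vec</code> of string slices, or any other type that meets the two bounds; the constraint on <code>C::Item</code> is where one trait's associated type refers to another trait.</p>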
7652 <h1>Slices</h1>
7653 <p>I already posted about <a href="https://people.gnome.org/~federico/blog/rant-on-string-slices.html">the lack of string slices in C</a> and
7654 how this is a pain in the ass once you get used to having them.</p>
7655 <h1>Modern tooling for dependency management</h1>
7656 <p>Instead of </p>
7657 <ul>
7658 <li>Having to invoke <code>pkg-config</code> by hand or with Autotools macros</li>
7659 <li>Wrangling include paths for header files...</li>
7660 <li>... and library files.</li>
7661 <li>And basically depending on the user to ensure that the correct
7662 versions of libraries are installed,</li>
7663 </ul>
7664 <p>You write a <code>Cargo.toml</code> file which lists the names and versions of
7665 your dependencies. These get downloaded from a well-known location,
7666 or from elsewhere if you specify.</p>
7667 <p>You don't have to fight dependencies. It just works when you <code>cargo build</code>.</p>
7668 <h1>Tests</h1>
7669 <p>C makes it very hard to have unit tests for several reasons:</p>
7670 <ul>
7671 <li>
7672 <p>Internal functions are often <code>static</code>. This means they can't be
7673 called outside of the source file that defined them. A test program
7674 either has to <code>#include</code> the source file where the static functions
7675 live, or use <code>#ifdef</code>s to remove the <code>static</code>s only during testing.</p>
7676 </li>
7677 <li>
7678 <p>You have to write Makefile-related hackery to link the test program
7679 to only part of your code's dependencies, or to only part of the
7680 rest of your code.</p>
7681 </li>
7682 <li>
7683 <p>You have to pick a testing framework. You have to register tests
7684 against the testing framework. You have to <em>learn</em> the testing
7685 framework.</p>
7686 </li>
7687 </ul>
7688 <p>In Rust you write</p>
7689 <div class="highlight"><pre><span></span><code><span class="cp">#[test]</span><span class="w"></span>
7690 <span class="k">fn</span> <span class="nf">test_that_foo_works</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7691 <span class="w"> </span><span class="fm">assert!</span><span class="p">(</span><span class="n">foo</span><span class="p">()</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">expected_result</span><span class="p">);</span><span class="w"></span>
7692 <span class="p">}</span><span class="w"></span>
7693 </code></pre></div>
7694
7695 <p>anywhere in your program or library, and when you type <code>cargo test</code>,
7696 IT JUST FUCKING WORKS. That code only gets linked into the
7697 test binary. You don't have to compile anything twice by hand, or
7698 write Makefile hackery, or figure out how to extract internal
7699 functions for testing.</p>
7700 <p>This is a killer feature for me.</p>
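<p>A minimal sketch of how this looks for an internal function (the helper <code>clamp_to_byte</code> is made up for illustration): a private function can be tested from a <code>#[cfg(test)]</code> module in the same file, without making it public and without any build-system tricks.</p>

```rust
// Hypothetical internal helper; note that it is not `pub`.
fn clamp_to_byte(value: i32) -> u8 {
    // Limit the value to 0..=255 before narrowing it to a u8.
    value.max(0).min(255) as u8
}

// This module is only compiled when building the test binary.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn clamps_out_of_range_values() {
        assert_eq!(clamp_to_byte(-5), 0);
        assert_eq!(clamp_to_byte(300), 255);
        assert_eq!(clamp_to_byte(42), 42);
    }
}
```

<p>Running <code>cargo test</code> picks up the test module; a normal <code>cargo build</code> leaves it out.</p>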
7701 <h1>Documentation, with tests</h1>
7702 <p>Rust generates documentation from comments in Markdown syntax. Code
7703 in the docs <em>gets run as tests</em>. You can illustrate how a function is
7704 used <em>and</em> test it at the same time:</p>
7705 <div class="highlight"><pre><span></span><code><span class="sd">/// Multiplies the specified number by two</span>
7706 <span class="sd">///</span>
7707 <span class="sd">/// ```</span>
7708 <span class="sd">/// assert_eq!(multiply_by_two(5), 10);</span>
7709 <span class="sd">/// ```</span>
7710 <span class="k">fn</span> <span class="nf">multiply_by_two</span><span class="p">(</span><span class="n">x</span>: <span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"></span>
7711 <span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">2</span><span class="w"></span>
7712 <span class="p">}</span><span class="w"></span>
7713 </code></pre></div>
7714
7715 <p>Your example code <em>gets run as tests</em> to ensure that your
7716 documentation stays up to date with the actual code.</p>
7717 <p><strong>Update 2018/Feb/23:</strong> QuietMisdreavus has posted <a href="https://quietmisdreavus.net/code/2018/02/23/how-the-doctests-get-made/">how rustdoc turns
7718 doctests into runnable code
7719 internally</a>.
7720 This is high-grade magic and thoroughly interesting.</p>
7721 <h1>Hygienic macros</h1>
7722 <p>Rust has hygienic macros that avoid all of C's problems with things in
7723 macros that inadvertently shadow identifiers in the code. You don't
7724 need to write macros where every symbol has to be in parentheses for
7725 <code>max(5 + 3, 4)</code> to work correctly.</p>
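<p>As a rough sketch of what hygiene buys you, here is a made-up <code>max_of!</code> macro (the name is an assumption, not something from the standard library). Its arguments are whole expressions, so no defensive parentheses are needed, and the temporaries it declares cannot collide with the caller's variables.</p>

```rust
// Hypothetical macro for illustration. Each argument is matched as a
// complete expression, so `5 + 3` is evaluated as a unit.
macro_rules! max_of {
    ($a:expr, $b:expr) => {{
        // These locals live in the macro's own hygiene context; they do
        // not shadow a caller's variable that happens to be named `a`.
        let a = $a;
        let b = $b;
        if a > b { a } else { b }
    }};
}

fn demo() -> i32 {
    let a = 10;
    // The second argument still sees the caller's `a` (10), not the
    // macro's internal `a` (20).
    max_of!(a * 2, a + 1)
}
```

<p>With a textual macro as in C, the second argument would pick up the macro's own temporary instead of the caller's variable.</p>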
7726 <h1>No automatic coercions</h1>
7727 <p>All the bugs in C that result from inadvertently converting an <code>int</code>
7728 to a <code>short</code> or <code>char</code> or whatever — Rust doesn't do them. You have
7729 to explicitly convert.</p>
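<p>A small sketch of what explicit conversion looks like, using made-up function names. Assigning an <code>i32</code> to an <code>i16</code> without a conversion does not compile; you either ask for a checked conversion or spell out the truncation with <code>as</code>.</p>

```rust
use std::convert::TryFrom;

// Checked narrowing: returns None if the value does not fit in an i16.
fn narrow(value: i32) -> Option<i16> {
    i16::try_from(value).ok()
}

// Explicit truncation: keeps only the low 16 bits, and says so in the code.
fn truncate(value: i32) -> i16 {
    value as i16
}
```

<p>Either way, the narrowing is visible at the call site instead of happening silently.</p>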
7730 <h1>No integer overflow</h1>
7731 <p>Enough said.</p>
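<p>A hedged note on what this means in practice: by default, plain arithmetic that overflows panics in debug builds and wraps around in release builds, so the behavior is defined either way. When overflow matters, the standard library has explicit methods for each policy. A small sketch (the function names are made up for illustration):</p>

```rust
// Returns None instead of overflowing.
fn add_or_none(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

// Wraps around modulo 256, explicitly.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

// Clamps the result at the maximum value of the type.
fn add_saturating(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}
```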
7732 <h1>Generally, no undefined behavior in safe Rust</h1>
7733 <p>In Rust, it is considered a bug in the language if something written
7734 in "safe Rust" (what you would be allowed to write outside <code>unsafe {}</code>
7735 blocks) results in undefined behavior. You can shift-right a negative
7736 integer and it will do exactly what you expect.</p>
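<p>For instance, a right shift on a signed integer is an arithmetic shift that keeps the sign, and on an unsigned integer it is a logical shift. A tiny sketch with made-up function names:</p>

```rust
// Arithmetic shift: the sign bit is preserved for signed types.
fn halve(value: i32) -> i32 {
    value >> 1
}

// Logical shift: zeros are shifted in for unsigned types.
fn halve_bits(value: u32) -> u32 {
    value >> 1
}
```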
7737 <h1>Pattern matching</h1>
7738 <p>You know how <code>gcc</code> warns you if you <code>switch()</code> on an enum but don't
7739 handle all values? That's like a little baby.</p>
7740 <p>Rust has <a href="https://doc.rust-lang.org/book/second-edition/ch18-03-pattern-syntax.html">pattern matching</a> in various places. It can do that
7741 trick for enums inside a <code>match()</code> expression. It can do
7742 destructuring so you can return multiple values from a function:</p>
7743 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="kt">f64</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7744 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">sin_cos</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="kt">f64</span><span class="p">);</span><span class="w"></span>
7745 <span class="p">}</span><span class="w"></span>
7746
7747 <span class="kd">let</span><span class="w"> </span><span class="n">angle</span>: <span class="kt">f64</span> <span class="o">=</span><span class="w"> </span><span class="mf">42.0</span><span class="p">;</span><span class="w"></span>
7748 <span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">sin_angle</span><span class="p">,</span><span class="w"> </span><span class="n">cos_angle</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">angle</span><span class="p">.</span><span class="n">sin_cos</span><span class="p">();</span><span class="w"></span>
7749 </code></pre></div>
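<p>The enum trick mentioned above can be sketched with a made-up <code>Shape</code> enum. If one of the variants is missing from the <code>match</code>, the program does not compile; it is an error, not a warning.</p>

```rust
// Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Shape {
    Circle,
    Square,
    Triangle,
}

fn corners(shape: Shape) -> u32 {
    // Removing any arm below makes the compiler reject the match as
    // non-exhaustive.
    match shape {
        Shape::Circle => 0,
        Shape::Square => 4,
        Shape::Triangle => 3,
    }
}
```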
7750
7751 <p>You can <code>match()</code> on strings. YOU CAN MATCH ON FUCKING STRINGS.</p>
7752 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&quot;green&quot;</span><span class="p">;</span><span class="w"></span>
7753
7754 <span class="k">match</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7755 <span class="w"> </span><span class="s">&quot;red&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="s">&quot;it&#39;s red&quot;</span><span class="p">),</span><span class="w"></span>
7756 <span class="w"> </span><span class="s">&quot;green&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="s">&quot;it&#39;s green&quot;</span><span class="p">),</span><span class="w"></span>
7757 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="s">&quot;it&#39;s something else&quot;</span><span class="p">),</span><span class="w"></span>
7758 <span class="p">}</span><span class="w"></span>
7759 </code></pre></div>
7760
7761 <p>You know how this is illegible?</p>
7762 <div class="highlight"><pre><span></span><code>my_func(true, false, false)
7763 </code></pre></div>
7764
7765 <p>How about this instead, with pattern matching on function arguments:</p>
7766 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Fubarize</span><span class="p">(</span><span class="k">pub</span><span class="w"> </span><span class="kt">bool</span><span class="p">);</span><span class="w"></span>
7767 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Frobnify</span><span class="p">(</span><span class="k">pub</span><span class="w"> </span><span class="kt">bool</span><span class="p">);</span><span class="w"></span>
7768 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Bazificate</span><span class="p">(</span><span class="k">pub</span><span class="w"> </span><span class="kt">bool</span><span class="p">);</span><span class="w"></span>
7769
7770 <span class="k">fn</span> <span class="nf">my_func</span><span class="p">(</span><span class="n">Fubarize</span><span class="p">(</span><span class="n">fub</span><span class="p">)</span>: <span class="nc">Fubarize</span><span class="p">,</span><span class="w"> </span>
7771 <span class="w"> </span><span class="n">Frobnify</span><span class="p">(</span><span class="n">frob</span><span class="p">)</span>: <span class="nc">Frobnify</span><span class="p">,</span><span class="w"> </span>
7772 <span class="w"> </span><span class="n">Bazificate</span><span class="p">(</span><span class="n">baz</span><span class="p">)</span>: <span class="nc">Bazificate</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7773 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">fub</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7774 <span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w"></span>
7775 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7776
7777 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">frob</span><span class="w"> </span><span class="o">&amp;&amp;</span><span class="w"> </span><span class="n">baz</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7778 <span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w"></span>
7779 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7780 <span class="p">}</span><span class="w"></span>
7781
7782 <span class="o">..</span><span class="p">.</span><span class="w"></span>
7783
7784 <span class="n">my_func</span><span class="p">(</span><span class="n">Fubarize</span><span class="p">(</span><span class="kc">true</span><span class="p">),</span><span class="w"> </span><span class="n">Frobnify</span><span class="p">(</span><span class="kc">false</span><span class="p">),</span><span class="w"> </span><span class="n">Bazificate</span><span class="p">(</span><span class="kc">true</span><span class="p">));</span><span class="w"></span>
7785 </code></pre></div>
7786
7787 <h1>Standard, useful error handling</h1>
7788 <p>I've talked at length about this. No more returning a boolean with no
7789 extra explanation for an error, no ignoring errors inadvertently, no
7790 exception handling with nonlocal jumps.</p>
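<p>A minimal sketch of the idea, with made-up function names: a fallible function returns a <code>Result</code> that carries either the value or an explanation, and the caller has to look inside it to get at the value.</p>

```rust
// Hypothetical fallible function: the error carries an explanation
// instead of a bare `false`.
fn safe_divide(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        Err(String::from("division by zero"))
    } else {
        Ok(a / b)
    }
}

// The caller must handle both cases to get at the quotient.
fn describe_division(a: i32, b: i32) -> String {
    match safe_divide(a, b) {
        Ok(q) => format!("{} / {} = {}", a, b, q),
        Err(msg) => format!("error: {}", msg),
    }
}
```

<p>Ignoring the returned <code>Result</code> entirely makes the compiler emit an "unused result" warning.</p>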
7791 <h1>#[derive(Debug)]</h1>
7792 <p>If you write a new type (say, a struct with a ton of fields), you can
7793 <code>#[derive(Debug)]</code> and Rust will know how to automatically print that
7794 type's contents for debug output. You no longer have to write a
7795 special function that you must call in gdb by hand just to examine a
7796 custom type.</p>
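<p>A short sketch with a made-up <code>Point</code> struct: deriving <code>Debug</code> is enough to print the whole value with the <code>{:?}</code> format specifier.</p>

```rust
// Hypothetical struct for illustration.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Formats the struct using the derived Debug implementation.
fn describe(p: &Point) -> String {
    format!("{:?}", p)
}
```

<p>Here <code>describe(&amp;Point { x: 1, y: 2 })</code> yields <code>Point { x: 1, y: 2 }</code>.</p>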
7797 <h1>Closures</h1>
7798 <p>No more passing function pointers and a <code>user_data</code> by hand.</p>
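<p>As a small sketch (the function <code>apply_to_all</code> is made up for illustration), a closure captures the variables it needs from its environment, so there is no separate <code>user_data</code> argument to thread through:</p>

```rust
// Applies `f` to every element. `f` may capture state from the caller.
fn apply_to_all<F: Fn(i32) -> i32>(values: &[i32], f: F) -> Vec<i32> {
    values.iter().map(|&v| f(v)).collect()
}

fn demo() -> Vec<i32> {
    let offset = 10;
    // The closure borrows `offset` from the enclosing scope.
    apply_to_all(&[1, 2, 3], |v| v + offset)
}
```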
7799 <h1>Conclusion</h1>
7800 <p>I haven't done the "<a href="https://doc.rust-lang.org/book/second-edition/ch16-00-concurrency.html">fearless concurrency</a>" bit yet, where the
7801 compiler is able to prevent data races in threaded code. I imagine it
7802 being a game-changer for people who write concurrent code on an
7803 everyday basis.</p>
7804 <p>C is an old language with primitive constructs and primitive tooling.
7805 It was a good language for small uniprocessor Unix kernels that ran in
7806 trusted, academic environments. It's no longer a good language for
7807 the software of today.</p>
7808 <p>Rust is not easy to learn, but I think it is completely worth it.
7809 It's hard because it demands a lot from your understanding of the code
7810 you want to write. I think it's one of those languages that make you
7811 a better programmer and that let you tackle more ambitious problems.</p></content><category term="misc"></category><category term="rust"></category></entry><entry><title>Writing a command-line program in Rust</title><link href="https://people.gnome.org/~federico/blog/writing-a-command-line-program-in-rust.html" rel="alternate"></link><published>2018-02-03T11:41:20-06:00</published><updated>2018-02-03T11:41:20-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-02-03:/~federico/blog/writing-a-command-line-program-in-rust.html</id><summary type="html"><p>As a library writer, it feels a bit strange, but refreshing, to write
7812 a program that actually has a <code>main()</code> function.</p>
7813 <p>My experience with Rust so far has been threefold:</p>
7814 <ul>
7815 <li>
7816 <p>Porting chunks of C to Rust for librsvg - this is all work on
7817 librsvg's internals and no users are exposed …</p></li></ul></summary><content type="html"><p>As a library writer, it feels a bit strange, but refreshing, to write
7818 a program that actually has a <code>main()</code> function.</p>
7819 <p>My experience with Rust so far has been threefold:</p>
7820 <ul>
7821 <li>
7822 <p>Porting chunks of C to Rust for librsvg - this is all work on
7823 librsvg's internals and no users are exposed to it directly.</p>
7824 </li>
7825 <li>
7826 <p>Working on <a href="https://github.com/nikomatsakis/gnome-class/">gnome-class</a>, the procedural macro ("a little compiler")
7827 to generate GObject boilerplate from Rust. This feels like working
7828 on the edge of the exotic; it is something that runs <em>in</em> the Rust
7829 compiler and spits code on behalf of the programmer.</p>
7830 </li>
7831 <li>
7832 <p>A few patches to the <a href="http://gtk-rs.org">gtk-rs</a> ecosystem. Again, work on the
7833 internals, or something that feels library-like.</p>
7834 </li>
7835 </ul>
7836 <p>But other than toy programs to test things, I haven't written a
7837 stand-alone tool until <a href="https://people.gnome.org/~federico/blog/rsvg-bench.html">rsvg-bench</a>. It's quite a thrill to be able
7838 to just <em>run the thing</em> instead of waiting for other people to write
7839 code to use it!</p>
7840 <h1>Parsing command-line arguments</h1>
7841 <p>There are quite a few Rust crates ("libraries") to parse command-line
7842 arguments. I read about <a href="https://docs.rs/structopt-derive/0.1.5/structopt_derive/">structopt</a> via <a href="http://robert.ocallahan.org/2017/11/in-praise-of-rusts-structopt-for.html">Robert O'Callahan's
7843 blog</a>; structopt lets you define a <code>struct</code> to hold the values of
7844 your command-line options, and then you annotate the fields in that
7845 <code>struct</code> to indicate how they should be parsed from the command line.
7846 It works via Rust's procedural macros. Internally it generates stuff
7847 for the <a href="https://docs.rs/clap/2.29.2/clap/">clap</a> crate, a well-established mechanism for dealing with
7848 command-line options.</p>
7849 <p>And it is quite pleasant! This is basically all I needed to do:</p>
7850 <div class="highlight"><pre><span></span><code><span class="cp">#[derive(StructOpt, Debug)]</span><span class="w"></span>
7851 <span class="cp">#[structopt(name = </span><span class="s">&quot;rsvg-bench&quot;</span><span class="cp">, about = </span><span class="s">&quot;Benchmarking utility for librsvg.&quot;</span><span class="cp">)]</span><span class="w"></span>
7852 <span class="k">struct</span> <span class="nc">Opt</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7853 <span class="w"> </span><span class="cp">#[structopt(short = </span><span class="s">&quot;s&quot;</span><span class="cp">,</span>
7854 <span class="cp"> long = </span><span class="s">&quot;sleep&quot;</span><span class="cp">,</span>
7855 <span class="cp"> help = </span><span class="s">&quot;Number of seconds to sleep before starting to process SVGs&quot;</span><span class="cp">,</span>
7856 <span class="cp"> default_value = </span><span class="s">&quot;0&quot;</span><span class="cp">)]</span><span class="w"></span>
7857 <span class="w"> </span><span class="n">sleep_secs</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"></span>
7858
7859 <span class="w"> </span><span class="cp">#[structopt(short = </span><span class="s">&quot;p&quot;</span><span class="cp">,</span>
7860 <span class="cp"> long = </span><span class="s">&quot;num-parse&quot;</span><span class="cp">,</span>
7861 <span class="cp"> help = </span><span class="s">&quot;Number of times to parse each file&quot;</span><span class="cp">,</span>
7862 <span class="cp"> default_value = </span><span class="s">&quot;100&quot;</span><span class="cp">)]</span><span class="w"></span>
7863 <span class="w"> </span><span class="n">num_parse</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"></span>
7864
7865 <span class="w"> </span><span class="cp">#[structopt(short = </span><span class="s">&quot;r&quot;</span><span class="cp">,</span>
7866 <span class="cp"> long = </span><span class="s">&quot;num-render&quot;</span><span class="cp">,</span>
7867 <span class="cp"> help = </span><span class="s">&quot;Number of times to render each file&quot;</span><span class="cp">,</span>
7868 <span class="cp"> default_value = </span><span class="s">&quot;100&quot;</span><span class="cp">)]</span><span class="w"></span>
7869 <span class="w"> </span><span class="n">num_render</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"></span>
7870
7871 <span class="w"> </span><span class="cp">#[structopt(long = </span><span class="s">&quot;pixbuf&quot;</span><span class="cp">,</span>
7872 <span class="cp"> help = </span><span class="s">&quot;Render to a GdkPixbuf instead of a Cairo image surface&quot;</span><span class="cp">)]</span><span class="w"></span>
7873 <span class="w"> </span><span class="n">render_to_pixbuf</span>: <span class="kt">bool</span><span class="p">,</span><span class="w"></span>
7874
7875 <span class="w"> </span><span class="cp">#[structopt(help = </span><span class="s">&quot;Input files or directories&quot;</span><span class="cp">,</span>
7876 <span class="cp"> parse(from_os_str))]</span><span class="w"></span>
7877 <span class="w"> </span><span class="n">inputs</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">PathBuf</span><span class="o">&gt;</span><span class="w"></span>
7878 <span class="p">}</span><span class="w"></span>
7879
7880 <span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7881 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">opt</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Opt</span>::<span class="n">from_args</span><span class="p">();</span><span class="w"></span>
7882
7883 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">opt</span><span class="p">.</span><span class="n">inputs</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7884 <span class="w"> </span><span class="fm">eprintln!</span><span class="p">(</span><span class="s">&quot;No input files or directories specified</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">);</span><span class="w"></span>
7885 <span class="w"> </span><span class="n">process</span>::<span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span><span class="w"></span>
7886 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7887
7888 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
7889 <span class="p">}</span><span class="w"></span>
7890 </code></pre></div>
7891
7892 <p>Each field in the <code>Opt</code> struct above corresponds to one command-line
7893 argument; each field has annotations for <code>structopt</code> to generate the
7894 appropriate code to parse each option. For example, the
7895 <code>render_to_pixbuf</code> field has a long option name called <code>"pixbuf"</code>;
7896 that field will be set to <code>true</code> if the <code>--pixbuf</code> option gets passed
7897 to rsvg-bench.</p>
7898 <h1>Handling errors</h1>
7899 <p>Command-line programs generally have the luxury of being able to just
7900 exit as soon as they encounter an error.</p>
7901 <p>In C this is a bit cumbersome since you need to deal with <em>every</em>
7902 place that may return an error, find out what to print, and call
7903 <code>exit(1)</code> by hand or something. If you miss a single place where an
7904 error is returned, your program will keep running with an inconsistent
7905 state.</p>
7906 <p>In languages with exception handling, it's a bit easier - a small
7907 script can just let exceptions be thrown wherever, and if it catches
7908 them at the toplevel, it can just print the exception and abort
7909 gracefully. However, these nonlocal jumps make me uncomfortable; I
7910 think <a href="http://joeduffyblog.com/2016/02/07/the-error-model/">exceptions are hard to reason about</a>.</p>
7911 <p>Rust makes this easy: it forces you to handle every call that may
7912 return an error, but it lets you bubble errors up easily, or handle
7913 them in-place, or translate them to a higher-level error.</p>
7914 <p>In the Rust world the <code>failure</code> crate is getting a lot of traction
7915 as a convenient, modern way to handle errors.</p>
7916 <p>In rsvg-bench, errors can come from several places:</p>
7917 <ul>
7918 <li>
7919 <p>I/O errors when reading files and directories.</p>
7920 </li>
7921 <li>
7922 <p>Errors from librsvg's parsing stage; you get a <a href="https://developer.gnome.org/glib/stable/glib-Error-Reporting.html">GError</a>.</p>
7923 </li>
7924 <li>
7925 <p>Errors from the rendering stage. This can be a Cairo error (a
7926 <a href="https://www.cairographics.org/manual/cairo-Error-handling.html">cairo_status_t</a>), or a simple "something bad happened; can't
7927 render" from librsvg's old convenience api in C. Don't you hate it
7928 when C code just gives up and returns NULL or a boolean false,
7929 without any further details on <em>what</em> went wrong?</p>
7930 </li>
7931 </ul>
7932 <p>For rsvg-bench, I just needed to be able to represent Cairo errors and
7933 generic rendering errors. Everything else, like an <code>io::Error</code>, is
7934 automatically wrapped by the <code>failure</code> crate's mechanism. I just
7935 needed to do this:</p>
7936 <div class="highlight"><pre><span></span><code><span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">failure</span><span class="p">;</span><span class="w"></span>
7937 <span class="cp">#[macro_use]</span><span class="w"></span>
7938 <span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">failure_derive</span><span class="p">;</span><span class="w"></span>
7939
7940 <span class="cp">#[derive(Debug, Fail)]</span><span class="w"></span>
7941 <span class="k">enum</span> <span class="nc">ProcessingError</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7942 <span class="w"> </span><span class="cp">#[fail(display = </span><span class="s">&quot;Cairo error: {:?}&quot;</span><span class="cp">, status)]</span><span class="w"></span>
7943 <span class="w"> </span><span class="n">CairoError</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7944 <span class="w"> </span><span class="n">status</span>: <span class="nc">cairo</span>::<span class="n">Status</span><span class="w"></span>
7945 <span class="w"> </span><span class="p">},</span><span class="w"></span>
7946
7947 <span class="w"> </span><span class="cp">#[fail(display = </span><span class="s">&quot;Rendering error&quot;</span><span class="cp">)]</span><span class="w"></span>
7948 <span class="w"> </span><span class="n">RenderingError</span><span class="w"></span>
7949 <span class="p">}</span><span class="w"></span>
7950 </code></pre></div>
7951
7952 <p>Whenever the code gets a Cairo error, I can translate it to a
7953 <code>ProcessingError::CairoError</code> and bubble it up:</p>
7954 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">render_to_cairo</span><span class="p">(</span><span class="n">handle</span>: <span class="kp">&amp;</span><span class="nc">rsvg</span>::<span class="n">Handle</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7955 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">dim</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">handle</span><span class="p">.</span><span class="n">get_dimensions</span><span class="p">();</span><span class="w"></span>
7956 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">surface</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cairo</span>::<span class="n">ImageSurface</span>::<span class="n">create</span><span class="p">(</span><span class="n">cairo</span>::<span class="n">Format</span>::<span class="n">ARgb32</span><span class="p">,</span><span class="w"></span>
7957 <span class="w"> </span><span class="n">dim</span><span class="p">.</span><span class="n">width</span><span class="p">,</span><span class="w"></span>
7958 <span class="w"> </span><span class="n">dim</span><span class="p">.</span><span class="n">height</span><span class="p">)</span><span class="w"></span>
7959 <span class="w"> </span><span class="p">.</span><span class="n">map_err</span><span class="p">(</span><span class="o">|</span><span class="n">e</span><span class="o">|</span><span class="w"> </span><span class="n">ProcessingError</span>::<span class="n">CairoError</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">status</span>: <span class="nc">e</span><span class="w"> </span><span class="p">})</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
7960
7961 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
7962 <span class="p">}</span><span class="w"></span>
7963 </code></pre></div>
7964
7965 <p>And when librsvg returns a "couldn't render" error, I translate that
7966 to a <code>ProcessingError::RenderingError</code>:</p>
7967 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">render_to_cairo</span><span class="p">(</span><span class="n">handle</span>: <span class="kp">&amp;</span><span class="nc">rsvg</span>::<span class="n">Handle</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7968 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
7969
7970 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">cr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cairo</span>::<span class="n">Context</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">surface</span><span class="p">);</span><span class="w"></span>
7971
7972 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">handle</span><span class="p">.</span><span class="n">render_cairo</span><span class="p">(</span><span class="o">&amp;</span><span class="n">cr</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7973 <span class="w"> </span><span class="nb">Ok</span><span class="p">(())</span><span class="w"></span>
7974 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7975 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">Error</span>::<span class="n">from</span><span class="p">(</span><span class="n">ProcessingError</span>::<span class="n">RenderingError</span><span class="p">))</span><span class="w"></span>
7976 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7977 <span class="p">}</span><span class="w"></span>
7978 </code></pre></div>
7979
7980 <p>Here, the <code>Ok()</code> case of the <code>Result</code> does not contain any value —
7981 it's just <code>()</code>, as the generated images are not stored anywhere: they
7982 are just rendered to get some timings, not to be saved or anything.</p>
7983 <h1>Up to where do errors bubble?</h1>
7984 <p>This is the "do everything" function:</p>
7985 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">run</span><span class="p">(</span><span class="n">opt</span>: <span class="kp">&amp;</span><span class="nc">Opt</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7986 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
7987
7988 <span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">path</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="n">opt</span><span class="p">.</span><span class="n">inputs</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
7989 <span class="w"> </span><span class="n">process_path</span><span class="p">(</span><span class="n">opt</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">path</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
7990 <span class="w"> </span><span class="p">}</span><span class="w"></span>
7991
7992 <span class="w"> </span><span class="nb">Ok</span><span class="p">(())</span><span class="w"></span>
7993 <span class="p">}</span><span class="w"></span>
7994 </code></pre></div>
7995
7996 <p>For each path passed on the command line, process it. If the
7997 path is a directory, the program scans it recursively; if the
7998 path is an SVG file, the program loads the file and renders
7999 it.</p>
8000 <p>Finally, <code>main()</code> just has this:</p>
8001 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8002 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">opt</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Opt</span>::<span class="n">from_args</span><span class="p">();</span><span class="w"></span>
8003
8004 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
8005
8006 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">run</span><span class="p">(</span><span class="o">&amp;</span><span class="n">opt</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8007 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">(),</span><span class="w"></span>
8008 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8009 <span class="w"> </span><span class="fm">eprintln!</span><span class="p">(</span><span class="s">&quot;{}&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">e</span><span class="p">);</span><span class="w"></span>
8010 <span class="w"> </span><span class="n">process</span>::<span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span><span class="w"></span>
8011 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8012 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8013 <span class="p">}</span><span class="w"></span>
8014 </code></pre></div>
8015
8016 <p>I.e. process command line arguments, run the whole thing, and print an
8017 error if there was one.</p>
8018 <p>I really appreciate that most places that can return an error can just
8019 put a <code>?</code> for the error to bubble up. This is much more legible than
8020 in C, where every call must have an <code>if (something_bad_happened) {
8021 deal_with_it; }</code> after it... and Rust won't let me get away with
8022 ignoring an error, but it makes it easy to actually deal with it properly.</p>
8023 <h1>Reading an SVG file quickly</h1>
8024 <p>Why, just <code>mmap()</code> it and feed it to librsvg, to avoid buffer copies.
8025 This is easy in Rust:</p>
8026 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">process_file</span><span class="o">&lt;</span><span class="n">P</span>: <span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">Path</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">opt</span>: <span class="kp">&amp;</span><span class="nc">Opt</span><span class="p">,</span><span class="w"> </span><span class="n">path</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8027 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">file</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">File</span>::<span class="n">open</span><span class="p">(</span><span class="n">path</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
8028 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">mmap</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">MmapOptions</span>::<span class="n">new</span><span class="p">().</span><span class="n">map</span><span class="p">(</span><span class="o">&amp;</span><span class="n">file</span><span class="p">)</span><span class="o">?</span><span class="w"> </span><span class="p">};</span><span class="w"></span>
8029
8030 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">bytes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">mmap</span><span class="p">;</span><span class="w"></span>
8031
8032 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">handle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rsvg</span>::<span class="n">Handle</span>::<span class="n">new_from_data</span><span class="p">(</span><span class="n">bytes</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
8033 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
8034 <span class="p">}</span><span class="w"></span>
8035 </code></pre></div>
8036
8037 <p>Many things can go wrong here:</p>
8038 <ul>
8039 <li><code>File::open()</code> can return an io::Error.</li>
8040 <li><code>MmapOptions::map()</code> can return an io::Error from the <code>mmap(2)</code>
8041 system call, or from the <code>fstat(2)</code> to read the file's size to map
8042 it.</li>
8043 <li><code>rsvg::Handle::new_from_data()</code> can return a GError from parsing the
8044 file.</li>
8045 </ul>
8046 <p>The little <code>?</code> characters after each call that can return an error
8047 mean, just give me back the result, or convert the error to a
8048 <code>failure::Error</code> that can be examined later. This is beautifully
8049 legible to me.</p>
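<p>To make that conversion concrete, here is a minimal sketch that uses only the standard library. It is not code from rsvg-bench: <code>Box&lt;dyn Error&gt;</code> stands in for <code>failure::Error</code>, and <code>read_and_double()</code> is a made-up function whose two fallible calls return different error types.</p>

```rust
use std::error::Error;
use std::fs;
use std::path::Path;

/// Reads a file that should contain a single integer and returns it doubled.
/// Hypothetical helper for illustration only.
///
/// Two unrelated error types can occur here. Each `?` either hands back the
/// `Ok` value or converts the error into `Box<dyn Error>` (through `From`)
/// and returns early from the function.
fn read_and_double(path: &Path) -> Result<i64, Box<dyn Error>> {
    let text = fs::read_to_string(path)?; // may fail with io::Error
    let n: i64 = text.trim().parse()?; // may fail with ParseIntError
    Ok(n * 2)
}
```

<p>A caller only sees one error type, and can print or inspect it later, much like <code>main()</code> does above with the result of <code>run()</code>.</p>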
8050 <h1>Summary</h1>
8051 <p>Writing command-line programs in Rust is fun! It's nice to have
8052 neurotically-safe scripts that one can trust in the future.</p>
8053 <p><a href="https://gitlab.gnome.org/federico/rsvg-bench">Rsvg-bench is available here</a>.</p></content><category term="misc"></category><category term="rust"></category><category term="librsvg"></category><category term="gnome"></category></entry><entry><title>rsvg-bench - a benchmark for librsvg</title><link href="https://people.gnome.org/~federico/blog/rsvg-bench.html" rel="alternate"></link><published>2018-02-02T16:10:34-06:00</published><updated>2018-02-02T16:10:34-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-02-02:/~federico/blog/rsvg-bench.html</id><summary type="html"><p>Librsvg 2.42.0 came out with a rather major performance regression
8054 compared to 2.40.20: SVGs with many <a href="https://www.w3.org/TR/SVG/coords.html#TransformAttribute"><code>transform</code></a>
8055 attributes would slow it down. It was fixed in 2.42.1. We changed
8056 from using a <a href="https://github.com/lalrpop/lalrpop/issues/269">parser that would recompile regexes</a> each time it was
8057 called, to <a href="https://github.com/servo/rust-cssparser">one …</a></p></summary><content type="html"><p>Librsvg 2.42.0 came out with a rather major performance regression
8058 compared to 2.40.20: SVGs with many <a href="https://www.w3.org/TR/SVG/coords.html#TransformAttribute"><code>transform</code></a>
8059 attributes would slow it down. It was fixed in 2.42.1. We changed
8060 from using a <a href="https://github.com/lalrpop/lalrpop/issues/269">parser that would recompile regexes</a> each time it was
8061 called, to <a href="https://github.com/servo/rust-cssparser">one that does simple string-based matching</a> and
8062 parsing.</p>
8063 <p>When I rewrote librsvg's parser for the <code>transform</code> attribute from C
8064 to Rust, I was just <a href="https://people.gnome.org/~federico/news-2017-02.html#24">learning about writing parsers in Rust</a>.
8065 I chose <a href="https://github.com/lalrpop/lalrpop">lalrpop</a>, an excellent, Yacc-like parser generator for Rust.
8066 It generates big, fast parsers, like what you would need for a
8067 compiler — but it compiles the tokenizer's regexes each time you call
8068 the parser. This is not a problem for a compiler, where you basically
8069 call the parser only once, but in librsvg, we may call it thousands of
8070 times for an SVG file with thousands of objects with <code>transform</code>
8071 attributes.</p>
8072 <p>So, for 2.42.1 I rewrote that parser using
8073 <a href="https://github.com/servo/rust-cssparser">rust-cssparser</a>. This is what <a href="https://servo.org/">Servo</a> uses to
8074 parse CSS data; it's a simple tokenizer with an API that knows about
8075 CSS's particular constructs. This is exactly the kind of data that
8076 librsvg cares about. Today all of librsvg's internal parsers work
8077 using rust-cssparser, or they are so simple that they can be done with
8078 Rust's normal functions to split strings and such.</p>
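<p>As an illustration of the "split strings and such" style, here is a small sketch that parses a single <code>translate(...)</code> value with nothing but standard string functions. This is not librsvg's parser; <code>parse_translate()</code> is a hypothetical name, and the sketch is stricter than SVG (for example, it does not allow whitespace before the opening parenthesis).</p>

```rust
/// Parses a string like "translate(10, 20)" into (tx, ty).
/// Returns None for anything it does not understand.
/// Hypothetical helper, not part of librsvg's API.
fn parse_translate(s: &str) -> Option<(f64, f64)> {
    // Peel off the fixed prefix and suffix; bail out if either is missing.
    let inner = s.trim().strip_prefix("translate(")?.strip_suffix(')')?;

    // Arguments may be separated by commas, whitespace, or both.
    let mut parts = inner
        .split(|c: char| c == ',' || c.is_whitespace())
        .filter(|p| !p.is_empty());

    let tx: f64 = parts.next()?.parse().ok()?;
    // In SVG, a missing second argument means ty = 0.
    let ty: f64 = match parts.next() {
        Some(p) => p.parse().ok()?,
        None => 0.0,
    };

    // Anything left over means there were too many arguments.
    if parts.next().is_some() {
        return None;
    }
    Some((tx, ty))
}
```

<p>No regexes are compiled on each call; the work is a few scans over a short string.</p>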
8079 <h1>Getting good timings</h1>
8080 <p>Librsvg ships with <code>rsvg-convert</code>, a command-line utility that can
8081 render an SVG file and write the output to a PNG. While it would be
8082 possible to get timings for SVG rendering by timing how long
8083 <code>rsvg-convert</code> takes to run, it's a bit clunky for that. The process
8084 startup adds noise to the timings, and it only handles one file at a
8085 time.</p>
8086 <p>So, I've written <a href="https://gitlab.gnome.org/federico/rsvg-bench">rsvg-bench</a>, a small utility to get timings out of
8087 librsvg. I wanted a tool that:</p>
8088 <ul>
8089 <li>
8090 <p>Is able to process many SVG images with a single command. For
8091 example, this lets us answer a question like, "how long does version
8092 N of librsvg take to render a directory full of SVG icons?" — which
8093 is important for the performance of an application chooser.</p>
8094 </li>
8095 <li>
8096 <p>Is able to <em>repeatedly</em> process SVG files, for example, "render this
8097 SVG 1000 times in a row". This is useful to get accurate timings,
8098 as a single render may only take a few microseconds and may be hard
8099 to measure. It also helps with running profilers, as they will be
8100 able to get more useful samples if the SVG rendering process runs
8101 repeatedly for a long time.</p>
8102 </li>
8103 <li>
8104 <p>Exercises librsvg's major code paths for parsing and rendering
8105 separately. For example, librsvg uses different parts of the XML
8106 parser depending on whether it is being pushed data, vs. being asked
8107 to pull data from a stream. Also, we may only want to benchmark the
8108 parser but not the renderer; or we may want to parse SVGs only once
8109 but render them many times after that.</p>
8110 </li>
8111 <li>
8112 <p>Is aware of librsvg's peculiarities, such as the extra pass to
8113 convert a Cairo image surface to a GdkPixbuf when one uses the
8114 convenience function <code>rsvg_handle_get_pixbuf()</code>.</p>
8115 </li>
8116 </ul>
8117 <p>Currently rsvg-bench supports all of that.</p>
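<p>The "render this many times in a row" idea can be sketched with the standard library alone. This is only an illustration of the measuring technique, not how rsvg-bench is written; <code>time_repeated()</code> and <code>average()</code> are hypothetical names, and the closure stands in for a parse or render call.</p>

```rust
use std::time::{Duration, Instant};

/// Runs `work` `iterations` times in a row and returns the total elapsed
/// wall-clock time. Timing the whole loop avoids trying to measure a single
/// very short operation.
fn time_repeated<F: FnMut()>(iterations: u32, mut work: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed()
}

/// Average time per iteration; None when `iterations` is zero.
fn average(total: Duration, iterations: u32) -> Option<Duration> {
    total.checked_div(iterations)
}
```

<p>A long-running loop like this is also what gives a sampling profiler enough samples to be useful.</p>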
8118 <h1>An initial benchmark</h1>
8119 <p>I ran this</p>
8120 <p><code>/usr/bin/time rsvg-bench -p 1 -r 1 /usr/share/icons</code></p>
8121 <p>to cause every SVG icon in <code>/usr/share/icons</code> to be parsed once, and
8122 rendered once (i.e. just render every file sequentially). I did this
8123 for librsvg 2.40.20 (C only), and 2.42.{0, 1, 2} (C and Rust). There
8124 are 5522 SVG files in there. The timings look like this:</p>
8125 <table>
8126 <thead>
8127 <tr>
8128 <th>version</th>
8129 <th>time (sec)</th>
8130 </tr>
8131 </thead>
8132 <tbody>
8133 <tr>
8134 <td>2.40.20</td>
8135 <td>95.54</td>
8136 </tr>
8137 <tr>
8138 <td>2.42.0</td>
8139 <td>209.50</td>
8140 </tr>
8141 <tr>
8142 <td>2.42.1</td>
8143 <td>97.18</td>
8144 </tr>
8145 <tr>
8146 <td>2.42.2</td>
8147 <td>95.89</td>
8148 </tr>
8149 </tbody>
8150 </table>
8151 <p><img alt="Bar chart of timings" src="https://people.gnome.org/~federico/blog/images/rsvg-bench-timings.png"></p>
8152 <p>So, 2.42.0 was over twice as slow as the C-only version, due to the
8153 parsing problems. But now, 2.42.2 is practically just as fast as the
8154 C-only version. What made this possible?</p>
8155 <ul>
8156 <li>2.40.20 - the old C-only version</li>
8157 <li>2.42.0 - C + Rust, with a lalrpop parser for the <code>transform</code> attribute</li>
8158 <li>2.42.1 - Servo's cssparser for the <code>transform</code> attribute</li>
8159 <li>2.42.2 - removed most C-to-Rust string copies during parsing</li>
8160 </ul>
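<p>For a quick sanity check of "over twice as slow", the timings from the table above can be turned into ratios against the C-only release. The numbers are copied from the table; the function names are made up for this sketch.</p>

```rust
/// Slowdown of `secs` relative to `baseline_secs` (1.0 means equally fast).
fn slowdown(secs: f64, baseline_secs: f64) -> f64 {
    secs / baseline_secs
}

/// Timings copied from the table: (version, seconds).
const TIMINGS: [(&str, f64); 4] = [
    ("2.40.20", 95.54),
    ("2.42.0", 209.50),
    ("2.42.1", 97.18),
    ("2.42.2", 95.89),
];

/// One formatted line per version, relative to the first entry
/// (the C-only 2.40.20 release).
fn report() -> Vec<String> {
    let baseline = TIMINGS[0].1;
    TIMINGS
        .iter()
        .map(|(version, secs)| format!("{}: {:.2}x", version, slowdown(*secs, baseline)))
        .collect()
}
```

<p>With these inputs, 2.42.0 comes out at about 2.19 times the C-only time, while 2.42.2 is within half a percent of it.</p>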
8161 <p>I have started taking profiles of rsvg-bench runs with sysprof, and
8162 there are some improvements worth making. Expect news soon!</p>
8163 <p><a href="https://gitlab.gnome.org/federico/rsvg-bench">Rsvg-bench is available in Gnome's gitlab instance</a>.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category><category term="performance"></category></entry><entry><title>Help needed for librsvg 2.42.1</title><link href="https://people.gnome.org/~federico/blog/help-needed-for-librsvg-2421.html" rel="alternate"></link><published>2018-01-16T11:01:05-06:00</published><updated>2018-01-16T11:01:05-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-01-16:/~federico/blog/help-needed-for-librsvg-2421.html</id><summary type="html"><p>Would you like to help fix a couple of bugs in <a href="https://gitlab.gnome.org/GNOME/librsvg">librsvg</a>, in
8164 preparation for the 2.42.1 release?</p>
8165 <p>I have prepared a list of bugs which I'd like to be fixed in the
8166 <a href="https://gitlab.gnome.org/GNOME/librsvg/issues?milestone_title=2.42.1">2.42.1 milestone</a>. Two of them are assigned to myself, as
8167 I'm already working …</p></summary><content type="html"><p>Would you like to help fix a couple of bugs in <a href="https://gitlab.gnome.org/GNOME/librsvg">librsvg</a>, in
8168 preparation for the 2.42.1 release?</p>
8169 <p>I have prepared a list of bugs which I'd like to be fixed in the
8170 <a href="https://gitlab.gnome.org/GNOME/librsvg/issues?milestone_title=2.42.1">2.42.1 milestone</a>. Two of them are assigned to myself, as
8171 I'm already working on them.</p>
8172 <p>There are two other bugs which I'd love someone to look at. Neither
8173 of these requires deep knowledge of librsvg, just some debugging and
8174 code-writing:</p>
8175 <ul>
8176 <li>
8177 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/issues/141">Bug 141</a> - GNOME's thumbnailing machinery creates an icon which has
8178 the wrong fill: it's an image of a builder's trowel, and the inside
8179 is filled black instead of with a nice gradient. This is the only
8180 place in librsvg where a <code>cairo_surface_t</code> is converted to a
8181 <code>GdkPixbuf</code>; this involves unpremultiplying the alpha channel.
8182 Maybe the relevant function is buggy?</p>
8183 </li>
8184 <li>
8185 <p><a href="https://gitlab.gnome.org/GNOME/librsvg/issues/136">Bug 136</a>: The <code>stroke-dasharray</code> attribute in SVG elements is
8186 parsed incorrectly. It is a list of CSS length values, separated by
8187 commas or spaces. Currently librsvg uses a shitty parser based on
8188 <code>g_strsplit()</code> only for commas; it doesn't allow just a
8189 space-separated list. Then, it uses <code>g_ascii_strtod()</code> to parse
8190 plain numbers; it doesn't support CSS lengths generically. This
8191 parser needs to be rewritten in Rust; we already have machinery
8192 there to parse CSS length values properly.</p>
8193 </li>
8194 </ul>
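<p>To give an idea of what the replacement for the <code>stroke-dasharray</code> parser has to do, here is a rough sketch of splitting on commas or spaces and reading each item as a number with an optional unit. It is not librsvg's code and does not use its existing length machinery; <code>Length</code>, <code>parse_length()</code> and <code>parse_dash_array()</code> are hypothetical, units are not validated, and exponent notation such as <code>1e3</code> is not handled.</p>

```rust
/// A number with an optional unit suffix, e.g. "5", "5px" or "50%".
#[derive(Debug, PartialEq)]
struct Length {
    value: f64,
    unit: String, // empty for a plain number
}

/// Splits one item into its numeric part and a trailing unit.
/// The unit starts at the first ASCII letter or '%'.
fn parse_length(item: &str) -> Option<Length> {
    let unit_start = item
        .find(|c: char| c.is_ascii_alphabetic() || c == '%')
        .unwrap_or(item.len());
    let (number, unit) = item.split_at(unit_start);
    let value = number.parse::<f64>().ok()?;
    Some(Length { value, unit: unit.to_string() })
}

/// Parses a list separated by commas, whitespace, or both.
/// Returns None if any item is malformed.
fn parse_dash_array(s: &str) -> Option<Vec<Length>> {
    s.split(|c: char| c == ',' || c.is_whitespace())
        .filter(|item| !item.is_empty())
        .map(parse_length)
        .collect()
}
```

<p>Unlike the comma-only <code>g_strsplit()</code> approach, both <code>"5, 10"</code> and <code>"5 10"</code> yield two items here.</p>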
8195 <p>Feel free to <a href="mailto:federico@gnome.org">contact me</a> by mail, or write something in the
8196 bugs themselves, if you would like to work on them. I'll happily
8197 guide you through the code :)</p></content><category term="misc"></category><category term="librsvg"></category></entry><entry><title>Librsvg gets Continuous Integration</title><link href="https://people.gnome.org/~federico/blog/librsvg-gets-continuous-integration.html" rel="alternate"></link><published>2018-01-12T14:04:20-06:00</published><updated>2018-01-12T14:04:20-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-01-12:/~federico/blog/librsvg-gets-continuous-integration.html</id><summary type="html"><p>One nice thing about <code>gitlab.gnome.org</code> is that we can now have
8198 Continuous Integration (CI) enabled for projects there. After every
8199 commit, the CI machinery can build the project, run the tests, and
8200 tell you if something goes wrong.</p>
8201 <p><a href="https://mail.gnome.org/archives/desktop-devel-list/2018-January/msg00008.html">Carlos Soriano posted</a> a "tips of the
8202 week" mail to …</p></summary><content type="html"><p>One nice thing about <code>gitlab.gnome.org</code> is that we can now have
8203 Continuous Integration (CI) enabled for projects there. After every
8204 commit, the CI machinery can build the project, run the tests, and
8205 tell you if something goes wrong.</p>
8206 <p><a href="https://mail.gnome.org/archives/desktop-devel-list/2018-January/msg00008.html">Carlos Soriano posted</a> a "tips of the
8207 week" mail to desktop-devel-list, and a link to how Nautilus
8208 implements CI in Gitlab. It turns out that it's reasonably easy to
8209 set up: you just create a <a href="https://docs.gitlab.com/ce/ci/yaml/README.html"><code>.gitlab-ci.yml</code></a> file in the
8210 toplevel of your project, and that has the configuration for what to
8211 run on every commit.</p>
8212 <p>Of course instead of reading the manual, I copied-and-pasted the file
8213 from Nautilus and just changed some things in it. <a href="https://gitlab.gnome.org/ci/lint">There is a .yml
8214 linter</a> so you can at least check the syntax before pushing a
8215 full job.</p>
8216 <p>Then I read <a href="https://mail.gnome.org/archives/desktop-devel-list/2018-January/msg00010.html">Robert Ancell's reply</a> about how simple-scan
8217 builds its CI jobs on both Fedora and Ubuntu... and then the
8218 realization hit me:</p>
8219 <p><em>This lets me CI librsvg on multiple distros at once.</em> I've had
8220 trouble with slight differences in fontconfig/freetype in the past,
8221 and this would let me catch them early.</p>
8222 <p>However, people on IRC advised against this, as <strong>we need more
8223 hardware</strong> to run CI on a large scale.</p>
8224 <p>Linux distros have a vested interest in getting code out of gnome.org
8225 that works well. Surely they can give us some hardware?</p></content><category term="misc"></category><category term="librsvg"></category><category term="gitlab"></category></entry><entry><title>Loving Gitlab.gnome.org, and getting notifications</title><link href="https://people.gnome.org/~federico/blog/loving-gitlab.html" rel="alternate"></link><published>2018-01-08T11:45:01-06:00</published><updated>2018-01-08T11:45:01-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2018-01-08:/~federico/blog/loving-gitlab.html</id><summary type="html"><p>I'm loving <a href="https://gitlab.gnome.org"><code>gitlab.gnome.org</code></a>. It has been only a couple of
8226 weeks since <a href="https://people.gnome.org/~federico/blog/librsvg-moves-to-gitlab.html">librsvg moved to gitlab</a>, and I've
8227 already received and merged <a href="https://gitlab.gnome.org/GNOME/librsvg/merge_requests?scope=all&amp;utf8=%E2%9C%93&amp;state=merged">two merge requests</a>. (Isn't it a bit
8228 weird that Github uses "pull request" and Everyone(tm) knows the PR
8229 acronym, but Gitlab uses "merge request"?)</p>
8230 <h1>Notifications …</h1></summary><content type="html"><p>I'm loving <a href="https://gitlab.gnome.org"><code>gitlab.gnome.org</code></a>. It has been only a couple of
8231 weeks since <a href="https://people.gnome.org/~federico/blog/librsvg-moves-to-gitlab.html">librsvg moved to gitlab</a>, and I've
8232 already received and merged <a href="https://gitlab.gnome.org/GNOME/librsvg/merge_requests?scope=all&amp;utf8=%E2%9C%93&amp;state=merged">two merge requests</a>. (Isn't it a bit
8233 weird that Github uses "pull request" and Everyone(tm) knows the PR
8234 acronym, but Gitlab uses "merge request"?)</p>
8235 <h1>Notifications about merge requests</h1>
8236 <p>One thing to note if your GNOME project has moved to Gitlab: <strong>if you
8237 want to get notified of incoming merge requests</strong>, you need
8238 to tell Gitlab that you want to "<strong>Watch</strong>" that project, instead of
8239 using one of the default notification settings. <a href="https://gitlab.gnome.org/GNOMEInfrastructure/GitLab/issues/95">Thanks to Carlos
8240 Soriano</a> for making me aware of this.</p>
8241 <h1>Notifications from Github's mirror</h1>
8242 <p>The <a href="https://github.com/GNOME/">github</a> mirror of git.gnome.org is configured so that pull
8243 requests are <a href="https://wiki.gnome.org/Sysadmin/GitHub">automatically closed</a>, since currently there
8244 is no way to notify the upstream maintainers when someone creates a
8245 pull request in the mirror (this is super-unfriendly, but at least
8246 submitters get notified that, by default, nobody would look at
8247 their PR).</p>
8248 <p>If you have a Github account, you can Watch the project in question to
8249 get notified — the bot will close the pull request, but you will get
8250 notified, and then you can check it by hand, review it as appropriate,
8251 or redirect the submitter to gitlab.gnome.org instead.</p></content><category term="misc"></category><category term="gnome"></category><category term="gitlab"></category></entry><entry><title>Librsvg 2.40.20 is released</title><link href="https://people.gnome.org/~federico/blog/librsvg-24020-is-released.html" rel="alternate"></link><published>2017-12-15T18:31:50-06:00</published><updated>2017-12-15T18:31:50-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-12-15:/~federico/blog/librsvg-24020-is-released.html</id><summary type="html"><p>Today I released <a href="https://ftp.gnome.org/pub/GNOME/sources/librsvg/2.40/">librsvg 2.40.20</a>. This will be the
8252 <strong>last release</strong> in the 2.40.x series, which is deprecated, effective
8253 immediately.</p>
8254 <p>People and distros are <strong>strongly encouraged</strong> to switch to
8255 <a href="https://ftp.gnome.org/pub/GNOME/sources/librsvg/2.41/">librsvg 2.41.x</a> as soon as possible. This is the version that is
8256 implemented in a …</p></summary><content type="html"><p>Today I released <a href="https://ftp.gnome.org/pub/GNOME/sources/librsvg/2.40/">librsvg 2.40.20</a>. This will be the
8257 <strong>last release</strong> in the 2.40.x series, which is deprecated, effective
8258 immediately.</p>
8259 <p>People and distros are <strong>strongly encouraged</strong> to switch to
8260 <a href="https://ftp.gnome.org/pub/GNOME/sources/librsvg/2.41/">librsvg 2.41.x</a> as soon as possible. This is the version that is
8261 implemented in a mixture of C and Rust. It is 100% API and ABI
8262 compatible with 2.40.x, so it is a drop-in replacement for it. If you
8263 or your distro can compile Firefox 57, you can probably build
8264 librsvg-2.41.x without problems.</p>
8265 <h1>Some statistics</h1>
8266 <p>Here are a few runs of <a href="https://github.com/cgag/loc">loc</a> — a tool to count lines of code —
8267 when run on librsvg. The output is trimmed by hand to only include C
8268 and Rust files.</p>
8269 <div class="highlight"><pre><span></span><code><span class="c">This is 2</span><span class="nt">.</span><span class="c">40</span><span class="nt">.</span><span class="c">20:</span>
8270 <span class="nb">-------------------------------------------------------</span><span class="c"></span>
8271 <span class="c"> Language Files Lines Blank Comment Code</span>
8272 <span class="nb">-------------------------------------------------------</span><span class="c"></span>
8273 <span class="c"> C 41 20972 3438 2100 15434</span>
8274 <span class="c"> C/C</span><span class="nb">++</span><span class="c"> Header 27 2377 452 625 1300</span>
8275 </code></pre></div>
8276
8277 <div class="highlight"><pre><span></span><code><span class="c">This is 2</span><span class="nt">.</span><span class="c">41</span><span class="nt">.</span><span class="c">latest (the master branch):</span>
8278 <span class="nb">-------------------------------------------------------</span><span class="c"></span>
8279 <span class="c"> Language Files Lines Blank Comment Code</span>
8280 <span class="nb">-------------------------------------------------------</span><span class="c"></span>
8281 <span class="c"> C 34 17253 3024 1892 12337</span>
8282 <span class="c"> C/C</span><span class="nb">++</span><span class="c"> Header 23 2327 501 624 1202</span>
8283 <span class="c"> Rust 38 11254 1873 675 8706</span>
8284 </code></pre></div>
8285
8286 <div class="highlight"><pre><span></span><code><span class="c">And this is 2</span><span class="nt">.</span><span class="c">41</span><span class="nt">.</span><span class="c">latest *without unit tests*</span><span class="nt">,</span><span class="c"> </span>
8287 <span class="c">just &quot;real source code&quot;:</span>
8288 <span class="nb">-------------------------------------------------------</span><span class="c"></span>
8289 <span class="c"> Language Files Lines Blank Comment Code</span>
8290 <span class="nb">-------------------------------------------------------</span><span class="c"></span>
8291 <span class="c"> C 34 17253 3024 1892 12337</span>
8292 <span class="c"> C/C</span><span class="nb">++</span><span class="c"> Header 23 2327 501 624 1202</span>
8293 <span class="c"> Rust 38 9340 1513 610 7217</span>
8294 </code></pre></div>
8295
8296 <h2>Summary</h2>
8297 <p>Not counting blank lines or comments:</p>
8298 <ul>
8299 <li>
8300 <p>The C-only version has 16734 lines of C code.</p>
8301 </li>
8302 <li>
8303 <p>The C-only version has <strong>no unit tests</strong>, just some integration tests.</p>
8304 </li>
8305 <li>
8306 <p>The Rust-and-C version has 13539 lines of C code, 7217 lines of Rust
8307 code, and 1489 lines of unit tests in Rust.</p>
8308 </li>
8309 </ul>
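<p>The totals in this list follow from the "Code" column of the <code>loc</code> output above: C sources plus headers for the C figures, and the difference between the two Rust rows for the unit tests. A tiny sketch that spells out the arithmetic (the struct and function names are made up for illustration):</p>

```rust
/// The "Code" column of one `loc` run, reduced to the rows of interest.
struct CodeLines {
    c: u32,
    headers: u32,
    rust: u32,
}

// Values copied from the three `loc` listings above.
const V2_40_20: CodeLines = CodeLines { c: 15434, headers: 1300, rust: 0 };
const V2_41_WITH_TESTS: CodeLines = CodeLines { c: 12337, headers: 1202, rust: 8706 };
const V2_41_NO_TESTS: CodeLines = CodeLines { c: 12337, headers: 1202, rust: 7217 };

/// Lines of C code: .c files plus headers.
fn c_total(v: &CodeLines) -> u32 {
    v.c + v.headers
}

/// Lines of Rust unit tests: Rust code with tests minus Rust code without.
fn rust_unit_test_lines() -> u32 {
    V2_41_WITH_TESTS.rust - V2_41_NO_TESTS.rust
}
```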
8310 <p>As for the integration tests:</p>
8311 <ul>
8312 <li>
8313 <p>The C-only version has 64 integration tests.</p>
8314 </li>
8315 <li>
8316 <p>The Rust-and-C version has 130 integration tests.</p>
8317 </li>
8318 </ul>
8319 <p>The Rust-and-C version supports a few more SVG features, and it is A
8320 LOT more robust and spec-compliant with the SVG features that were
8321 supported in the C-only version.</p>
8322 <p>The C sources in librsvg are shrinking steadily. It would be
8323 incredibly awesome if someone could run some <code>git filter-branch</code> magic
8324 with the <a href="https://github.com/cgag/loc"><code>loc</code></a> tool and generate some pretty graphs of source
8325 lines vs. commits over time.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category><category term="rust"></category></entry><entry><title>Librsvg moves to Gitlab</title><link href="https://people.gnome.org/~federico/blog/librsvg-moves-to-gitlab.html" rel="alternate"></link><published>2017-12-13T14:09:32-06:00</published><updated>2017-12-13T14:09:32-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-12-13:/~federico/blog/librsvg-moves-to-gitlab.html</id><summary type="html"><p>Librsvg now lives in GNOME's <a href="https://gitlab.gnome.org/">Gitlab</a> instance. You can
8326 access it <a href="https://gitlab.gnome.org/GNOME/librsvg">here</a>.</p>
8327 <p>Gitlab allows workflows similar to Github: you can create an account
8328 there, fork the librsvg repository, file bug reports, create merge
8329 requests... Hopefully this will make it nicer for contributors.</p>
8330 <p>In the meantime, feel free to <a href="https://gitlab.gnome.org/GNOME/librsvg">take a …</a></p></summary><content type="html"><p>Librsvg now lives in GNOME's <a href="https://gitlab.gnome.org/">Gitlab</a> instance. You can
8331 access it <a href="https://gitlab.gnome.org/GNOME/librsvg">here</a>.</p>
8332 <p>Gitlab allows workflows similar to Github: you can create an account
8333 there, fork the librsvg repository, file bug reports, create merge
8334 requests... Hopefully this will make it nicer for contributors.</p>
8335 <p>In the meantime, feel free to <a href="https://gitlab.gnome.org/GNOME/librsvg">take a look</a>!</p>
8336 <p>This is a huge improvement for GNOME's development infrastructure.
8337 Thanks to Carlos Soriano, Andrea Veri, Philip Chimento, Alberto Ruiz,
8338 and all the people that made the move to Gitlab possible.</p></content><category term="misc"></category><category term="librsvg"></category><category term="gnome"></category></entry><entry><title>A mini-rant on the lack of string slices in C</title><link href="https://people.gnome.org/~federico/blog/rant-on-string-slices.html" rel="alternate"></link><published>2017-12-07T09:54:14-06:00</published><updated>2017-12-07T09:54:14-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-12-07:/~federico/blog/rant-on-string-slices.html</id><summary type="html"><p>Porting of librsvg to Rust goes on. Yesterday I started porting the
8339 C code that implements SVG's <code>&lt;text&gt;</code> family of elements. I have also
8340 been replacing the little parsers in librsvg with Rust code.</p>
8341 <p>And these days, the lack of string slices in C is bothering me <em>a
8342 lot</em>.</p>
8343 <h1>What …</h1></summary><content type="html"><p>Porting of librsvg to Rust goes on. Yesterday I started porting the
8344 C code that implements SVG's <code>&lt;text&gt;</code> family of elements. I have also
8345 been replacing the little parsers in librsvg with Rust code.</p>
8346 <p>And these days, the lack of string slices in C is bothering me <em>a
8347 lot</em>.</p>
8348 <h1>What if...</h1>
8349 <p>It feels like it should be easy to just write something like</p>
8350 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
8351 <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">ptr</span><span class="p">;</span>
8352 <span class="kt">size_t</span> <span class="n">len</span><span class="p">;</span>
8353 <span class="p">}</span> <span class="n">StringSlice</span><span class="p">;</span>
8354 </code></pre></div>
8355
8356 <p>And then a whole family of functions. The starting point, where you
8357 slice a whole string:</p>
8358 <div class="highlight"><pre><span></span><code><span class="n">StringSlice</span>
8359 <span class="nf">make_slice_from_string</span> <span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">s</span><span class="p">)</span>
8360 <span class="p">{</span>
8361 <span class="n">StringSlice</span> <span class="n">slice</span><span class="p">;</span>
8362
8363 <span class="n">assert</span> <span class="p">(</span><span class="n">s</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">);</span>
8364
8365 <span class="n">slice</span><span class="p">.</span><span class="n">ptr</span> <span class="o">=</span> <span class="n">s</span><span class="p">;</span>
8366 <span class="n">slice</span><span class="p">.</span><span class="n">len</span> <span class="o">=</span> <span class="n">strlen</span> <span class="p">(</span><span class="n">s</span><span class="p">);</span>
8367 <span class="k">return</span> <span class="n">slice</span><span class="p">;</span>
8368 <span class="p">}</span>
8369 </code></pre></div>
8370
8371 <p>But that wouldn't keep track of the lifetime of the original string.
8372 Okay, this is C, so you are used to keeping track of that yourself.</p>
8373 <p>Onwards. Substrings?</p>
8374 <div class="highlight"><pre><span></span><code><span class="n">StringSlice</span>
8375 <span class="nf">make_sub_slice</span><span class="p">(</span><span class="n">StringSlice</span> <span class="n">slice</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">start</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">len</span><span class="p">)</span>
8376 <span class="p">{</span>
8377 <span class="n">StringSlice</span> <span class="n">sub</span><span class="p">;</span>
8378
8379 <span class="n">assert</span> <span class="p">(</span><span class="n">len</span> <span class="o">&lt;=</span> <span class="n">slice</span><span class="p">.</span><span class="n">len</span><span class="p">);</span>
8380 <span class="n">assert</span> <span class="p">(</span><span class="n">start</span> <span class="o">&lt;=</span> <span class="n">slice</span><span class="p">.</span><span class="n">len</span> <span class="o">-</span> <span class="n">len</span><span class="p">);</span> <span class="cm">/* Not &quot;start + len &lt;= slice.len&quot; or it can overflow. */</span>
8381 <span class="cm">/* The subtraction can&#39;t underflow because of the previous assert */</span>
8382 <span class="n">sub</span><span class="p">.</span><span class="n">ptr</span> <span class="o">=</span> <span class="n">slice</span><span class="p">.</span><span class="n">ptr</span> <span class="o">+</span> <span class="n">start</span><span class="p">;</span>
8383 <span class="n">sub</span><span class="p">.</span><span class="n">len</span> <span class="o">=</span> <span class="n">len</span><span class="p">;</span>
8384 <span class="k">return</span> <span class="n">sub</span><span class="p">;</span>
8385 <span class="p">}</span>
8386 </code></pre></div>
8387
8388 <p>Then you could write a million wrappers for <code>g_strsplit()</code> and
8389 friends, or equivalents to them, to give you slices instead of C
8390 strings. But then:</p>
8391 <ul>
8392 <li>
8393 <p>You have to keep track of lifetimes yourself.</p>
8394 </li>
8395 <li>
8396 <p>You have to wrap every function that returns a plain "<code>char *</code>"...</p>
8397 </li>
8398 <li>
8399 <p>... and every function that takes a plain "<code>char *</code>" as an argument,
8400 without a length parameter, because...</p>
8401 </li>
8402 <li>
8403 <p>You <strong>CANNOT</strong> take <code>slice.ptr</code> and pass it to a function that just
8404 expects a plain "<code>char *</code>", because your slice does not include a
8405 nul terminator (the <code>'\0'</code> byte at the end of a C string). This is
8406 what kills the whole plan.</p>
8407 </li>
8408 </ul>
8409 <p>Even if you had a helper library that implements C string slices
8410 like that, you would have a mismatch every time you needed to call a C
8411 function that expects a conventional C string in the form of a
8412 "<code>char *</code>". <em>You need to put a nul terminator somewhere</em>, and if you
8413 only have a slice, you need to <em>allocate memory</em>, copy the slice into
8414 it, and slap a 0 byte at the end. <em>Then</em> you can pass that to a
8415 function that expects a normal C string.</p>
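<p>As a minimal sketch of that allocate-copy-terminate step, assuming
<code>StringSlice</code> is just a pointer plus a length as in the snippets above:
the helper name <code>slice_to_c_string</code> is hypothetical, and plain
<code>malloc()</code> stands in for whatever allocator your code base uses.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout of the slice: a pointer plus a length, with no
 * guarantee of a nul terminator after the last byte. */
typedef struct {
    const char *ptr;
    size_t len;
} StringSlice;

/* Hypothetical helper: returns a newly allocated, nul-terminated copy
 * of the slice, or NULL if the allocation fails.  The caller owns the
 * result and must free() it. */
char *
slice_to_c_string (StringSlice slice)
{
    char *copy;

    assert (slice.ptr != NULL || slice.len == 0);

    if (slice.len == SIZE_MAX)   /* len + 1 would overflow */
        return NULL;

    copy = malloc (slice.len + 1);
    if (copy == NULL)
        return NULL;

    if (slice.len > 0)
        memcpy (copy, slice.ptr, slice.len);

    copy[slice.len] = '\0';      /* the terminator the slice never had */
    return copy;
}
```

<p>Every call into a conventional "<code>char *</code>" API then costs one
allocation and one copy, which is exactly the mismatch described above.</p>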
8416 <p>There is hacky C code that needs to pass a substring to another
8417 function, so it <em>overwrites the byte after the substring with a 0</em>,
8418 passes the substring, and <em>overwrites the byte back</em>. This is
8419 horrible, and doesn't work with strings that live in read-only
8420 memory. But that's the best that C lets you do.</p>
8421 <p>I'm very happy with string slices in Rust, which work exactly like the
8422 <code>StringSlice</code> above, but <code>&amp;str</code> is actually at the language level and
8423 everything knows how to handle it.</p>
8424 <p>The <code>glib-rs</code> crate has conversion traits to go from Rust strings or
8425 slices into C, and vice-versa. We already saw some of those in the
8426 blog post about <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html#pointer-types">conversions in Glib-rs</a>.</p>
8427 <h1>Sizes of things</h1>
8428 <p>Rust uses <code>usize</code> to specify the size of things; it's an unsigned
8429 integer; 32 bits on 32-bit machines, and 64 bits on 64-bit machines;
8430 it's like C's <code>size_t</code>.</p>
8431 <p>In the Glib/C world, we have an assortment of types to represent the
8432 sizes of things:</p>
8433 <ul>
8434 <li>
8435 <p><code>gsize</code>, the same as <code>size_t</code>. This is an unsigned integer; it's okay.</p>
8436 </li>
8437 <li>
8438 <p><code>gssize</code>, a signed integer of the same size as <code>gsize</code>. This is
8439 okay if used to represent a negative offset, and <em>really funky</em> in
8440 the Glib functions like
8441 <code>g_string_new_len (const char *str, gssize len)</code>, where <code>len == -1</code>
8442 means "call <code>strlen(str)</code> for me because I'm too lazy to compute the
8443 length myself".</p>
8444 </li>
8445 <li>
8446 <p><code>int</code> - broken, as in libxml2, but we can't change the API. On
8447 64-bit machines, an <code>int</code> to specify a length means you can't pass
8448 objects bigger than 2 GB.</p>
8449 </li>
8450 <li>
8451 <p><code>long</code> - marginally better than <code>int</code>, since it has a better chance
8452 of actually being the same size as <code>size_t</code>, but still funky.
8453 Probably okay for negative offsets; problematic for sizes which
8454 should really be unsigned.</p>
8455 </li>
8456 <li>
8457 <p>etc.</p>
8458 </li>
8459 </ul>
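<p>To make two of those cases concrete, here is a small sketch in plain C.
Both function names are hypothetical, and <code>ptrdiff_t</code> is only a
stand-in for <code>gssize</code> so that the example does not depend on Glib.
The first function mimics the "<code>len == -1</code> means call
<code>strlen()</code>" convention; the second is the kind of check you need before
handing a <code>size_t</code> length to an API that takes an <code>int</code>.</p>

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mimics the Glib convention for signed lengths: a negative length
 * means "the string is nul-terminated, measure it for me". */
size_t
resolve_length (const char *str, ptrdiff_t len)
{
    assert (str != NULL);
    return len < 0 ? strlen (str) : (size_t) len;
}

/* Guard for APIs whose length parameter is an int: anything larger
 * than INT_MAX cannot be represented and must be rejected. */
bool
length_fits_in_int (size_t len)
{
    return len <= (size_t) INT_MAX;
}
```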
8460 <p>I'm not sure how old <code>size_t</code> is in the C standard library, but it
8461 can't have been there since the beginning of time &mdash; otherwise
8462 people wouldn't have been using <code>int</code> to specify the sizes of things.</p></content><category term="misc"></category><category term="rust"></category><category term="librsvg"></category></entry><entry><title>Code Hospitality</title><link href="https://people.gnome.org/~federico/blog/code-hospitality.html" rel="alternate"></link><published>2017-11-17T14:21:04-06:00</published><updated>2017-11-17T14:21:06-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-11-17:/~federico/blog/code-hospitality.html</id><summary type="html"><p>Recently on the <a href="http://www.greaterthancode.com/">Greater than Code podcast</a> there was an episode
8463 called "<a href="http://www.greaterthancode.com/podcast/054-code-hospitality-with-nadia-odunayo/">Code Hospitality</a>", by
8464 <a href="http://www.nadiaodunayo.com/">Nadia Odunayo</a>. </p>
8465 <p>Nadia talks about thinking of how to make people comfortable in your
8466 code and in your team/organization/etc., and does it in terms of
8467 thinking about host/guest relationships. Have you ever …</p></summary><content type="html"><p>Recently on the <a href="http://www.greaterthancode.com/">Greater than Code podcast</a> there was an episode
8468 called "<a href="http://www.greaterthancode.com/podcast/054-code-hospitality-with-nadia-odunayo/">Code Hospitality</a>", by
8469 <a href="http://www.nadiaodunayo.com/">Nadia Odunayo</a>. </p>
8470 <p>Nadia talks about thinking of how to make people comfortable in your
8471 code and in your team/organization/etc., and does it in terms of
8472 thinking about host/guest relationships. Have you ever stayed in an
8473 AirBnB where the host carefully prepares some "welcome instructions"
8474 for you, or puts little notes in their apartment to orient/guide you,
8475 or gives you basic guidance around their city's transportation system?
8476 We can think in similar ways of how to make people comfortable with
8477 code bases.</p>
8478 <p>This of course hit me on so many levels, because in the past I've
8479 written about analogies between software and urbanism/architecture.
8480 <a href="https://people.gnome.org/~federico/docs/software-with-qwan/index.html">Software that has the Quality Without A Name</a>
8481 talks about Christopher Alexander's architecture/urbanism patterns in
8482 the context of software, based on <a href="http://dreamsongs.com/">Richard Gabriel</a>'s ideas,
8483 and <a href="http://zeta.math.utsa.edu/~yxk833/">Nikos Salingaros</a>'s formalization of the design
8484 process. <a href="https://people.gnome.org/~federico/blog/legacy-systems-as-old-cities.html">Legacy Systems as Old Cities</a> talks about
8485 how GNOME evolved parts of its user-visible software, and makes an
8486 analogy with cities that evolve over time instead of being torn down
8487 and rebuilt, based on urbanism ideas by Jane Jacobs, and
8488 architecture/construction ideas by Stewart Brand.</p>
8489 <p>I definitely intend to do some thinking on Nadia's ideas for Code
8490 Hospitality and try to connect them with this.</p>
8491 <p>In the meantime, I've just rewritten the <a href="https://github.com/federicomenaquintero/gnome-class/blob/master/README.md">README in
8492 gnome-class</a> to make it suitable as an
8493 introduction to hacking there.</p></content><category term="misc"></category><category term="software with living structure"></category></entry><entry><title>Rust+GNOME Hackfest in Berlin, 2017</title><link href="https://people.gnome.org/~federico/blog/rust-gnome-hackfest-berlin.html" rel="alternate"></link><published>2017-11-16T18:33:58-06:00</published><updated>2017-11-16T18:34:01-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-11-16:/~federico/blog/rust-gnome-hackfest-berlin.html</id><summary type="html"><p>Last weekend I was in Berlin for the <a href="https://wiki.gnome.org/Hackfests/Rust2017-2">second Rust+GNOME
8494 Hackfest</a>, kindly hosted at the <a href="https://kinvolk.io/">Kinvolk</a> office.
8495 This is in a <em>great</em> location, half a block away from the <a href="https://www.openstreetmap.org/node/3900136339">Kottbusser
8496 Tor</a> station, right at the entrance of the trendy Kreuzberg
8497 neighborhood — full of interesting people, incredible graffiti, and
8498 good …</p></summary><content type="html"><p>Last weekend I was in Berlin for the <a href="https://wiki.gnome.org/Hackfests/Rust2017-2">second Rust+GNOME
8499 Hackfest</a>, kindly hosted at the <a href="https://kinvolk.io/">Kinvolk</a> office.
8500 This is in a <em>great</em> location, half a block away from the <a href="https://www.openstreetmap.org/node/3900136339">Kottbusser
8501 Tor</a> station, right at the entrance of the trendy Kreuzberg
8502 neighborhood — full of interesting people, incredible graffiti, and
8503 good, diverse food.</p>
8504 <p><a href="https://people.gnome.org/~federico/blog/images/kottbusser.jpg"><img alt="Rug of Kottbusser Tor" src="https://people.gnome.org/~federico/blog/images/kottbusser-thumb.jpg"></a></p>
8505 <h1>My goals for the hackfest</h1>
8506 <p>Over the past weeks I had been converting <a href="https://github.com/federicomenaquintero/gnome-class">gnome-class</a>
8507 from the old <a href="https://github.com/nikomatsakis/lalrpop/">lalrpop</a>-based parser into the new Procedural
8508 Macros framework for Rust, or <code>proc-macro2</code> for short. To do this the
8509 parser for the gnome-class mini-language needs to be rewritten from
8510 being specified in a lalrpop grammar, to using Rust's <a href="https://github.com/dtolnay/syn/">syn</a>
8511 crate.</p>
8512 <p>Syn is a parser for Rust source code, written as a set of <a href="https://github.com/geal/nom">nom</a>
8513 combinator parser macros. For gnome-class we want to extend the Rust
8514 language with a few conveniences to be able to specify GObject
8515 classes/subclasses, methods, signals, properties, interfaces, and all
8516 the goodies that GObject Introspection would expect.</p>
8517 <p>During the hackfest, <a href="https://github.com/alexcrichton/">Alex Crichton</a>, from the Rust core team,
8518 kindly took over my baby steps in compiler writing and made everything
8519 much more functional. It was invaluable to have him there to reason
8520 about macro hygiene (we <em>are</em> generating an unhygienic macro!), bugs
8521 in the quoting system, and general Rust-iness of the whole thing.</p>
8522 <p>I was also able to talk to <a href="https://coaxion.net/blog/">Sebastian Dröge</a> about his work in
8523 writing GObjects in Rust by hand, for GStreamer, and what sort of
8524 things gnome-class could make easier. Sebastian knows GObject very
8525 well, and has been doing awesome work in making it easy to derive
8526 GObjects by hand in Rust, without lots of boilerplate — something with
8527 which gnome-class can certainly help.</p>
8528 <p>I was also looking forward to talking again with <a href="https://github.com/GuillaumeGomez">Guillaume
8529 Gomez</a>, one of the maintainers of <a href="http://gtk-rs.org/">gtk-rs</a>, and who
8530 does so much work in the Rust ecosystem that I can't believe he has
8531 time for it all.</p>
8532 <p><img alt="Graffiti heads" src="https://people.gnome.org/~federico/blog/images/graffitti-heads.jpg"></p>
8533 <h1>Extend the Rust language for GObject? Like Vala?</h1>
8534 <p>Yeah, pretty much.</p>
8535 <p>Except that instead of a wholly new language, we use Rust as-is, and
8536 we just add syntactic constructs that make it easy to write GObjects
8537 without boilerplate. For example, this works right now:</p>
8538 <div class="highlight"><pre><span></span><code><span class="cp">#![feature(proc_macro)]</span><span class="w"></span>
8539
8540 <span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">gobject_gen</span><span class="p">;</span><span class="w"></span>
8541
8542 <span class="cp">#[macro_use]</span><span class="w"></span>
8543 <span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">glib</span><span class="p">;</span><span class="w"></span>
8544 <span class="k">use</span><span class="w"> </span><span class="n">gobject_gen</span>::<span class="n">gobject_gen</span><span class="p">;</span><span class="w"></span>
8545
8546 <span class="n">gobject_gen</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8547 <span class="w"> </span><span class="c1">// Derives from GObject</span>
8548 <span class="w"> </span><span class="n">class</span><span class="w"> </span><span class="n">One</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8549 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8550
8551 <span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">One</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8552 <span class="w"> </span><span class="c1">// non-virtual method</span>
8553 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">one</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w"></span>
8554 <span class="w"> </span><span class="mi">1</span><span class="w"></span>
8555 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8556
8557 <span class="w"> </span><span class="kr">virtual</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w"></span>
8558 <span class="w"> </span><span class="mi">1</span><span class="w"></span>
8559 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8560 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8561
8562 <span class="w"> </span><span class="c1">// Inherits from our other class</span>
8563 <span class="w"> </span><span class="n">class</span><span class="w"> </span><span class="n">Two</span>: <span class="nc">One</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8564 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8565
8566 <span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">One</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Two</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8567 <span class="w"> </span><span class="c1">// overrides the virtual method</span>
8568 <span class="w"> </span><span class="c1">// maybe we should use &quot;override&quot; instead of &quot;virtual&quot; here?</span>
8569 <span class="w"> </span><span class="kr">virtual</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w"></span>
8570 <span class="w"> </span><span class="mi">2</span><span class="w"></span>
8571 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8572 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8573 <span class="p">}</span><span class="w"></span>
8574
8575 <span class="cp">#[test]</span><span class="w"></span>
8576 <span class="k">fn</span> <span class="nf">test</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8577 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">one</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">One</span>::<span class="n">new</span><span class="p">();</span><span class="w"></span>
8578 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">two</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Two</span>::<span class="n">new</span><span class="p">();</span><span class="w"></span>
8579
8580 <span class="w"> </span><span class="fm">assert!</span><span class="p">(</span><span class="n">one</span><span class="p">.</span><span class="n">one</span><span class="p">()</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">1</span><span class="p">);</span><span class="w"></span>
8581 <span class="w"> </span><span class="fm">assert!</span><span class="p">(</span><span class="n">one</span><span class="p">.</span><span class="n">get</span><span class="p">()</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">1</span><span class="p">);</span><span class="w"></span>
8582 <span class="w"> </span><span class="fm">assert!</span><span class="p">(</span><span class="n">two</span><span class="p">.</span><span class="n">one</span><span class="p">()</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">1</span><span class="p">);</span><span class="w"></span>
8583 <span class="w"> </span><span class="fm">assert!</span><span class="p">(</span><span class="n">two</span><span class="p">.</span><span class="n">get</span><span class="p">()</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">2</span><span class="p">);</span><span class="w"></span>
8584 <span class="p">}</span><span class="w"></span>
8585 </code></pre></div>
8586
8587 <p>This generates a little boatload of <a href="https://people.gnome.org/~federico/blog/docs/generated.rs">generated code</a>,
8588 including a good number of <code>unsafe</code> calls to GObject functions
8589 like <code>g_type_register_static_simple()</code>. It also creates all the
8590 traits and paraphernalia that Glib-rs would create for the Rust
8591 binding of a normal GObject written in C.</p>
8592 <p>The idea is that from the outside world, your generated GObject
8593 classes are indistinguishable from GObjects implemented in C.</p>
8594 <p><em>The idea is to write GObject libraries in a better language than C,
8595 which can then be consumed from language bindings.</em></p>
8596 <h2>Current status of gnome-class</h2>
8597 <p>Up to about two weeks before the hackfest, the syntax for this
8598 mini-language was totally ad-hoc and limited. After a very productive
8599 <a href="https://mail.gnome.org/archives/rust-list/2017-October/msg00000.html">discussion on the mailing list</a>, we came up with a better
8600 syntax that definitely looks more Rust-like. It is also easier to
8601 implement, since the Rust parser in syn can be mostly reused as-is, or
8602 pruned down for the parts where we only support GObject-like methods,
8603 and not all the Rust bells and whistles (generics, lifetimes, trait
8604 bounds).</p>
8605 <p>Gnome-class supports deriving classes directly from the basic GObject,
8606 or from other GObject subclasses in the style of <a href="https://github.com/gtk-rs/glib">glib-rs</a>.</p>
8607 <p>You can define virtual and non-virtual methods. You can override
8608 virtual methods from your superclasses.</p>
8609 <p>Not all argument types are supported. In the end we should support
8610 argument types which are convertible from Rust to C types. We need to
8611 finish figuring out the annotations for ownership transfer of
8612 references.</p>
8613 <p>We don't support GObject signals yet; I think that's my next task.</p>
8614 <p>We don't support GObject properties yet.</p>
8615 <p>We don't support defining new GType interfaces yet, but it is planned.
8616 It should be easy to support implementing existing interfaces, as it
8617 is pretty much the same as implementing a subclass.</p>
8618 <p>The best way to see what works right now is probably to <a href="https://github.com/federicomenaquintero/gnome-class/tree/master/tests">look at the
8619 examples</a>, which also work as tests.</p>
8620 <h1>Digression on macro hygiene</h1>
8621 <p>Rust macros are <em>hygienic</em>, unlike C macros which work just through
8622 textual substitution. That is, names declared inside Rust macros will
8623 not clash with names in the calling code.</p>
8624 <p>One peculiar thing about gnome-class is that the user gives us a few
8625 names, like a class name <code>Foo</code> and some things inside it, say, a
8626 method name <code>bar</code>, and a signal <code>baz</code> and a property <code>qux</code>.
8627 From there we want to generate a bunch of boilerplate for GObject
8628 registration and implementation. Some of the generated names in that
8629 boilerplate would be</p>
8630 <div class="highlight"><pre><span></span><code><span class="n">Foo</span> <span class="o">//</span> <span class="n">base</span> <span class="n">name</span>
8631 <span class="n">FooClass</span> <span class="o">//</span> <span class="n">generated</span> <span class="n">name</span> <span class="k">for</span> <span class="n">the</span> <span class="k">class</span> <span class="n">struct</span>
8632 <span class="n">Foo</span><span class="p">::</span><span class="n">bar</span><span class="p">()</span> <span class="o">//</span> <span class="n">A</span> <span class="n">method</span>
8633 <span class="n">Foo</span><span class="p">::</span><span class="n">emit_baz</span><span class="p">()</span> <span class="o">//</span> <span class="n">Generated</span> <span class="n">from</span> <span class="n">the</span> <span class="k">signal</span> <span class="n">name</span>
8634 <span class="n">Foo</span><span class="p">::</span><span class="n">set_qux</span><span class="p">()</span> <span class="o">//</span> <span class="n">Generated</span> <span class="n">property</span> <span class="n">setter</span>
8635 <span class="n">foo_bar</span><span class="p">()</span> <span class="o">//</span> <span class="n">Generated</span> <span class="n">C</span> <span class="n">function</span> <span class="k">for</span> <span class="n">a</span> <span class="n">method</span> <span class="n">call</span>
8636 <span class="n">foo_get_type</span><span class="p">()</span> <span class="o">//</span> <span class="n">Generated</span> <span class="n">C</span> <span class="n">function</span> <span class="n">that</span> <span class="n">all</span> <span class="n">GObjects</span> <span class="n">have</span>
8637 </code></pre></div>
8638
8639 <p>However, if we want to actually generate those names inside our
8640 gnome-class macro <em>and make them visible to the caller</em>, we need to do
8641 so <em>unhygienically</em>. Alex started a <a href="https://github.com/rust-lang/rust/issues/45934">very interesting discussion
8642 on macro hygiene</a>, so expect some news in the Rust world
8643 soon.</p>
8644 <p>TL;DR: there is a difference between a <em>code generator</em>, which
8645 gnome-class mostly intends to be, and a <em>macro system</em> which is just
8646 an aid in typing repetitive code.</p>
8647 <p><img alt="Fuck wars" src="https://people.gnome.org/~federico/blog/images/fuck-wars.jpg"></p>
8648 <h1>People for whom to be thankful</h1>
8649 <p>During the hackfest, <a href="http://blog.nirbheek.in/">Nirbheek</a> has been porting librsvg
8650 from Autotools to the Meson build system, and dealing with Rust
8651 peculiarities along the way. This is exactly what I needed! Thanks,
8652 Nirbheek!</p>
8653 <p><a href="https://coaxion.net/blog/">Sebastian</a> answered many of my questions about GObject
8654 internals and how to use them from the Rust side.</p>
8655 <p><a href="http://zee-nix.blogspot.com/">Zeeshan</a> took us to a bunch of good restaurants. Korean,
8656 ramen, Greek, excellent pizza... My stomach is definitely thankful.</p>
8657 <h1>Berlin</h1>
8658 <p>I love Berlin. It is a cosmopolitan, progressive, LGBTQ-friendly
8659 city, with lots of things to do, vast distances to be traveled, with
8660 good public transport and bike lanes, diverse food to be eaten along
8661 the way...</p>
8662 <p>But damnit, it's also cold at this time of the year. I don't think
8663 the weather was ever above 10°C while we were there, and mostly in a
8664 constant state of not-quite-rain. This is much different from the Berlin
8665 in the summer that I knew!</p>
8666 <p><a href="https://people.gnome.org/~federico/blog/images/kimchi.jpg"><img alt="Hackers at Kimchi Princess" src="https://people.gnome.org/~federico/blog/images/kimchi-thumb.jpg"></a></p>
8667 <p>This is my third time visiting Berlin. The first one was during the
8668 Desktop Summit in 2011, and the second one was when my family and I
8669 visited the city two years ago. It is a city that I would definitely
8670 like to know better.</p>
8671 <h1>Thanks to the GNOME Foundation...</h1>
8672 <p>... for sponsoring my travel and accommodation during the hackfest.</p>
8673 <p><img alt="Sponsored by the GNOME Foundation" src="https://people.gnome.org/~federico/blog/images/sponsored-badge-shadow.png"></p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category><category term="Berlin"></category><category term="hackfests"></category></entry><entry><title>Compilation notifications in Emacs</title><link href="https://people.gnome.org/~federico/blog/compilation-notifications-in-emacs.html" rel="alternate"></link><published>2017-11-07T10:47:52-06:00</published><updated>2017-11-07T10:47:52-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-11-07:/~federico/blog/compilation-notifications-in-emacs.html</id><summary type="html"><p>Here is a little Emacs Lisp snippet that I've started using. It makes
8674 Emacs pop up a desktop-wide notification when a compilation finishes,
8675 i.e. after "<code>M-x compile</code>" is done. Let's see if that keeps me from
8676 wasting time on the web when I launch a compilation.</p>
8677 <div class="highlight"><pre><span></span><code><span class="p">(</span><span class="k">setq</span> <span class="nv">compilation-finish-functions</span>
8678 <span class="p">(</span><span class="nb">append …</span></code></pre></div></summary><content type="html"><p>Here is a little Emacs Lisp snippet that I've started using. It makes
8679 Emacs pop up a desktop-wide notification when a compilation finishes,
8680 i.e. after "<code>M-x compile</code>" is done. Let's see if that keeps me from
8681 wasting time on the web when I launch a compilation.</p>
8682 <div class="highlight"><pre><span></span><code><span class="p">(</span><span class="k">setq</span> <span class="nv">compilation-finish-functions</span>
8683 <span class="p">(</span><span class="nb">append</span> <span class="nv">compilation-finish-functions</span>
8684 <span class="o">&#39;</span><span class="p">(</span><span class="nv">fmq-compilation-finish</span><span class="p">)))</span>
8685
8686 <span class="p">(</span><span class="nb">defun</span> <span class="nv">fmq-compilation-finish</span> <span class="p">(</span><span class="nv">buffer</span> <span class="nv">status</span><span class="p">)</span>
8687 <span class="p">(</span><span class="nv">call-process</span> <span class="s">&quot;notify-send&quot;</span> <span class="no">nil</span> <span class="no">nil</span> <span class="no">nil</span>
8688 <span class="s">&quot;-t&quot;</span> <span class="s">&quot;0&quot;</span>
8689 <span class="s">&quot;-i&quot;</span> <span class="s">&quot;emacs&quot;</span>
8690 <span class="s">&quot;Compilation finished in Emacs&quot;</span>
8691 <span class="nv">status</span><span class="p">))</span>
8692 </code></pre></div></content><category term="misc"></category><category term="emacs"></category></entry><entry><title>How glib-rs works, part 3: Boxed types</title><link href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-3.html" rel="alternate"></link><published>2017-09-08T19:28:11-05:00</published><updated>2017-09-08T19:28:11-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-09-08:/~federico/blog/how-glib-rs-works-part-3.html</id><summary type="html"><p>(<a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">First part</a> of the series, with index to all the articles)</p>
8693 <p>Now let's get on and see how glib-rs handles boxed types.</p>
8694 <h1>Boxed types?</h1>
8695 <p>Let's say you are given a sealed cardboard box with <em>something</em>, but you
8696 can't know what's inside. You can just pass it on to someone else …</p></summary><content type="html"><p>(<a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">First part</a> of the series, with index to all the articles)</p>
8697 <p>Now let's get on and see how glib-rs handles boxed types.</p>
8698 <h1>Boxed types?</h1>
8699 <p>Let's say you are given a sealed cardboard box with <em>something</em>, but you
8700 can't know what's inside. You can just pass it on to someone else, or
8701 burn it. And since computers are magic duplication machines, you may
8702 want to copy the box and its contents... and maybe some day you will
8703 get around to opening it.</p>
8704 <p>That's a boxed type. You get a pointer to <em>something</em>, who knows
8705 what's inside. You can just pass it on to someone else, burn it — I
8706 mean, free it — or since computers are magic, copy the pointer and
8707 whatever it points to.</p>
8708 <p>That's exactly the API for boxed types.</p>
8709 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="n">gpointer</span> <span class="p">(</span><span class="o">*</span><span class="n">GBoxedCopyFunc</span><span class="p">)</span> <span class="p">(</span><span class="n">gpointer</span> <span class="n">boxed</span><span class="p">);</span>
8710 <span class="k">typedef</span> <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">GBoxedFreeFunc</span><span class="p">)</span> <span class="p">(</span><span class="n">gpointer</span> <span class="n">boxed</span><span class="p">);</span>
8711
8712 <span class="n">GType</span> <span class="nf">g_boxed_type_register_static</span> <span class="p">(</span><span class="k">const</span> <span class="n">gchar</span> <span class="o">*</span><span class="n">name</span><span class="p">,</span>
8713 <span class="n">GBoxedCopyFunc</span> <span class="n">boxed_copy</span><span class="p">,</span>
8714 <span class="n">GBoxedFreeFunc</span> <span class="n">boxed_free</span><span class="p">);</span>
8715 </code></pre></div>
8716
8717 <h2>Simple copying, simple freeing</h2>
8718 <p>Imagine you have a color...</p>
8719 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
8720 <span class="n">guchar</span> <span class="n">r</span><span class="p">;</span>
8721 <span class="n">guchar</span> <span class="n">g</span><span class="p">;</span>
8722 <span class="n">guchar</span> <span class="n">b</span><span class="p">;</span>
8723 <span class="p">}</span> <span class="n">Color</span><span class="p">;</span>
8724 </code></pre></div>
8725
8726 <p>If you had a pointer to a Color, how would you copy it? Easy:</p>
8727 <div class="highlight"><pre><span></span><code><span class="n">Color</span> <span class="o">*</span><span class="nf">copy_color</span> <span class="p">(</span><span class="n">Color</span> <span class="o">*</span><span class="n">a</span><span class="p">)</span>
8728 <span class="p">{</span>
8729 <span class="n">Color</span> <span class="o">*</span><span class="n">b</span> <span class="o">=</span> <span class="n">g_new</span> <span class="p">(</span><span class="n">Color</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
8730 <span class="o">*</span><span class="n">b</span> <span class="o">=</span> <span class="o">*</span><span class="n">a</span><span class="p">;</span>
8731 <span class="k">return</span> <span class="n">b</span><span class="p">;</span>
8732 <span class="p">}</span>
8733 </code></pre></div>
8734
8735 <p>That is, allocate a new <code>Color</code>, and essentially <code>memcpy()</code> the
8736 contents.</p>
8737 <p>And to free it? A simple <code>g_free()</code> works — there are no internal
8738 things that need to be freed individually.</p>
8739 <h2>Complex copying, complex freeing</h2>
8740 <p>And if we had a color with a name?</p>
8741 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
8742 <span class="n">guchar</span> <span class="n">r</span><span class="p">;</span>
8743 <span class="n">guchar</span> <span class="n">g</span><span class="p">;</span>
8744 <span class="n">guchar</span> <span class="n">b</span><span class="p">;</span>
8745 <span class="kt">char</span> <span class="o">*</span><span class="n">name</span><span class="p">;</span>
8746 <span class="p">}</span> <span class="n">ColorWithName</span><span class="p">;</span>
8747 </code></pre></div>
8748
<p>We can't just do <code>*b = *a</code> here: that would copy only the pointer, so
both structs would end up sharing the same string. We actually need to copy
the string <code>name</code>. Okay:</p>
8751 <div class="highlight"><pre><span></span><code><span class="n">ColorWithName</span> <span class="o">*</span><span class="nf">copy_color_with_name</span> <span class="p">(</span><span class="n">ColorWithName</span> <span class="o">*</span><span class="n">a</span><span class="p">)</span>
8752 <span class="p">{</span>
8753 <span class="n">ColorWithName</span> <span class="o">*</span><span class="n">b</span> <span class="o">=</span> <span class="n">g_new</span> <span class="p">(</span><span class="n">ColorWithName</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
8754 <span class="n">b</span><span class="o">-&gt;</span><span class="n">r</span> <span class="o">=</span> <span class="n">a</span><span class="o">-&gt;</span><span class="n">r</span><span class="p">;</span>
8755 <span class="n">b</span><span class="o">-&gt;</span><span class="n">g</span> <span class="o">=</span> <span class="n">a</span><span class="o">-&gt;</span><span class="n">g</span><span class="p">;</span>
8756 <span class="n">b</span><span class="o">-&gt;</span><span class="n">b</span> <span class="o">=</span> <span class="n">a</span><span class="o">-&gt;</span><span class="n">b</span><span class="p">;</span>
8757 <span class="n">b</span><span class="o">-&gt;</span><span class="n">name</span> <span class="o">=</span> <span class="n">g_strdup</span> <span class="p">(</span><span class="n">a</span><span class="o">-&gt;</span><span class="n">name</span><span class="p">);</span>
8758 <span class="k">return</span> <span class="n">b</span><span class="p">;</span>
8759 <span class="p">}</span>
8760 </code></pre></div>
8761
8762 <p>The corresponding <code>free_color_with_name()</code> would <code>g_free(b-&gt;name)</code> and then
8763 <code>g_free(b)</code>, of course.</p>
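<p>Since the rest of this post is about Rust, here is a rough sketch of the same
deep copy / free pair written in Rust with raw pointers. It is only an
illustration, not glib-rs code: <code>new_color_with_name()</code> is a made-up helper,
and the memory comes from Rust's <code>Box</code> and <code>CString</code> instead of <code>g_new()</code> and
<code>g_strdup()</code>.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Rust mirror of the C `ColorWithName` struct; `name` is an owned C string.
#[repr(C)]
pub struct ColorWithName {
    pub r: u8,
    pub g: u8,
    pub b: u8,
    pub name: *mut c_char,
}

/// Hypothetical helper for this example: build a heap-allocated value.
pub fn new_color_with_name(r: u8, g: u8, b: u8, name: &str) -> *mut ColorWithName {
    let name = CString::new(name).expect("name must not contain NUL").into_raw();
    Box::into_raw(Box::new(ColorWithName { r, g, b, name }))
}

/// Deep copy: duplicate the struct *and* the string it points to.
pub unsafe fn copy_color_with_name(a: *const ColorWithName) -> *mut ColorWithName {
    unsafe {
        let a = &*a;
        let name = CStr::from_ptr(a.name).to_owned().into_raw();
        Box::into_raw(Box::new(ColorWithName { r: a.r, g: a.g, b: a.b, name }))
    }
}

/// Free the inner string first, then the struct itself.
pub unsafe fn free_color_with_name(b: *mut ColorWithName) {
    unsafe {
        let b = Box::from_raw(b);
        drop(CString::from_raw(b.name));
        // The struct's own allocation is released when `b` goes out of scope.
    }
}

fn main() {
    unsafe {
        let a = new_color_with_name(255, 0, 0, "red");
        let b = copy_color_with_name(a);
        // Two structs, two strings: freeing one does not affect the other.
        assert_ne!((*a).name, (*b).name);
        free_color_with_name(a);
        println!("copy is still valid: {:?}", CStr::from_ptr((*b).name));
        free_color_with_name(b);
    }
}
```

<p>Note the order in the free function: the string goes first, because once the
struct is gone we no longer have a pointer to it.</p>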
8764 <h1>Glib-rs and boxed types</h1>
8765 <p>Let's look at this by parts. First, a <a href="https://github.com/gtk-rs/glib/blob/f0a2aae96162fd628fb0e3eedd3d3e042c25e2c4/src/boxed.rs#L265"><code>BoxedMemoryManager</code>
8766 trait</a> to define the basic API to manage the
8767 memory of boxed types. This is what defines the <code>copy</code> and <code>free</code>
8768 functions, like above.</p>
8769 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">BoxedMemoryManager</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="o">&#39;</span><span class="nb">static</span> <span class="p">{</span><span class="w"></span>
8770 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">copy</span><span class="p">(</span><span class="n">ptr</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w"></span>
8771 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">free</span><span class="p">(</span><span class="n">ptr</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="p">);</span><span class="w"></span>
8772 <span class="p">}</span><span class="w"></span>
8773 </code></pre></div>
8774
8775 <p>Second, the <a href="https://github.com/gtk-rs/glib/blob/f0a2aae96162fd628fb0e3eedd3d3e042c25e2c4/src/boxed.rs#L273">actual representation</a> of a <code>Boxed</code> type:</p>
8776 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Boxed</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">&#39;</span><span class="nb">static</span><span class="p">,</span><span class="w"> </span><span class="n">MM</span>: <span class="nc">BoxedMemoryManager</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8777 <span class="w"> </span><span class="n">inner</span>: <span class="nc">AnyBox</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
8778 <span class="w"> </span><span class="n">_dummy</span>: <span class="nc">PhantomData</span><span class="o">&lt;</span><span class="n">MM</span><span class="o">&gt;</span><span class="p">,</span><span class="w"></span>
8779 <span class="p">}</span><span class="w"></span>
8780 </code></pre></div>
8781
8782 <p>This struct is generic over <code>T</code>, the actual type that we will be
8783 wrapping, and <code>MM</code>, something which must implement the
8784 <code>BoxedMemoryManager</code> trait.</p>
8785 <p>Inside, it stores <code>inner</code>, an <code>AnyBox</code>, which we will see shortly.
8786 The <code>_dummy: PhantomData&lt;MM&gt;</code> is a <a href="https://doc.rust-lang.org/std/marker/struct.PhantomData.html">Rust-ism</a> to indicate that although this
8787 struct doesn't actually store a memory manager, it acts as if it does
8788 — it does not concern us here.</p>
8789 <h2>The <em>actual</em> representation of boxed data</h2>
8790 <p>Let's look at that <a href="https://github.com/gtk-rs/glib/blob/f0a2aae96162fd628fb0e3eedd3d3e042c25e2c4/src/boxed.rs#L247"><code>AnyBox</code></a> that is stored inside a <code>Boxed</code>:</p>
8791 <div class="highlight"><pre><span></span><code><span class="k">enum</span> <span class="nc">AnyBox</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8792 <span class="w"> </span><span class="n">Native</span><span class="p">(</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
8793 <span class="w"> </span><span class="n">ForeignOwned</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="p">),</span><span class="w"></span>
8794 <span class="w"> </span><span class="n">ForeignBorrowed</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="p">),</span><span class="w"></span>
8795 <span class="p">}</span><span class="w"></span>
8796 </code></pre></div>
8797
8798 <p>We have three cases:</p>
8799 <ul>
8800 <li>
8801 <p><code>Native(Box&lt;T&gt;)</code> - this boxed value <code>T</code> comes from Rust itself, so we
8802 know everything about it!</p>
8803 </li>
8804 <li>
8805 <p><code>ForeignOwned(*mut T)</code> - this boxed value <code>T</code> came from the outside, but
8806 we own it now. We will have to free it when we are done with it.</p>
8807 </li>
8808 <li>
8809 <p><code>ForeignBorrowed(*mut T)</code> - this boxed value <code>T</code> came from the
8810 outside, but we are just borrowing it temporarily: we <strong>don't</strong> want to
8811 free it when we are done with it.</p>
8812 </li>
8813 </ul>
8814 <p>For example, if we look at the <a href="https://github.com/gtk-rs/glib/blob/f0a2aae96162fd628fb0e3eedd3d3e042c25e2c4/src/boxed.rs#L367">implementation of the <code>Drop</code>
8815 trait</a> for the <code>Boxed</code> struct, we will indeed see that it
8816 calls the <code>BoxedMemoryManager::free()</code> <strong>only</strong> if we have a
8817 <code>ForeignOwned</code> value:</p>
8818 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">&#39;</span><span class="nb">static</span><span class="p">,</span><span class="w"> </span><span class="n">MM</span>: <span class="nc">BoxedMemoryManager</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Boxed</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">MM</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8819 <span class="w"> </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8820 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8821 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">AnyBox</span>::<span class="n">ForeignOwned</span><span class="p">(</span><span class="n">ptr</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">inner</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8822 <span class="w"> </span><span class="n">MM</span>::<span class="n">free</span><span class="p">(</span><span class="n">ptr</span><span class="p">);</span><span class="w"></span>
8823 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8824 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8825 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8826 <span class="p">}</span><span class="w"></span>
8827 </code></pre></div>
8828
8829 <p>If we had a <code>Native(Box&lt;T&gt;)</code> value, it means it came from Rust itself,
8830 and Rust knows how to <code>Drop</code> its own <code>Box&lt;T&gt;</code> (i.e. a chunk of memory
8831 allocated in the heap).</p>
8832 <p>But for external resources, we must tell Rust how to manage them.
8833 Again: in the case where the Rust side owns the reference to the
8834 external boxed data, we have a <code>ForeignOwned</code> and <code>Drop</code> it by
8835 <code>free()</code>ing it; in the case where the Rust side is just borrowing the
8836 data temporarily, we have a <code>ForeignBorrowed</code> and don't touch it when
8837 we are done.</p>
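<p>To watch this happen, here is a small self-contained sketch. The trait, the
enum, the struct and the <code>Drop</code> impl are simplified copies of the ones quoted
above; everything else (<code>CountingManager</code>, <code>FREE_CALLS</code>, the <code>drop_*</code> functions)
is made up for this example and is not part of glib-rs. The memory manager
counts how many times <code>free()</code> gets called.</p>

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering};

// Simplified copies of the glib-rs definitions quoted above.
pub trait BoxedMemoryManager<T>: 'static {
    unsafe fn copy(ptr: *const T) -> *mut T;
    unsafe fn free(ptr: *mut T);
}

enum AnyBox<T> {
    Native(Box<T>),
    ForeignOwned(*mut T),
    ForeignBorrowed(*mut T),
}

pub struct Boxed<T: 'static, MM: BoxedMemoryManager<T>> {
    inner: AnyBox<T>,
    _dummy: PhantomData<MM>,
}

impl<T: 'static, MM: BoxedMemoryManager<T>> Drop for Boxed<T, MM> {
    fn drop(&mut self) {
        unsafe {
            if let AnyBox::ForeignOwned(ptr) = self.inner {
                MM::free(ptr);
            }
        }
    }
}

// Hypothetical memory manager for plain `u32`s that counts calls to `free`.
pub static FREE_CALLS: AtomicUsize = AtomicUsize::new(0);
pub struct CountingManager;

impl BoxedMemoryManager<u32> for CountingManager {
    unsafe fn copy(ptr: *const u32) -> *mut u32 {
        Box::into_raw(Box::new(unsafe { *ptr }))
    }
    unsafe fn free(ptr: *mut u32) {
        FREE_CALLS.fetch_add(1, Ordering::SeqCst);
        drop(unsafe { Box::from_raw(ptr) });
    }
}

type BoxedU32 = Boxed<u32, CountingManager>;

// Run `f` and report how many times `free` was called while it ran.
fn frees_during(f: impl FnOnce()) -> usize {
    let before = FREE_CALLS.load(Ordering::SeqCst);
    f();
    FREE_CALLS.load(Ordering::SeqCst) - before
}

pub fn drop_native() -> usize {
    frees_during(|| drop(BoxedU32 { inner: AnyBox::Native(Box::new(1)), _dummy: PhantomData }))
}

pub fn drop_foreign_owned() -> usize {
    let ptr = Box::into_raw(Box::new(2u32)); // pretend this pointer came from C
    frees_during(|| drop(BoxedU32 { inner: AnyBox::ForeignOwned(ptr), _dummy: PhantomData }))
}

pub fn drop_foreign_borrowed() -> usize {
    let mut value = 3u32; // owned by "someone else"
    let ptr: *mut u32 = &mut value;
    frees_during(|| drop(BoxedU32 { inner: AnyBox::ForeignBorrowed(ptr), _dummy: PhantomData }))
}

fn main() {
    println!("Native: {} call(s) to free()", drop_native());
    println!("ForeignOwned: {} call(s) to free()", drop_foreign_owned());
    println!("ForeignBorrowed: {} call(s) to free()", drop_foreign_borrowed());
}
```

<p>Only the <code>ForeignOwned</code> case goes through the memory manager. The <code>Native</code>
case still releases its memory, but through the normal <code>Drop</code> of <code>Box&lt;T&gt;</code>.</p>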
8838 <h2>Copying</h2>
<p>When do we have to copy a boxed value? For example, when we pass it
from Rust to Glib with full transfer of ownership, i.e. the
<a href="https://github.com/gtk-rs/glib/blob/f0a2aae96162fd628fb0e3eedd3d3e042c25e2c4/src/boxed.rs#L312"><code>to_glib_full()</code></a> pattern <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html#ptr-transfer-full">that we saw
before</a>. This is how that trait method is
implemented for <code>Boxed</code>:</p>
8844 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="o">&#39;</span><span class="nb">static</span><span class="p">,</span><span class="w"> </span><span class="n">MM</span>: <span class="nc">BoxedMemoryManager</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="n">ToGlibPtr</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Boxed</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">MM</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8845 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_glib_full</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8846 <span class="w"> </span><span class="k">use</span><span class="w"> </span><span class="bp">self</span>::<span class="n">AnyBox</span>::<span class="o">*</span><span class="p">;</span><span class="w"></span>
8847 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">ptr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">inner</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8848 <span class="w"> </span><span class="n">Native</span><span class="p">(</span><span class="k">ref</span><span class="w"> </span><span class="n">b</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">&amp;**</span><span class="n">b</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w"></span>
8849 <span class="w"> </span><span class="n">ForeignOwned</span><span class="p">(</span><span class="n">p</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">ForeignBorrowed</span><span class="p">(</span><span class="n">p</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w"></span>
8850 <span class="w"> </span><span class="p">};</span><span class="w"></span>
8851 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">MM</span>::<span class="n">copy</span><span class="p">(</span><span class="n">ptr</span><span class="p">)</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
8852 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8853 <span class="p">}</span><span class="w"></span>
8854 </code></pre></div>
8855
<p>See the <code>MM::copy(ptr)</code> in the last line? That's where the copy
happens. The lines above just get the appropriate pointer to the
data from the <code>AnyBox</code> and cast it.</p>
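<p>Here is a sketch of what "full transfer" means for the caller. To keep it
short, the body of <code>to_glib_full()</code> is reproduced as an inherent method called
<code>to_full()</code> instead of going through the real <code>ToGlibPtr</code> trait; <code>native()</code>,
<code>as_ptr()</code>, <code>Color</code> and <code>ColorManager</code> are likewise made up for this example.</p>

```rust
use std::marker::PhantomData;

pub trait BoxedMemoryManager<T>: 'static {
    unsafe fn copy(ptr: *const T) -> *mut T;
    unsafe fn free(ptr: *mut T);
}

#[allow(dead_code)] // this sketch only ever builds `Native` values
enum AnyBox<T> {
    Native(Box<T>),
    ForeignOwned(*mut T),
    ForeignBorrowed(*mut T),
}

// `Drop` is left out here; it only matters for `ForeignOwned` values.
pub struct Boxed<T: 'static, MM: BoxedMemoryManager<T>> {
    inner: AnyBox<T>,
    _dummy: PhantomData<MM>,
}

impl<T: 'static, MM: BoxedMemoryManager<T>> Boxed<T, MM> {
    pub fn native(value: T) -> Self {
        Boxed { inner: AnyBox::Native(Box::new(value)), _dummy: PhantomData }
    }

    // Pointer to the data we hold, whichever variant it lives in.
    pub fn as_ptr(&self) -> *const T {
        match self.inner {
            AnyBox::Native(ref b) => &**b as *const T,
            AnyBox::ForeignOwned(p) | AnyBox::ForeignBorrowed(p) => p as *const T,
        }
    }

    // Same idea as `to_glib_full()`: hand out a fresh copy that the caller owns.
    pub fn to_full(&self) -> *const T {
        unsafe { MM::copy(self.as_ptr()) }
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Color {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

// Hypothetical memory manager that allocates copies with `Box`.
pub struct ColorManager;

impl BoxedMemoryManager<Color> for ColorManager {
    unsafe fn copy(ptr: *const Color) -> *mut Color {
        Box::into_raw(Box::new(unsafe { *ptr }))
    }
    unsafe fn free(ptr: *mut Color) {
        drop(unsafe { Box::from_raw(ptr) });
    }
}

fn main() {
    let ours: Boxed<Color, ColorManager> = Boxed::native(Color { r: 255, g: 128, b: 0 });
    let theirs = ours.to_full();

    // A different allocation with the same contents.
    assert_ne!(theirs, ours.as_ptr());
    assert_eq!(unsafe { *theirs }, Color { r: 255, g: 128, b: 0 });

    // Whoever received the copy is responsible for freeing it.
    unsafe { ColorManager::free(theirs as *mut Color) };
}
```

<p>The Rust side keeps its own value intact, and the receiver of the copy is the
one who has to free it.</p>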
<p>There is extra boilerplate in <code>boxed.rs</code> which you can look at. It's
mostly a bunch of trait implementations to copy the boxed data at the
appropriate times (e.g. the <code>FromGlibPtrNone</code> trait), plus an
implementation of the <code>Deref</code> trait to get to the contents of a
<code>Boxed</code> / <code>AnyBox</code> easily. The trait implementations are there just to
make it as convenient as possible to handle <code>Boxed</code> types.</p>
8865 <h2>Who implements BoxedMemoryManager?</h2>
8866 <p>Up to now, we have seen things like the implementation of <code>Drop</code> for
8867 <code>Boxed</code>, which uses <code>BoxedMemoryManager::free()</code>, and the
8868 implementation of <code>ToGlibPtr</code> which uses <code>::copy()</code>.</p>
8869 <p>But those are just the trait's "abstract" methods, so to speak. What
8870 actually implements them?</p>
8871 <p>Glib-rs has a general-purpose macro to wrap Glib types. It can wrap
8872 boxed types, shared pointer types, and GObjects. For now we will just
8873 look at boxed types.</p>
<p>This macro, <a href="http://gtk-rs.org/docs/glib/macro.glib_wrapper.html#boxed"><code>glib_wrapper!()</code></a>, can be used in
different ways. You can use it to automatically write the boilerplate
for a boxed type like this:</p>
8877 <div class="highlight"><pre><span></span><code><span class="n">glib_wrapper</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8878 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Color</span><span class="p">(</span><span class="n">Boxed</span><span class="o">&lt;</span><span class="n">ffi</span>::<span class="n">Color</span><span class="o">&gt;</span><span class="p">);</span><span class="w"></span>
8879
8880 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="k">fn</span> <span class="p">{</span><span class="w"></span>
8881 <span class="w"> </span><span class="n">copy</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">|</span><span class="n">ptr</span><span class="o">|</span><span class="w"> </span><span class="n">ffi</span>::<span class="n">color_copy</span><span class="p">(</span><span class="n">mut_override</span><span class="p">(</span><span class="n">ptr</span><span class="p">)),</span><span class="w"></span>
8882 <span class="w"> </span><span class="n">free</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">|</span><span class="n">ptr</span><span class="o">|</span><span class="w"> </span><span class="n">ffi</span>::<span class="n">color_free</span><span class="p">(</span><span class="n">ptr</span><span class="p">),</span><span class="w"></span>
8883 <span class="w"> </span><span class="n">get_type</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">ffi</span>::<span class="n">color_get_type</span><span class="p">(),</span><span class="w"></span>
8884 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8885 <span class="p">}</span><span class="w"></span>
8886 </code></pre></div>
8887
8888 <p>This expands to an internal
8889 <a href="https://github.com/gtk-rs/glib/blob/f0a2aae96162fd628fb0e3eedd3d3e042c25e2c4/src/boxed.rs#L15"><code>glib_boxed_wrapper!()</code></a> macro that does a few
8890 things. We will only look at particularly interesting bits.</p>
<p>First, the macro creates a newtype: a tuple struct around a <code>Boxed</code> that is
parameterized by 1) the actual data type you want to box, and 2) a memory
manager. In the example above, the newtype would be called <code>Color</code>, and it
would wrap an <code>ffi::Color</code> (say, a C struct).</p>
8895 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="cp">$name</span><span class="p">(</span><span class="n">Boxed</span><span class="o">&lt;</span><span class="cp">$ffi_name</span><span class="p">,</span><span class="w"> </span><span class="n">MemoryManager</span><span class="o">&gt;</span><span class="p">);</span><span class="w"></span>
8896 </code></pre></div>
8897
8898 <p>Aha! And that <code>MemoryManager</code>? The macro defines it as a zero-sized
8899 type:</p>
8900 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">MemoryManager</span><span class="p">;</span><span class="w"></span>
8901 </code></pre></div>
8902
8903 <p>Then it <a href="https://github.com/gtk-rs/glib/blob/f0a2aae96162fd628fb0e3eedd3d3e042c25e2c4/src/boxed.rs#L59">implements</a> the <code>BoxedMemoryManager</code> trait for that
8904 <code>MemoryManager</code> struct:</p>
8905 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">BoxedMemoryManager</span><span class="o">&lt;</span><span class="cp">$ffi_name</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MemoryManager</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8906 <span class="w"> </span><span class="cp">#[inline]</span><span class="w"></span>
8907 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">copy</span><span class="p">(</span><span class="cp">$copy_arg</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8908 <span class="w"> </span><span class="cp">$copy_expr</span><span class="w"></span>
8909 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8910
8911 <span class="w"> </span><span class="cp">#[inline]</span><span class="w"></span>
8912 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">free</span><span class="p">(</span><span class="cp">$free_arg</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8913 <span class="w"> </span><span class="cp">$free_expr</span><span class="w"></span>
8914 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8915 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8916 </code></pre></div>
8917
8918 <p>There! <em>This</em> is where the <code>copy/free</code> methods are implemented, based
8919 on the bits of code with which you invoked the macro. In the call to
8920 <code>glib_wrapper!()</code> we had this:</p>
8921 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="n">copy</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">|</span><span class="n">ptr</span><span class="o">|</span><span class="w"> </span><span class="n">ffi</span>::<span class="n">color_copy</span><span class="p">(</span><span class="n">mut_override</span><span class="p">(</span><span class="n">ptr</span><span class="p">)),</span><span class="w"></span>
8922 <span class="w"> </span><span class="n">free</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">|</span><span class="n">ptr</span><span class="o">|</span><span class="w"> </span><span class="n">ffi</span>::<span class="n">color_free</span><span class="p">(</span><span class="n">ptr</span><span class="p">),</span><span class="w"></span>
8923 </code></pre></div>
8924
<p>In the impl above, the <code>$copy_expr</code> will expand to
<code>ffi::color_copy(mut_override(ptr))</code> and <code>$free_expr</code> will expand to
<code>ffi::color_free(ptr)</code>, which defines our implementation of a memory
manager for our <code>Color</code> boxed type.</p>
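<p>If you want to play with this kind of expansion, here is a cut-down sketch.
The <code>memory_manager!</code> macro below is hypothetical: it is not the real
<code>glib_boxed_wrapper!()</code>, and it only generates the memory manager, not the
wrapper type or the other trait impls. The <code>ffi</code> module is a stand-in written
in Rust so that the example runs without a C library, and a plain pointer cast
takes the place of <code>mut_override()</code>.</p>

```rust
pub trait BoxedMemoryManager<T>: 'static {
    unsafe fn copy(ptr: *const T) -> *mut T;
    unsafe fn free(ptr: *mut T);
}

// Hypothetical, cut-down stand-in for `glib_boxed_wrapper!()`.
macro_rules! memory_manager {
    (
        $manager:ident for $ffi_name:ty {
            copy => |$copy_arg:ident| $copy_expr:expr,
            free => |$free_arg:ident| $free_expr:expr,
        }
    ) => {
        pub struct $manager;

        impl BoxedMemoryManager<$ffi_name> for $manager {
            #[inline]
            unsafe fn copy($copy_arg: *const $ffi_name) -> *mut $ffi_name {
                unsafe { $copy_expr }
            }

            #[inline]
            unsafe fn free($free_arg: *mut $ffi_name) {
                unsafe { $free_expr }
            }
        }
    };
}

// Stand-in for C bindings; a real `ffi` module would declare `extern "C"` functions.
mod ffi {
    #[repr(C)]
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub struct Color {
        pub r: u8,
        pub g: u8,
        pub b: u8,
    }

    pub unsafe fn color_copy(ptr: *mut Color) -> *mut Color {
        Box::into_raw(Box::new(unsafe { *ptr }))
    }

    pub unsafe fn color_free(ptr: *mut Color) {
        drop(unsafe { Box::from_raw(ptr) });
    }
}

// The closure-like syntax supplies both the argument name and the body.
memory_manager! {
    ColorMemoryManager for ffi::Color {
        copy => |ptr| ffi::color_copy(ptr as *mut ffi::Color),
        free => |ptr| ffi::color_free(ptr),
    }
}

fn main() {
    let original = ffi::Color { r: 1, g: 2, b: 3 };
    unsafe {
        let copy = ColorMemoryManager::copy(&original);
        assert_eq!(*copy, original);
        ColorMemoryManager::free(copy);
    }
}
```

<p>The <code>|ptr|</code> part is not a real closure. The macro captures the identifier and
uses it as the name of the function argument, so the expression that follows
can refer to it.</p>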
8929 <h2>Zero-sized what?</h2>
8930 <p>Within the macro's definition, let's look again at the definitions of
8931 our boxed type and the memory manager object that actually implements
8932 the <code>BoxedMemoryManager</code> trait. Here is what the macro would expand
8933 to with our <code>Color</code> example:</p>
8934 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Color</span><span class="p">(</span><span class="n">Boxed</span><span class="o">&lt;</span><span class="n">ffi</span>::<span class="n">Color</span><span class="p">,</span><span class="w"> </span><span class="n">MemoryManager</span><span class="o">&gt;</span><span class="p">);</span><span class="w"></span>
8935
8936 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">MemoryManager</span><span class="p">;</span><span class="w"></span>
8937
8938 <span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">BoxedMemoryManager</span><span class="o">&lt;</span><span class="n">ffi</span>::<span class="n">Color</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MemoryManager</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
8939 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">copy</span><span class="p">(</span><span class="o">..</span><span class="p">.)</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ffi</span>::<span class="n">Color</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
8940 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">free</span><span class="p">(</span><span class="o">..</span><span class="p">.)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
8941 <span class="w"> </span><span class="p">}</span><span class="w"></span>
8942 </code></pre></div>
8943
8944 <p>Here, <code>MemoryManager</code> is a zero-sized type. This means <strong>it doesn't
8945 take up any space</strong> in the <code>Color</code> tuple! When a <code>Color</code> is allocated
8946 in the heap, it is really as if it contained an <code>ffi::Color</code> (the
8947 C struct we are wrapping) <em>and nothing else</em>.</p>
8948 <p>All the knowledge about how to copy/free <code>ffi::Color</code> <strong>lives only in
8949 the compiler</strong> thanks to the trait implementation. When the compiler
8950 expands all the macros and monomorphizes all the generic functions,
8951 the calls to <code>ffi::color_copy()</code> and <code>ffi::color_free()</code> <strong>will be
8952 inlined at the appropriate spots</strong>. There is no need to have
8953 auxiliary structures taking up space in the heap, just to store
8954 function pointers to the copy/free functions, or anything like that.</p>
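<p>You can check the "no extra space" part with <code>std::mem::size_of</code>. This sketch
uses simplified versions of the types above, without the trait bounds.
<code>BoxedWithFnPointers</code> is a made-up alternative design, shown only for
comparison, in which every value carries its own copy/free function pointers.</p>

```rust
use std::marker::PhantomData;
use std::mem::size_of;

#[allow(dead_code)]
enum AnyBox<T> {
    Native(Box<T>),
    ForeignOwned(*mut T),
    ForeignBorrowed(*mut T),
}

// Zero-sized: no fields, so nothing to store.
pub struct MemoryManager;

// The glib-rs approach: the memory manager is only a type parameter.
#[allow(dead_code)]
pub struct Boxed<T, MM> {
    inner: AnyBox<T>,
    _dummy: PhantomData<MM>,
}

// Hypothetical alternative: each value carries its own function pointers.
#[allow(dead_code)]
pub struct BoxedWithFnPointers<T> {
    inner: AnyBox<T>,
    copy: unsafe fn(*const T) -> *mut T,
    free: unsafe fn(*mut T),
}

fn main() {
    println!("MemoryManager: {} bytes", size_of::<MemoryManager>());
    println!("AnyBox<u32>: {} bytes", size_of::<AnyBox<u32>>());
    println!("Boxed<u32, MemoryManager>: {} bytes", size_of::<Boxed<u32, MemoryManager>>());
    println!("BoxedWithFnPointers<u32>: {} bytes", size_of::<BoxedWithFnPointers<u32>>());
}
```

<p>The exact numbers depend on your platform, but <code>Boxed</code> should come out the
same size as the <code>AnyBox</code> inside it, while the function-pointer version is
bigger.</p>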
8955 <h1>Next up</h1>
8956 <p>You may have seen that our example call to <code>glib_wrapper!()</code> also
8957 passed in a <code>ffi::color_get_type()</code> function. We haven't talked about
8958 how glib-rs wraps Glib's <code>GType</code>, <code>GValue</code>, and all of that. We are
8959 getting closer and closer to being able to wrap <code>GObject</code>.</p>
8960 <p>Stay tuned!</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Initial posts about librsvg's C to Rust conversion</title><link href="https://people.gnome.org/~federico/blog/librsvg-posts.html" rel="alternate"></link><published>2017-09-07T15:50:46-05:00</published><updated>2017-09-07T15:50:46-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-09-07:/~federico/blog/librsvg-posts.html</id><summary type="html"><p>The initial articles about librsvg's conversion to Rust are in my <a href="https://people.gnome.org/~federico/news.html">old
8961 blog</a>, so they may be a bit hard to find from this new blog.
8962 Here is a list of those posts, just so they are easier to find:</p>
8963 <ul>
8964 <li><a href="https://people.gnome.org/~federico/news-2016-10.html#25">Librsvg gets Rusty</a></li>
8965 <li><a href="https://people.gnome.org/~federico/news-2016-10.html#28">Porting a few C functions to Rust …</a></li></ul></summary><content type="html"><p>The initial articles about librsvg's conversion to Rust are in my <a href="https://people.gnome.org/~federico/news.html">old
8966 blog</a>, so they may be a bit hard to find from this new blog.
8967 Here is a list of those posts, just so they are easier to find:</p>
8968 <ul>
8969 <li><a href="https://people.gnome.org/~federico/news-2016-10.html#25">Librsvg gets Rusty</a></li>
8970 <li><a href="https://people.gnome.org/~federico/news-2016-10.html#28">Porting a few C functions to Rust</a></li>
8971 <li><a href="https://people.gnome.org/~federico/news-2016-11.html#01">Bézier curves, markers, and SVG's concept of directionality</a></li>
8972 <li><a href="https://people.gnome.org/~federico/news-2016-11.html#03">Refactoring C to make Rustification easier</a></li>
8973 <li><a href="https://people.gnome.org/~federico/news-2016-11.html#14">Exposing Rust objects to C code</a></li>
8974 <li><a href="https://people.gnome.org/~federico/news-2016-11.html#16">Debugging Rust code inside a C library</a></li>
8975 <li><a href="https://people.gnome.org/~federico/news-2017-01.html#11">Reproducible font rendering for librsvg's tests</a></li>
8976 <li><a href="https://people.gnome.org/~federico/news-2017-02.html#03">Algebraic data types in Rust, and basic parsing</a></li>
8977 <li><a href="https://people.gnome.org/~federico/news-2017-02.html#17">How librsvg exports reference-counted objects from Rust to C</a></li>
8978 <li><a href="https://people.gnome.org/~federico/news-2017-02.html#24">Griping about parsers and shitty specifications</a></li>
8979 <li><a href="https://people.gnome.org/~federico/news-2017-02.html#28">Porting librsvg's tree of nodes to Rust</a></li>
8980 <li><a href="https://people.gnome.org/~federico/news-2017-04.html#28">gboolean is not Rust bool</a></li>
8981 </ul>
8982 <p>Within this new blog, you can look for articles with the <a href="https://people.gnome.org/~federico/blog/tag/librsvg.html">librsvg tag</a>.</p></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category></entry><entry><title>The Magic of GObject Introspection</title><link href="https://people.gnome.org/~federico/blog/magic-of-gobject-introspection.html" rel="alternate"></link><published>2017-09-07T11:46:03-05:00</published><updated>2017-09-07T11:46:03-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-09-07:/~federico/blog/magic-of-gobject-introspection.html</id><summary type="html"><p>Before continuing with the <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">glib-rs architecture</a>, let's take
8983 a detour and look at <a href="https://wiki.gnome.org/Projects/GObjectIntrospection">GObject Introspection</a>. Although it can
8984 seem like an obscure part of the GNOME platform, it is an absolutely
8985 vital part of it: it is what lets people write GNOME applications in
8986 any language.</p>
8987 <p>Let's start with a …</p></summary><content type="html"><p>Before continuing with the <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">glib-rs architecture</a>, let's take
8988 a detour and look at <a href="https://wiki.gnome.org/Projects/GObjectIntrospection">GObject Introspection</a>. Although it can
8989 seem like an obscure part of the GNOME platform, it is an absolutely
8990 vital part of it: it is what lets people write GNOME applications in
8991 any language.</p>
8992 <p>Let's start with a bit of history.</p>
8993 <h1>Brief history of language bindings in GNOME</h1>
8994 <p>When we started GNOME in 1997, we didn't want to write <em>all</em> of it in
8995 C. We had some inspiration from elsewhere.</p>
8996 <h2>Prehistory: GIMP and the Procedural Database</h2>
8997 <p>There was already good precedent for software written in a combination of
8998 programming languages. Emacs, the flagship text editor of the
8999 GNU project, was written with a relatively small core in C, and the
9000 majority of the program in Emacs Lisp.</p>
9001 <p>In similar fashion, we were very influenced by the design of the GIMP,
9002 which was very innovative at that time. The GIMP has a large core
9003 written in C. However, it supports plug-ins or <em>scripts</em> written in a
9004 variety of languages. Initially the only scripting language available
9005 for the GIMP was Scheme.</p>
9006 <p>The GIMP's plug-ins and scripts run as separate processes, so
9007 they don't have immediate access to the data of the image being
9008 edited, or to the core functions of the program like "paint with a
9009 brush at this location". To let plug-ins and scripts access these
9010 data and these functions, the GIMP has what it calls a
9011 Procedural Database (PDB). This is a
9012 list of functions that the core program or plug-ins wish to export.
9013 For example, there are functions like <code>gimp-scale-image</code> and
9014 <code>gimp-move-layer</code>. Once these functions are registered in the
9015 PDB, any part of the program or plug-ins can call them. Scripts are
9016 often written to automate common tasks — for example, when one wants
9017 to adjust the contrast of photos and scale them in bulk. Scripts can
9018 call functions in the PDB easily, irrespective of the programming
9019 language they are written in.</p>
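<p>To make the idea concrete, a procedural database can be pictured as a table that maps names to callable functions. The following is only a minimal sketch in C, not the GIMP's actual implementation: <code>pdb_register</code>, <code>pdb_call</code>, <code>scale_image_by_two</code> and the fixed-size table are all hypothetical names and simplifications made up for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy procedural database: a fixed-size table of named procedures.
 * Every name here is hypothetical; this is not the GIMP's real PDB API. */

typedef int (*PdbProc) (int arg);

typedef struct {
    const char *name;
    PdbProc     proc;
} PdbEntry;

#define PDB_MAX_ENTRIES 16

static PdbEntry pdb_entries[PDB_MAX_ENTRIES];
static size_t   pdb_n_entries = 0;

/* Register a procedure under a name.
 * Returns 0 on success, or -1 if the table is full. */
int
pdb_register (const char *name, PdbProc proc)
{
    assert (name != NULL && proc != NULL);

    if (pdb_n_entries == PDB_MAX_ENTRIES)
        return -1;

    pdb_entries[pdb_n_entries].name = name;
    pdb_entries[pdb_n_entries].proc = proc;
    pdb_n_entries++;
    return 0;
}

/* Look a procedure up by name and call it.
 * Returns 0 and stores the result in *result, or -1 if the name is unknown. */
int
pdb_call (const char *name, int arg, int *result)
{
    assert (name != NULL && result != NULL);

    for (size_t i = 0; i < pdb_n_entries; i++) {
        if (strcmp (pdb_entries[i].name, name) == 0) {
            *result = pdb_entries[i].proc (arg);
            return 0;
        }
    }
    return -1;
}

/* Example procedure: pretend to scale an image width by two. */
int
scale_image_by_two (int width)
{
    return width * 2;
}
```

<p>A caller only needs the procedure's name, which is why a script in any language can use it: register <code>scale_image_by_two</code> under some name, then pass that name and an argument to <code>pdb_call()</code>.</p>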
9020 <p>We wanted to write GNOME's core libraries in C, and write a similar
9021 Procedural Database to allow those libraries to be called from any
9022 programming language. Eventually it turned out that a PDB was not
9023 necessary, and there were better ways to go about enabling different
9024 programming languages.</p>
9025 <h2>Enabling sane memory management</h2>
9026 <p>GTK+ started out with a very simple scheme for memory management: a
9027 container owned its child widgets, and so on recursively. When you
9028 freed a container, it would be responsible for freeing its children.</p>
9029 <p>However, consider what happens when a widget needs to hold a reference
9030 to another widget that is not one of its children. For example, a
9031 GtkLabel with an underlined mnemonic ("_N_ame:") needs to have a
9032 reference to the GtkEntry that should be focused when you press
9033 Alt-N. In the very earliest versions of GTK+, how to do this was
9034 undefined: C programmers were already used to having shared pointers
9035 everywhere, and they were used to being responsible for managing their
9036 memory.</p>
9037 <p>Of course, this was prone to bugs. If you have something like</p>
9038 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
9039 <span class="n">GtkWidget</span> <span class="n">parent</span><span class="p">;</span>
9040
9041 <span class="kt">char</span> <span class="o">*</span><span class="n">label_string</span><span class="p">;</span>
9042 <span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget_to_focus</span><span class="p">;</span>
9043 <span class="p">}</span> <span class="n">GtkLabel</span><span class="p">;</span>
9044 </code></pre></div>
9045
9046 <p>then if you are writing the destructor, you may simply want to</p>
9047 <div class="highlight"><pre><span></span><code><span class="k">static</span> <span class="kt">void</span>
9048 <span class="nf">gtk_label_free</span> <span class="p">(</span><span class="n">GtkLabel</span> <span class="o">*</span><span class="n">label</span><span class="p">)</span>
9049 <span class="p">{</span>
9050 <span class="n">g_free</span> <span class="p">(</span><span class="n">label</span><span class="o">-&gt;</span><span class="n">label_string</span><span class="p">);</span>
9051 <span class="n">gtk_widget_free</span> <span class="p">(</span><span class="n">label</span><span class="o">-&gt;</span><span class="n">widget_to_focus</span><span class="p">);</span> <span class="cm">/* oops, we don&#39;t own this */</span>
9052
9053 <span class="n">free_parent_instance</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">label</span><span class="o">-&gt;</span><span class="n">parent</span><span class="p">);</span>
9054 <span class="p">}</span>
9055 </code></pre></div>
9056
9057 <p>Say you have a GtkBox with the label and its associated GtkEntry.
9058 Then, freeing the GtkBox would recursively free the label with that
9059 <code>gtk_label_free()</code>, and then the entry with its own function. But by
9060 the time the entry gets freed, the line
9061 <code>gtk_widget_free (widget_to_focus)</code> has already freed the entry, and
9062 we get a double-free bug!</p>
9063 <p>Madness!</p>
9064 <p>That is, we had no idea what we were doing. Or rather, our
9065 understanding of widgets had not evolved to the point of acknowledging
9066 that a widget tree is not simply a tree, but rather a
9067 directed graph of container-child relationships, plus
9068 random-widget-to-random-widget relationships. And of course, other
9069 parts of the program <em>which are not even widget implementations</em> may
9070 need to keep references to widgets and free them or not as
9071 appropriate.</p>
9072 <p>I think Marius Vollmer was the first person to start formalizing
9073 this. He came from the world of GNU Guile, a Scheme interpreter, and
9074 so he already knew how garbage collection and seas of shared
9075 references ought to work.</p>
9076 <p>Marius implemented reference-counting for GTK+ — that's where
9077 <code>gtk_object_ref()</code> and <code>gtk_object_unref()</code> come from; they eventually
9078 got moved to the base <code>GObject</code> class, so we now have <code>g_object_ref()</code>
9079 and <code>g_object_unref()</code> and a host of functions to have weak
9080 references, notification of destruction, and all the things required
9081 to keep garbage collectors happy.</p>
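<p>The core of reference counting fits in a few lines. What follows is a simplified model and not GObject's implementation: <code>ToyObject</code>, <code>toy_object_ref()</code> and <code>toy_object_unref()</code> are hypothetical names, and the <code>finalized_flag</code> field exists only so that the moment of destruction can be observed.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object; a simplified model, not GObject itself. */
typedef struct {
    int  ref_count;
    int *finalized_flag;  /* set to 1 when the object is really freed */
} ToyObject;

/* Create an object; the caller owns the first reference. */
ToyObject *
toy_object_new (int *finalized_flag)
{
    ToyObject *obj = malloc (sizeof *obj);

    if (obj == NULL)
        abort ();

    obj->ref_count = 1;
    obj->finalized_flag = finalized_flag;
    return obj;
}

/* Take an additional reference, for example from a label to its entry. */
ToyObject *
toy_object_ref (ToyObject *obj)
{
    assert (obj != NULL && obj->ref_count > 0);
    obj->ref_count++;
    return obj;
}

/* Drop one reference; the object is freed only when the last one goes away. */
void
toy_object_unref (ToyObject *obj)
{
    assert (obj != NULL && obj->ref_count > 0);

    if (--obj->ref_count == 0) {
        if (obj->finalized_flag != NULL)
            *obj->finalized_flag = 1;
        free (obj);
    }
}
```

<p>With this scheme, the label from the earlier example would call <code>toy_object_ref()</code> on the entry when it stores the pointer and <code>toy_object_unref()</code> in its destructor. The box does the same for its children, so the entry is freed exactly once, whichever owner lets go last.</p>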
9082 <h2>The first language bindings</h2>
9083 <p>The very first language bindings were written by hand. The GTK+ API
9084 was small, and it seemed feasible to take</p>
9085 <div class="highlight"><pre><span></span><code><span class="kt">void</span> <span class="nf">gtk_widget_show</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">);</span>
9086 <span class="kt">void</span> <span class="nf">gtk_widget_hide</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">);</span>
9087
9088 <span class="kt">void</span> <span class="nf">gtk_container_add</span> <span class="p">(</span><span class="n">GtkContainer</span> <span class="o">*</span><span class="n">container</span><span class="p">,</span> <span class="n">GtkWidget</span> <span class="o">*</span><span class="n">child</span><span class="p">);</span>
9089 <span class="kt">void</span> <span class="nf">gtk_container_remove</span> <span class="p">(</span><span class="n">GtkContainer</span> <span class="o">*</span><span class="n">container</span><span class="p">,</span> <span class="n">GtkWidget</span> <span class="o">*</span><span class="n">child</span><span class="p">);</span>
9090 </code></pre></div>
9091
9092 <p>and just wrap those functions in various languages, by hand, on an
9093 as-needed basis.</p>
9094 <p>Of course, there is a lot of duplication when doing things that way.
9095 As the C API grows, one needs to do more and more manual work to
9096 keep up with it.</p>
9097 <p>Also, C structs with public fields are problematic. If we had</p>
9098 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
9099 <span class="n">guchar</span> <span class="n">r</span><span class="p">;</span>
9100 <span class="n">guchar</span> <span class="n">g</span><span class="p">;</span>
9101 <span class="n">guchar</span> <span class="n">b</span><span class="p">;</span>
9102 <span class="p">}</span> <span class="n">GdkColor</span><span class="p">;</span>
9103 </code></pre></div>
9104
9105 <p>and we <em>expect</em> program code to fill in a <code>GdkColor</code> by hand and
9106 pass it to a drawing function like</p>
9107 <div class="highlight"><pre><span></span><code><span class="kt">void</span> <span class="nf">gdk_set_foreground_color</span> <span class="p">(</span><span class="n">GdkDrawingContext</span> <span class="o">*</span><span class="n">gc</span><span class="p">,</span> <span class="n">GdkColor</span> <span class="o">*</span><span class="n">color</span><span class="p">);</span>
9108 </code></pre></div>
9109
9110 <p>then it is no problem to do that in C:</p>
9111 <div class="highlight"><pre><span></span><code><span class="n">GdkColor</span> <span class="n">magenta</span> <span class="o">=</span> <span class="p">{</span> <span class="mi">255</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">255</span> <span class="p">};</span>
9112
9113 <span class="n">gdk_set_foreground_color</span> <span class="p">(</span><span class="n">gc</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">magenta</span><span class="p">);</span>
9114 </code></pre></div>
9115
9116 <p>But to do that in a high-level language? You don't have access to C
9117 struct fields! And back then, libffi wasn't generally available.</p>
9118 <p>Authors of language bindings had to write some glue code, in C, by
9119 hand, to let people access a C struct and then pass it on to GTK+.
9120 For example, for Python, they would need to write something like</p>
9121 <div class="highlight"><pre><span></span><code><span class="n">PyObject</span> <span class="o">*</span>
9122 <span class="nf">make_wrapped_gdk_color</span> <span class="p">(</span><span class="n">PyObject</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="n">PyObject</span> <span class="o">*</span><span class="n">kwargs</span><span class="p">)</span>
9123 <span class="p">{</span>
9124 <span class="n">GdkColor</span> <span class="o">*</span><span class="n">g_color</span><span class="p">;</span>
9125 <span class="n">PyObject</span> <span class="o">*</span><span class="n">py_color</span><span class="p">;</span>
9126
9127 <span class="n">g_color</span> <span class="o">=</span> <span class="n">g_new</span> <span class="p">(</span><span class="n">GdkColor</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
9128 <span class="cm">/* ... fill in g_color-&gt;r, g, b from the Python args */</span>
9129
9130 <span class="n">py_color</span> <span class="o">=</span> <span class="n">wrap_g_color</span> <span class="p">(</span><span class="n">g_color</span><span class="p">);</span>
9131 <span class="k">return</span> <span class="n">py_color</span><span class="p">;</span>
9132 <span class="p">}</span>
9133 </code></pre></div>
9134
9135 <p>Writing that by hand is an incredible amount of drudgery.</p>
9136 <p>What language bindings needed was a <em>description</em> of the API in a
9137 machine-readable format, so that the glue code could be written by a
9138 code generator.</p>
9139 <h2>The first API descriptions</h2>
9140 <p>I don't remember if it was the GNU Guile people, or the PyGTK people,
9141 who started to write descriptions of the GNOME API by hand. For ease
9142 of parsing, it was done in a Scheme-like dialect. A description may
9143 look like</p>
9144 <div class="highlight"><pre><span></span><code><span class="p">(</span><span class="nf">class</span> <span class="nv">GtkWidget</span>
9145 <span class="c1">;;; void gtk_widget_show (GtkWidget *widget);</span>
9146 <span class="p">(</span><span class="nf">method</span> <span class="nv">show</span>
9147 <span class="p">(</span><span class="nf">args</span> <span class="nv">nil</span><span class="p">)</span>
9148 <span class="p">(</span><span class="nf">retval</span> <span class="nv">nil</span><span class="p">))</span>
9149
9150 <span class="c1">;;; void gtk_widget_hide (GtkWidget *widget);</span>
9151 <span class="p">(</span><span class="nf">method</span> <span class="nv">hide</span>
9152 <span class="p">(</span><span class="nf">args</span> <span class="nv">nil</span><span class="p">)</span>
9153 <span class="p">(</span><span class="nf">retval</span> <span class="nv">nil</span><span class="p">)))</span>
9154
9155 <span class="p">(</span><span class="nf">class</span> <span class="nv">GtkContainer</span>
9156 <span class="c1">;;; void gtk_container_add (GtkContainer *container, GtkWidget *child);</span>
9157 <span class="p">(</span><span class="nf">method</span> <span class="nv">add</span>
9158 <span class="p">(</span><span class="nf">args</span> <span class="nv">GtkWidget</span><span class="p">)</span>
9159 <span class="p">(</span><span class="nf">retval</span> <span class="nv">nil</span><span class="p">)))</span>
9160
9161 <span class="p">(</span><span class="nf">struct</span> <span class="nv">GdkColor</span>
9162 <span class="p">(</span><span class="nf">field</span> <span class="nv">r</span> <span class="p">(</span><span class="nf">type</span> <span class="ss">&#39;guchar</span><span class="p">))</span>
9163 <span class="p">(</span><span class="nf">field</span> <span class="nv">g</span> <span class="p">(</span><span class="nf">type</span> <span class="ss">&#39;guchar</span><span class="p">))</span>
9164 <span class="p">(</span><span class="nf">field</span> <span class="nv">b</span> <span class="p">(</span><span class="nf">type</span> <span class="ss">&#39;guchar</span><span class="p">)))</span>
9165 </code></pre></div>
9166
9167 <p>Again, writing those descriptions by hand (and keeping up with the C
9168 API) was a lot of work, but the glue code to implement the binding
9169 could be done mostly automatically. The generated code may need
9170 subsequent tweaks by hand to deal with details that the Scheme-like
9171 descriptions didn't contemplate, but it was better than writing
9172 <em>everything</em> by hand.</p>
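<p>To give a flavor of what "generating code from a description" means, here is a small sketch in C. It is not how the Guile or PyGTK generators worked. It only reconstructs a C prototype from a description record, whereas a real generator would emit glue code for the target language. <code>MethodDesc</code>, its <code>c_prefix</code> field and <code>generate_prototype()</code> are hypothetical.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* In-memory form of one method description (hypothetical layout). */
typedef struct {
    const char *class_name;   /* e.g. "GtkContainer" */
    const char *c_prefix;     /* e.g. "gtk_container" */
    const char *method_name;  /* e.g. "add" */
    const char *arg_type;     /* e.g. "GtkWidget", or NULL for no argument */
} MethodDesc;

/* Write the C prototype implied by desc into buf.
 * Like snprintf(), returns the number of characters the full text needs. */
int
generate_prototype (const MethodDesc *desc, char *buf, size_t buf_size)
{
    assert (desc != NULL && buf != NULL);

    if (desc->arg_type == NULL)
        return snprintf (buf, buf_size, "void %s_%s (%s *self);",
                         desc->c_prefix, desc->method_name,
                         desc->class_name);

    return snprintf (buf, buf_size, "void %s_%s (%s *self, %s *arg0);",
                     desc->c_prefix, desc->method_name,
                     desc->class_name, desc->arg_type);
}
```

<p>Feeding it the <code>show</code> and <code>add</code> descriptions from above yields prototypes for <code>gtk_widget_show</code> and <code>gtk_container_add</code>. The point is that the text is derived mechanically from data instead of being typed by hand.</p>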
9173 <h2 id="type-system">GLib gets a real type system</h2>
9174 <p>Tim Janik took over the parts of GLib that implement
9175 objects/signals/types, and added a lot of things to create a good type
9176 system for C. This is where things like <code>GType</code>, <code>GValue</code>, <code>GParamSpec</code>, and
9177 fundamental types come from.</p>
9178 <p>For example, a <code>GType</code> is an identifier for a type, and a <code>GValue</code> is a
9179 type plus, well, a value of that type. You can ask a <code>GValue</code>, "are
9180 you an int? are you a GObject?".</p>
9181 <p>You can register new types: for example, there would be code in Gdk
9182 that registers a new <code>GType</code> for <code>GdkColor</code>, so you can ask a value,
9183 "are you a color?".</p>
9184 <p>Registering a type involves telling the GObject system things like how
9185 to copy values of that type, and how to free them. For <code>GdkColor</code>
9186 this may be just <code>g_new() / g_free()</code>; for reference-counted objects
9187 it may be <code>g_object_ref() / g_object_unref()</code>.</p>
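<p>The idea of "a type identifier, plus functions that know how to copy and free values of that type" can be sketched in plain C. This is a toy model, not the GObject API: <code>ToyType</code>, <code>ToyValue</code>, <code>toy_type_register()</code> and the rest are hypothetical stand-ins for what <code>GType</code> and <code>GValue</code> do in a far more complete way.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int ToyType;  /* identifier for a registered type */
typedef void *(*ToyCopyFunc) (const void *data);
typedef void  (*ToyFreeFunc) (void *data);

typedef struct {
    const char *name;
    ToyCopyFunc copy;
    ToyFreeFunc free_func;
} ToyTypeInfo;

/* A value is a type identifier plus data of that type. */
typedef struct {
    ToyType type;
    void   *data;
} ToyValue;

#define TOY_MAX_TYPES 8
static ToyTypeInfo toy_types[TOY_MAX_TYPES];
static int         toy_n_types = 0;

/* Register a type; returns its identifier, or -1 if the table is full. */
ToyType
toy_type_register (const char *name, ToyCopyFunc copy, ToyFreeFunc free_func)
{
    assert (name != NULL && copy != NULL && free_func != NULL);

    if (toy_n_types == TOY_MAX_TYPES)
        return -1;

    toy_types[toy_n_types].name = name;
    toy_types[toy_n_types].copy = copy;
    toy_types[toy_n_types].free_func = free_func;
    return toy_n_types++;
}

/* "Are you a color?" */
int
toy_value_holds (const ToyValue *value, ToyType type)
{
    return value->type == type;
}

/* Copy a value using whatever copy function its type registered. */
ToyValue
toy_value_copy (const ToyValue *value)
{
    ToyValue copy;

    assert (value->type >= 0 && value->type < toy_n_types);
    copy.type = value->type;
    copy.data = toy_types[value->type].copy (value->data);
    return copy;
}

/* Free a value's data using its type's free function. */
void
toy_value_unset (ToyValue *value)
{
    assert (value->type >= 0 && value->type < toy_n_types);
    toy_types[value->type].free_func (value->data);
    value->data = NULL;
}

/* Copy and free functions for a plain struct, similar in spirit to GdkColor. */
typedef struct { unsigned char r, g, b; } ToyColor;

void *
toy_color_copy (const void *data)
{
    ToyColor *copy = malloc (sizeof *copy);

    if (copy == NULL)
        abort ();
    memcpy (copy, data, sizeof *copy);
    return copy;
}

void
toy_color_free (void *data)
{
    free (data);
}
```

<p>Generic code such as a language binding never needs to know what a <code>ToyColor</code> looks like. It only goes through the registered functions, which is what makes a runtime type system useful to bindings.</p>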
9188 <h3>Objects can be queried about some of their properties</h3>
9189 <p>A widget can tell you when you press a mouse button on it: it
9190 will emit the <code>button-press-event</code> signal. When <code>GtkWidget</code>'s
9191 implementation registers this signal, it calls something like</p>
9192 <div class="highlight"><pre><span></span><code> <span class="n">g_signal_new</span> <span class="p">(</span><span class="s">&quot;button-press-event&quot;</span><span class="p">,</span>
9193 <span class="n">gtk_widget_get_type</span><span class="p">(),</span> <span class="cm">/* type of object for which this signal is being created */</span>
9194 <span class="p">...</span>
9195 <span class="n">G_TYPE_BOOLEAN</span><span class="p">,</span> <span class="cm">/* type of return value */</span>
9196 <span class="mi">1</span><span class="p">,</span> <span class="cm">/* number of arguments */</span>
9197 <span class="n">GDK_TYPE_EVENT</span><span class="p">);</span> <span class="cm">/* type of first and only argument */</span>
9198 </code></pre></div>
9199
9200 <p>This tells GObject that <code>GtkWidget</code> will have a signal called
9201 <code>button-press-event</code>, with a return type of <code>G_TYPE_BOOLEAN</code>, and with
9202 a single argument of type <code>GDK_TYPE_EVENT</code>. This lets GObject do the
9203 appropriate marshalling of arguments when the signal is emitted.</p>
9204 <p>But also! <em>You can query the signal for its argument types!</em> You can
9205 run <code>g_signal_query()</code>, which will then tell you all the details of
9206 the signal: its name, return type, argument types, etc. A language
9207 binding could run <code>g_signal_query()</code> <em>and generate a description of the
9208 signal automatically</em> to the Scheme-like description language. And
9209 then generate the binding from <em>that</em>.</p>
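<p>The register-then-query pattern can be modeled like this. It is a sketch under assumptions, not GObject's signal machinery: <code>SigInfo</code>, <code>sig_register()</code> and <code>sig_query()</code> are hypothetical, and the real <code>g_signal_query()</code> works with signal ids and a <code>GSignalQuery</code> structure instead of names and this toy struct.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A tiny stand-in for type identifiers like G_TYPE_BOOLEAN. */
typedef enum { SIG_TYPE_NONE, SIG_TYPE_BOOLEAN, SIG_TYPE_EVENT } SigType;

#define SIG_MAX_PARAMS  4
#define SIG_MAX_SIGNALS 8

/* Everything recorded about a signal at registration time. */
typedef struct {
    const char *name;
    SigType     return_type;
    size_t      n_params;
    SigType     param_types[SIG_MAX_PARAMS];
} SigInfo;

static SigInfo sig_table[SIG_MAX_SIGNALS];
static size_t  sig_count = 0;

/* Record a signal's name and types.
 * Returns its index, or -1 if the table or parameter list is too big. */
int
sig_register (const char *name, SigType return_type,
              size_t n_params, const SigType *param_types)
{
    assert (name != NULL);
    assert (n_params == 0 || param_types != NULL);

    if (sig_count == SIG_MAX_SIGNALS || n_params > SIG_MAX_PARAMS)
        return -1;

    SigInfo *info = &sig_table[sig_count];
    info->name = name;
    info->return_type = return_type;
    info->n_params = n_params;
    for (size_t i = 0; i < n_params; i++)
        info->param_types[i] = param_types[i];

    return (int) sig_count++;
}

/* Hand back what was recorded, so a tool can generate a description from it.
 * Returns NULL for an unknown signal name. */
const SigInfo *
sig_query (const char *name)
{
    for (size_t i = 0; i < sig_count; i++) {
        if (strcmp (sig_table[i].name, name) == 0)
            return &sig_table[i];
    }
    return NULL;
}
```

<p>A description generator would loop over the registered signals, call the query function on each one, and print one entry per signal. No C header parsing is needed for this part of the API.</p>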
9210 <h2>Not all of an object's properties can be queried</h2>
9211 <p>Unfortunately, although GObject signals and properties <em>can</em> be
9212 queried, methods can't be. C doesn't have classes with methods, and GObject does
9213 not really have any provisions to implement them. </p>
9214 <p>Conventionally, for a static method one would just do</p>
9215 <div class="highlight"><pre><span></span><code><span class="kt">void</span>
9216 <span class="nf">gtk_widget_set_flags</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">,</span> <span class="n">GtkWidgetFlags</span> <span class="n">flags</span><span class="p">)</span>
9217 <span class="p">{</span>
9218 <span class="cm">/* modify a struct field within &quot;widget&quot; or whatever */</span>
9219 <span class="cm">/* repaint or something */</span>
9220 <span class="p">}</span>
9221 </code></pre></div>
9222
9223 <p>And for a virtual method one would put a function pointer in the class
9224 structure, and provide a convenient way to call it:</p>
9225 <div class="highlight"><pre><span></span><code><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
9226 <span class="n">GtkObjectClass</span> <span class="n">parent_class</span><span class="p">;</span>
9227
9228 <span class="kt">void</span> <span class="p">(</span><span class="o">*</span> <span class="n">draw</span><span class="p">)</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">,</span> <span class="n">cairo_t</span> <span class="o">*</span><span class="n">cr</span><span class="p">);</span>
9229 <span class="p">}</span> <span class="n">GtkWidgetClass</span><span class="p">;</span>
9230
9231 <span class="kt">void</span>
9232 <span class="nf">gtk_widget_draw</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">,</span> <span class="n">cairo_t</span> <span class="o">*</span><span class="n">cr</span><span class="p">)</span>
9233 <span class="p">{</span>
9234 <span class="n">GtkWidgetClass</span> <span class="o">*</span><span class="n">klass</span> <span class="o">=</span> <span class="n">find_widget_class</span> <span class="p">(</span><span class="n">widget</span><span class="p">);</span>
9235
9236 <span class="p">(</span><span class="o">*</span> <span class="n">klass</span><span class="o">-&gt;</span><span class="n">draw</span><span class="p">)</span> <span class="p">(</span><span class="n">widget</span><span class="p">,</span> <span class="n">cr</span><span class="p">);</span>
9237 <span class="p">}</span>
9238 </code></pre></div>
9239
9240 <p>And GObject has no idea about this method — there is no way to query
9241 it; it just exists in C-space.</p>
9242 <p>Now, historically, GTK+'s header files have been written in a <em>very</em>
9243 consistent style. It is quite possible to write a tool that will take
9244 a header file like</p>
9245 <div class="highlight"><pre><span></span><code><span class="cm">/* gtkwidget.h */</span>
9246 <span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
9247 <span class="n">GtkObjectClass</span> <span class="n">parent_class</span><span class="p">;</span>
9248
9249 <span class="kt">void</span> <span class="p">(</span><span class="o">*</span> <span class="n">draw</span><span class="p">)</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">,</span> <span class="n">cairo_t</span> <span class="o">*</span><span class="n">cr</span><span class="p">);</span>
9250 <span class="p">}</span> <span class="n">GtkWidgetClass</span><span class="p">;</span>
9251
9252 <span class="kt">void</span> <span class="nf">gtk_widget_set_flags</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">,</span> <span class="n">GtkWidgetFlags</span> <span class="n">flags</span><span class="p">);</span>
9253 <span class="kt">void</span> <span class="nf">gtk_widget_draw</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">,</span> <span class="n">cairo_t</span> <span class="o">*</span><span class="n">cr</span><span class="p">);</span>
9254 </code></pre></div>
9255
9256 <p>and parse it, even if it is with a simple parser that does not
9257 completely understand the C language, and have heuristics like</p>
9258 <ul>
9259 <li>
9260 <p>Is there a <code>class_name_foo()</code> function prototype with no
9261 corresponding <code>foo</code> field in the <code>Class</code> structure? It's probably a
9262 static method.</p>
9263 </li>
9264 <li>
9265 <p>Is there a <code>class_name_bar()</code> function with a <code>bar</code> field in the
9266 <code>Class</code> structure? It's probably a virtual method.</p>
9267 </li>
9268 <li>
9269 <p>Etc.</p>
9270 </li>
9271 </ul>
9272 <p>And in fact, that's what we had. C header files would get parsed
9273 with those heuristics, and the Scheme-like description files would get
9274 generated.</p>
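<p>The first two heuristics above are simple enough to sketch directly. The function below is an illustration, not the code of the tool we actually used: <code>classify_method()</code> and <code>MethodKind</code> are hypothetical names, and the sketch assumes the parser has already extracted the function names and the field names of the <code>Class</code> structure.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    METHOD_UNRELATED,  /* the function does not belong to this class */
    METHOD_STATIC,     /* no matching field in the Class structure */
    METHOD_VIRTUAL     /* a field with the same name exists */
} MethodKind;

/* Classify a function prototype by name alone.
 *
 * function_name: e.g. "gtk_widget_draw"
 * class_prefix:  e.g. "gtk_widget"
 * class_fields:  names of the function-pointer fields in the Class struct
 */
MethodKind
classify_method (const char *function_name, const char *class_prefix,
                 const char *const *class_fields, size_t n_fields)
{
    assert (function_name != NULL && class_prefix != NULL);
    assert (n_fields == 0 || class_fields != NULL);

    size_t prefix_len = strlen (class_prefix);

    /* The name must be the class prefix followed by an underscore. */
    if (strncmp (function_name, class_prefix, prefix_len) != 0
        || function_name[prefix_len] != '_')
        return METHOD_UNRELATED;

    const char *method_name = function_name + prefix_len + 1;

    for (size_t i = 0; i < n_fields; i++) {
        if (strcmp (class_fields[i], method_name) == 0)
            return METHOD_VIRTUAL;
    }
    return METHOD_STATIC;
}
```

<p>For the <code>gtkwidget.h</code> example, with <code>draw</code> as the only field, <code>gtk_widget_draw</code> comes out as virtual and <code>gtk_widget_set_flags</code> as static. Being a heuristic, it only works because the headers follow the naming convention so consistently.</p>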
9275 <h2>Scheme-like descriptions get reused, kind of</h2>
9276 <p>Language binding authors started reusing the Scheme-like
9277 descriptions. Sometimes they would cannibalize the descriptions from
9278 PyGTK, or Guile (again, I don't remember where the canonical version
9279 was maintained) and use them as they were.</p>
9280 <p>Other times they would copy the files, modify them by hand some more,
9281 and <em>then</em> use them to generate their language binding.</p>
9282 <h2>C being hostile</h2>
9283 <p>From just reading/parsing a C function prototype, you cannot know
9284 certain things. If one function argument is of type <code>Foo *</code>, does it mean:</p>
9285 <ul>
9286 <li>
9287 <p>the function gets a pointer to something which it should not modify
9288 ("in" parameter)</p>
9289 </li>
9290 <li>
9291 <p>the function gets a pointer to uninitialized data which it will set
9292 ("out" parameter)</p>
9293 </li>
9294 <li>
9295 <p>the function gets a pointer to initialized data which it will use
9296 and modify ("inout" parameter)</p>
9297 </li>
9298 <li>
9299 <p>the function will copy that pointer and hold a reference to the
9300 pointed data, and not free it when it's done</p>
9301 </li>
9302 <li>
9303 <p>the function will take over the ownership of the pointed data, and
9304 free it when it's done</p>
9305 </li>
9306 <li>
9307 <p>etc.</p>
9308 </li>
9309 </ul>
9310 <p>Sometimes people would include these annotations in the Scheme-like
9311 description language. But wouldn't it be better if those annotations
9312 came <em>from the C code itself</em>?</p>
9313 <h1>GObject Introspection appears</h1>
9314 <p>For GNOME 3, we wanted a unified solution for language bindings:</p>
9315 <ul>
9316 <li>
9317 <p>Have a single way to extract the machine-readable descriptions of
9318 the C API.</p>
9319 </li>
9320 <li>
9321 <p>Have every language binding be automatically generated from those
9322 descriptions.</p>
9323 </li>
9324 <li>
9325 <p>In the descriptions, have <em>all</em> the information necessary to
9326 generate a correct language binding...</p>
9327 </li>
9328 <li>
9329 <p>... including documentation.</p>
9330 </li>
9331 </ul>
9332 <p>We had to do a lot of work to accomplish this. For example:</p>
9333 <ul>
9334 <li>
9335 <p>Remove C-isms from the public API. Varargs functions, those that
9336 have <code>foo (int x, ...)</code>, can't be easily described and called from
9337 other languages. Instead, have something like
9338 <code>foov (int x, int num_args, GValue *args_array)</code> that can be easily
9339 consumed by other languages.</p>
9340 </li>
9341 <li>
9342 <p>Add <em>annotations</em> throughout the code so that the ad-hoc C parser
9343 can know about in/out/inout arguments, and whether pointer arguments
9344 are borrowed references or a full transfer of ownership.</p>
9345 </li>
9346 <li>
9347 <p>Take the in-line documentation comments and store them as part of
9348 the machine-readable description of the API.</p>
9349 </li>
9350 <li>
9351 <p>When compiling a library, automatically do all the things like
9352 <code>g_signal_query()</code> and spit out machine-readable descriptions of
9353 those parts of the API.</p>
9354 </li>
9355 </ul>
9356 <p>So, GObject Introspection is all of those things.</p>
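<p>As an illustration of the first point in that list, here is the usual shape of a varargs function paired with an array-based variant. It is a sketch with plain <code>int</code> arguments instead of <code>GValue</code>, and the names <code>sum_ints()</code> and <code>sum_intsv()</code> are made up for the example.</p>

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#define SUM_MAX_ARGS 16

/* Array-based variant: the argument count and a pointer are all a caller
 * needs to supply, which is easy to describe and to call from any language. */
int
sum_intsv (size_t num_args, const int *args_array)
{
    int total = 0;

    assert (num_args == 0 || args_array != NULL);

    for (size_t i = 0; i < num_args; i++)
        total += args_array[i];

    return total;
}

/* Varargs convenience wrapper for C callers.  It only collects its
 * arguments into an array and forwards to the array-based variant. */
int
sum_ints (size_t num_args, ...)
{
    int     values[SUM_MAX_ARGS];
    va_list ap;

    assert (num_args <= SUM_MAX_ARGS);

    va_start (ap, num_args);
    for (size_t i = 0; i < num_args; i++)
        values[i] = va_arg (ap, int);
    va_end (ap);

    return sum_intsv (num_args, values);
}
```

<p>C programmers keep the convenient <code>sum_ints (3, 1, 2, 3)</code> form, while bindings can skip the varargs function entirely and call <code>sum_intsv()</code> with an array they build themselves.</p>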
9357 <h2>Annotations</h2>
9358 <p>If you have looked at the C code for a GNOME library, you may have
9359 seen something like this:</p>
9360 <div class="highlight"><pre><span></span><code><span class="cm">/**</span>
9361 <span class="cm"> * gtk_widget_get_parent:</span>
9362 <span class="cm"> * @widget: a #GtkWidget</span>
9363 <span class="cm"> *</span>
9364 <span class="cm"> * Returns the parent container of @widget.</span>
9365 <span class="cm"> *</span>
9366 <span class="cm"> * Returns: (transfer none) (nullable): the parent container of @widget, or %NULL</span>
9367 <span class="cm"> **/</span>
9368 <span class="n">GtkWidget</span> <span class="o">*</span>
9369 <span class="nf">gtk_widget_get_parent</span> <span class="p">(</span><span class="n">GtkWidget</span> <span class="o">*</span><span class="n">widget</span><span class="p">)</span>
9370 <span class="p">{</span>
9371 <span class="p">...</span>
9372 <span class="p">}</span>
9373 </code></pre></div>
9374
9375 <p>See that "<code>(transfer none) (nullable)</code>" in the documentation comments?
9376 The <code>(transfer none)</code> means that the return value is a pointer whose
9377 ownership does <em>not</em> get transferred to the caller, i.e. the widget
9378 retains ownership. The <code>(nullable)</code> indicates that the
9379 function can return <code>NULL</code>, when the widget has no parent.</p>
9380 <p>A language binding will then use this information as follows:</p>
9381 <ul>
9382 <li>
9383 <p>It will not <code>unref()</code> the parent widget when it is done with it.</p>
9384 </li>
9385 <li>
9386 <p>It will deal with a <code>NULL</code> pointer in a special way, instead of
9387 assuming that references are not null.</p>
9388 </li>
9389 </ul>
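<p>In code, the two rules above boil down to a couple of branches driven by the annotation. This is a toy model and not what any particular binding does: <code>Transfer</code>, <code>ToyWidget</code> and <code>binding_done_with_return_value()</code> are hypothetical, and decrementing a counter stands in for a real <code>unref()</code>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* The ownership annotation as a binding might see it. */
typedef enum { TRANSFER_NONE, TRANSFER_FULL } Transfer;

/* Toy stand-in for a reference-counted widget. */
typedef struct {
    int ref_count;
} ToyWidget;

/* What a binding might do once it has finished using a returned pointer.
 *
 * value:    the pointer the C function returned
 * transfer: the ownership annotation on the return value
 * nullable: nonzero if the return value is annotated as (nullable)
 */
void
binding_done_with_return_value (ToyWidget *value, Transfer transfer,
                                int nullable)
{
    if (value == NULL) {
        /* NULL is only legal if the annotation allows it. */
        assert (nullable);
        return;
    }

    if (transfer == TRANSFER_FULL) {
        /* The caller owns the reference, so it must give it up. */
        assert (value->ref_count > 0);
        value->ref_count--;
    }

    /* With TRANSFER_NONE the reference is borrowed: leave it alone. */
}
```

<p>For <code>gtk_widget_get_parent()</code>, which is <code>(transfer none) (nullable)</code>, this logic never touches the parent's reference count and accepts a <code>NULL</code> result without complaint.</p>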
9390 <p>Every now and then someone discovers a public function which is
9391 lacking an annotation of that sort — for GNOME's purposes this is a
9392 bug; fortunately, it is easy to add that annotation to the C sources
9393 and regenerate the machine-readable descriptions.</p>
9394 <h2>Machine-readable descriptions, or repository files</h2>
9395 <p>So, what do those machine-readable descriptions actually look like?
9396 They moved away from a Scheme-like language and got turned into XML,
9397 because early XXIst century.</p>
9398 <p>The machine-readable descriptions are called <em>GObject Introspection
9399 Repository files</em>, or GIR for short.</p>
9400 <p>Let's look at some parts of <code>Gtk-3.0.gir</code>, which your distro may put in
9401 <code>/usr/share/gir-1.0/Gtk-3.0.gir</code>.</p>
9402 <div class="highlight"><pre><span></span><code><span class="nt">&lt;repository</span> <span class="na">version=</span><span class="s">&quot;1.2&quot;</span> <span class="err">...</span><span class="nt">&gt;</span>
9403
9404 <span class="nt">&lt;namespace</span> <span class="na">name=</span><span class="s">&quot;Gtk&quot;</span>
9405 <span class="na">version=</span><span class="s">&quot;3.0&quot;</span>
9406 <span class="na">shared-library=</span><span class="s">&quot;libgtk-3.so.0,libgdk-3.so.0&quot;</span>
9407 <span class="na">c:identifier-prefixes=</span><span class="s">&quot;Gtk&quot;</span>
9408 <span class="na">c:symbol-prefixes=</span><span class="s">&quot;gtk&quot;</span><span class="nt">&gt;</span>
9409 </code></pre></div>
9410
9411 <p>The <code>shared-library</code> attribute of the toplevel "<code>Gtk</code>" namespace names the <code>.so</code> libraries that implement it.
9412 The two prefix attributes say that all identifiers have "<code>Gtk</code>" or "<code>gtk</code>" prefixes.</p>
9413 <h3>A class with methods and a signal</h3>
9414 <p>Let's look at the description for <code>GtkEntry</code>...</p>
9415 <div class="highlight"><pre><span></span><code> <span class="nt">&lt;class</span> <span class="na">name=</span><span class="s">&quot;Entry&quot;</span>
9416 <span class="na">c:symbol-prefix=</span><span class="s">&quot;entry&quot;</span>
9417 <span class="na">c:type=</span><span class="s">&quot;GtkEntry&quot;</span>
9418 <span class="na">parent=</span><span class="s">&quot;Widget&quot;</span>
9419 <span class="na">glib:type-name=</span><span class="s">&quot;GtkEntry&quot;</span>
9420 <span class="na">glib:get-type=</span><span class="s">&quot;gtk_entry_get_type&quot;</span>
9421 <span class="na">glib:type-struct=</span><span class="s">&quot;EntryClass&quot;</span><span class="nt">&gt;</span>
9422
9423 <span class="nt">&lt;doc</span> <span class="na">xml:space=</span><span class="s">&quot;preserve&quot;</span><span class="nt">&gt;</span>The #GtkEntry widget is a single line text entry
9424 widget. A fairly large set of key bindings are supported
9425 by default. If the entered text is longer than the allocation
9426 ...
9427 <span class="nt">&lt;/doc&gt;</span>
9428 </code></pre></div>
9429
9430 <p>This is the start of the description for <code>GtkEntry</code>. We already know
9431 that everything is prefixed with "<code>Gtk</code>", so the name is just given as
9432 "<code>Entry</code>". Its parent class is <code>Widget</code> and the function which
9433 registers it against the GObject type system is <code>gtk_entry_get_type</code>.</p>
9434 <p>Also, there are the toplevel documentation comments for the <code>Entry</code>
9435 class.</p>
9436 <p>Onwards!</p>
9437 <div class="highlight"><pre><span></span><code> <span class="nt">&lt;implements</span> <span class="na">name=</span><span class="s">&quot;Atk.ImplementorIface&quot;</span><span class="nt">/&gt;</span>
9438 <span class="nt">&lt;implements</span> <span class="na">name=</span><span class="s">&quot;Buildable&quot;</span><span class="nt">/&gt;</span>
9439 <span class="nt">&lt;implements</span> <span class="na">name=</span><span class="s">&quot;CellEditable&quot;</span><span class="nt">/&gt;</span>
9440 <span class="nt">&lt;implements</span> <span class="na">name=</span><span class="s">&quot;Editable&quot;</span><span class="nt">/&gt;</span>
9441 </code></pre></div>
9442
9443 <p>GObject classes can implement various interfaces; this is the list
9444 that <code>GtkEntry</code> supports.</p>
9445 <p>Next, let's look at a single method:</p>
9446 <div class="highlight"><pre><span></span><code> <span class="nt">&lt;method</span> <span class="na">name=</span><span class="s">&quot;get_text&quot;</span> <span class="na">c:identifier=</span><span class="s">&quot;gtk_entry_get_text&quot;</span><span class="nt">&gt;</span>
9447 <span class="nt">&lt;doc</span> <span class="na">xml:space=</span><span class="s">&quot;preserve&quot;</span><span class="nt">&gt;</span>Retrieves the contents of the entry widget. ... <span class="nt">&lt;/doc&gt;</span>
9448
9449 <span class="nt">&lt;return-value</span> <span class="na">transfer-ownership=</span><span class="s">&quot;none&quot;</span><span class="nt">&gt;</span>
9450 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;utf8&quot;</span> <span class="na">c:type=</span><span class="s">&quot;const gchar*&quot;</span><span class="nt">/&gt;</span>
9451 <span class="nt">&lt;/return-value&gt;</span>
9452
9453 <span class="nt">&lt;parameters&gt;</span>
9454 <span class="nt">&lt;instance-parameter</span> <span class="na">name=</span><span class="s">&quot;entry&quot;</span> <span class="na">transfer-ownership=</span><span class="s">&quot;none&quot;</span><span class="nt">&gt;</span>
9455 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;Entry&quot;</span> <span class="na">c:type=</span><span class="s">&quot;GtkEntry*&quot;</span><span class="nt">/&gt;</span>
9456 <span class="nt">&lt;/instance-parameter&gt;</span>
9457 <span class="nt">&lt;/parameters&gt;</span>
9458 <span class="nt">&lt;/method&gt;</span>
9459 </code></pre></div>
9460
9461 <p>This describes the method <code>get_text</code> and its corresponding C symbol,
9462 <code>gtk_entry_get_text</code>. Its return value is a UTF-8 encoded string, and
9463 ownership of the memory for that string is not transferred to the caller.</p>
9464 <p>The method takes a single parameter which is the <code>entry</code> instance itself.</p>
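<p>To make the description above concrete, here is a minimal sketch of how a tool could read that <code>&lt;method&gt;</code> element using Python's standard XML parser. The snippet is a trimmed copy of the element shown above, wrapped in a <code>&lt;repository&gt;</code> root so the namespaces are declared; the <code>describe_method</code> helper and the shape of the dictionary it returns are assumptions made for illustration, not part of any GObject Introspection tool.</p>

```python
import xml.etree.ElementTree as ET

# Namespace URIs declared at the top of GIR files.
NS = {
    "gi": "http://www.gtk.org/introspection/core/1.0",
    "c": "http://www.gtk.org/introspection/c/1.0",
}

# Trimmed copy of the <method> element from the article.
GIR_SNIPPET = """\
<repository xmlns="http://www.gtk.org/introspection/core/1.0"
            xmlns:c="http://www.gtk.org/introspection/c/1.0">
  <method name="get_text" c:identifier="gtk_entry_get_text">
    <return-value transfer-ownership="none">
      <type name="utf8" c:type="const gchar*"/>
    </return-value>
    <parameters>
      <instance-parameter name="entry" transfer-ownership="none">
        <type name="Entry" c:type="GtkEntry*"/>
      </instance-parameter>
    </parameters>
  </method>
</repository>
"""


def describe_method(method):
    """Summarize one <method> element as a plain dictionary (hypothetical helper)."""
    c_attr = "{%s}" % NS["c"]  # ElementTree spells namespaced attributes as {uri}name
    ret = method.find("gi:return-value", NS)
    ret_type = ret.find("gi:type", NS)
    params = method.find("gi:parameters", NS)
    return {
        "name": method.get("name"),
        "c_symbol": method.get(c_attr + "identifier"),
        "return_type": ret_type.get("name"),
        "return_c_type": ret_type.get(c_attr + "type"),
        # transfer-ownership="none" means the callee keeps the memory.
        "caller_owns_return": ret.get("transfer-ownership") != "none",
        "params": [p.get("name") for p in params],
    }


root = ET.fromstring(GIR_SNIPPET)
info = describe_method(root.find("gi:method", NS))
print(info)
```

<p>A binding generator would use something like <code>caller_owns_return</code> to decide whether it has to free the string after copying it into the target language's own string type.</p>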
9465 <p>Now, let's look at a signal:</p>
9466 <div class="highlight"><pre><span></span><code> <span class="nt">&lt;glib:signal</span> <span class="na">name=</span><span class="s">&quot;activate&quot;</span> <span class="na">when=</span><span class="s">&quot;last&quot;</span> <span class="na">action=</span><span class="s">&quot;1&quot;</span><span class="nt">&gt;</span>
9467 <span class="nt">&lt;doc</span> <span class="na">xml:space=</span><span class="s">&quot;preserve&quot;</span><span class="nt">&gt;</span>The ::activate signal is emitted when the user hits
9468 the Enter key. ...<span class="nt">&lt;/doc&gt;</span>
9469
9470 <span class="nt">&lt;return-value</span> <span class="na">transfer-ownership=</span><span class="s">&quot;none&quot;</span><span class="nt">&gt;</span>
9471 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;none&quot;</span> <span class="na">c:type=</span><span class="s">&quot;void&quot;</span><span class="nt">/&gt;</span>
9472 <span class="nt">&lt;/return-value&gt;</span>
9473 <span class="nt">&lt;/glib:signal&gt;</span>
9474
9475 <span class="nt">&lt;/class&gt;</span>
9476 </code></pre></div>
9477
9478 <p>The "<code>activate</code>" signal takes no arguments, and has a return value of
9479 type <code>void</code>, i.e. no return value.</p>
9480 <h3>A struct with public fields</h3>
9481 <p>The following comes from <code>Gdk-3.0.gir</code>; it's the description for
9482 <code>GdkRectangle</code>.</p>
9483 <div class="highlight"><pre><span></span><code> <span class="nt">&lt;record</span> <span class="na">name=</span><span class="s">&quot;Rectangle&quot;</span>
9484 <span class="na">c:type=</span><span class="s">&quot;GdkRectangle&quot;</span>
9485 <span class="na">glib:type-name=</span><span class="s">&quot;GdkRectangle&quot;</span>
9486 <span class="na">glib:get-type=</span><span class="s">&quot;gdk_rectangle_get_type&quot;</span>
9487 <span class="na">c:symbol-prefix=</span><span class="s">&quot;rectangle&quot;</span><span class="nt">&gt;</span>
9488
9489 <span class="nt">&lt;field</span> <span class="na">name=</span><span class="s">&quot;x&quot;</span> <span class="na">writable=</span><span class="s">&quot;1&quot;</span><span class="nt">&gt;</span>
9490 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;gint&quot;</span> <span class="na">c:type=</span><span class="s">&quot;int&quot;</span><span class="nt">/&gt;</span>
9491 <span class="nt">&lt;/field&gt;</span>
9492 <span class="nt">&lt;field</span> <span class="na">name=</span><span class="s">&quot;y&quot;</span> <span class="na">writable=</span><span class="s">&quot;1&quot;</span><span class="nt">&gt;</span>
9493 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;gint&quot;</span> <span class="na">c:type=</span><span class="s">&quot;int&quot;</span><span class="nt">/&gt;</span>
9494 <span class="nt">&lt;/field&gt;</span>
9495 <span class="nt">&lt;field</span> <span class="na">name=</span><span class="s">&quot;width&quot;</span> <span class="na">writable=</span><span class="s">&quot;1&quot;</span><span class="nt">&gt;</span>
9496 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;gint&quot;</span> <span class="na">c:type=</span><span class="s">&quot;int&quot;</span><span class="nt">/&gt;</span>
9497 <span class="nt">&lt;/field&gt;</span>
9498 <span class="nt">&lt;field</span> <span class="na">name=</span><span class="s">&quot;height&quot;</span> <span class="na">writable=</span><span class="s">&quot;1&quot;</span><span class="nt">&gt;</span>
9499 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;gint&quot;</span> <span class="na">c:type=</span><span class="s">&quot;int&quot;</span><span class="nt">/&gt;</span>
9500 <span class="nt">&lt;/field&gt;</span>
9501
9502 <span class="nt">&lt;/record&gt;</span>
9503 </code></pre></div>
9504
9505 <p>So those are the <code>x/y/width/height</code> fields in the struct, in the same
9506 order as they are defined in the C code.</p>
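<p>Because the fields are listed in declaration order, a binding can rebuild the struct layout from the XML alone. The following is a rough sketch using Python's <code>ctypes</code>; the <code>GIR_TO_CTYPES</code> table and the <code>struct_from_record</code> helper are hypothetical and cover only the single type this record needs.</p>

```python
import ctypes
import xml.etree.ElementTree as ET

NS = {"gi": "http://www.gtk.org/introspection/core/1.0"}

# Assumed mapping from GIR type names to ctypes types; just enough for this record.
GIR_TO_CTYPES = {"gint": ctypes.c_int}

# Trimmed copy of the <record> element from the article.
RECORD = """\
<record xmlns="http://www.gtk.org/introspection/core/1.0"
        xmlns:c="http://www.gtk.org/introspection/c/1.0"
        name="Rectangle" c:type="GdkRectangle">
  <field name="x" writable="1"><type name="gint" c:type="int"/></field>
  <field name="y" writable="1"><type name="gint" c:type="int"/></field>
  <field name="width" writable="1"><type name="gint" c:type="int"/></field>
  <field name="height" writable="1"><type name="gint" c:type="int"/></field>
</record>
"""


def struct_from_record(record):
    """Build a ctypes.Structure subclass whose fields follow the XML order."""
    fields = []
    for field in record.findall("gi:field", NS):
        type_name = field.find("gi:type", NS).get("name")
        fields.append((field.get("name"), GIR_TO_CTYPES[type_name]))

    class Record(ctypes.Structure):
        _fields_ = fields

    Record.__name__ = record.get("name")
    return Record


Rectangle = struct_from_record(ET.fromstring(RECORD))
rect = Rectangle(x=10, y=20, width=300, height=200)
print([name for name, _ in Rectangle._fields_], ctypes.sizeof(rect))
```

<p>The order matters: if the XML listed the fields differently from the C header, the offsets would be wrong and the binding would read garbage.</p>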
9507 <p>And so on. The idea is for the whole API exported by a GObject
9508 library to be describable by that format. If something can't be
9509 described, it's a bug in the library, or a bug in the format.</p>
9510 <h1>Making language bindings start up quickly: typelib files</h1>
9511 <p>As we saw, the GIR files are the XML descriptions of GObject APIs.
9512 Dynamic languages like Python would prefer to generate the language
9513 binding on the fly, as needed, instead of pre-generating a huge
9514 binding.</p>
9515 <p>However, GTK+ is a big API: <code>Gtk-3.0.gir</code> is 7 MB of XML. Parsing
9516 all of that just to be able to generate <code>gtk_widget_show()</code> on the fly
9517 would be too slow. Also, there are GTK+'s dependencies: Atk, Gdk,
9518 Cairo, etc. You don't want to parse <em>everything</em> just to start up!</p>
9519 <p>So, we have an extra step that compiles the GIR files down to binary
9520 <code>.typelib</code> files. For example,
9521 <code>/usr/lib64/girepository-1.0/Gtk-3.0.typelib</code> is about 600 KB on my
9522 machine. Those files get <code>mmap()</code>ed for fast access, and can be
9523 shared between processes.</p>
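<p>To illustrate why a mapped binary file starts up faster than parsing XML, here is a small sketch in Python. The file layout below is invented for this example and is <em>not</em> the real typelib format; the point is only that a reader can jump straight to one fixed-size entry by offset instead of parsing everything first.</p>

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout, NOT the typelib format: a 4-byte little-endian
# entry count, then fixed-size 32-byte entries holding a NUL-padded name.
HEADER = struct.Struct("<I")
ENTRY = struct.Struct("<32s")


def write_index(path, names):
    """Write the toy index file."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(names)))
        for name in names:
            f.write(ENTRY.pack(name.encode("ascii")))


def lookup(path, index):
    """Read entry `index` directly from the mapping, without parsing the rest."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            (count,) = HEADER.unpack_from(view, 0)
            if not 0 <= index < count:
                raise IndexError(index)
            offset = HEADER.size + index * ENTRY.size
            (raw,) = ENTRY.unpack_from(view, offset)
            return raw.rstrip(b"\0").decode("ascii")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.index")
    write_index(path, ["gtk_widget_show", "gtk_entry_get_text", "gtk_window_new"])
    found = lookup(path, 1)
print(found)
```

<p>Read-only mappings of the same file are backed by the same pages in the kernel's page cache, which is what lets several processes share one copy.</p>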
9524 <h2>How dynamic language bindings use typelib files</h2>
9525 <p>GObject Introspection comes with a library that language binding
9526 implementors can use to consume those <code>.typelib</code> files. The
9527 <code>libgirepository</code> library has functions like "list all the classes
9528 available in this namespace", or "call this function with these
9529 values for arguments, and give me back the return value here".</p>
9530 <p>Internally, <code>libgirepository</code> uses <code>libffi</code> to actually call the C
9531 functions in the dynamically-linked libraries.</p>
9532 <p>So, when you write <code>foo.py</code> and do</p>
9533 <div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">gi</span>
9534 <span class="n">gi</span><span class="o">.</span><span class="n">require_version</span><span class="p">(</span><span class="s1">&#39;Gtk&#39;</span><span class="p">,</span> <span class="s1">&#39;3.0&#39;</span><span class="p">)</span>
9535 <span class="kn">from</span> <span class="nn">gi.repository</span> <span class="kn">import</span> <span class="n">Gtk</span>
9536 <span class="n">win</span> <span class="o">=</span> <span class="n">Gtk</span><span class="o">.</span><span class="n">Window</span><span class="p">()</span>
9537 </code></pre></div>
9538
9539 <p>what happens is that <code>pygobject</code> calls <code>libgirepository</code> to <code>mmap()</code>
9540 the <code>.typelib</code>, and sees that the constructor for <code>Gtk.Window</code> is a C
9541 function called <code>gtk_window_new()</code>. After seeing how that function
9542 wants to be called, it calls the function using <code>libffi</code>, wraps the
9543 result with a <code>PyObject</code>, and that's what you get on the Python side.</p>
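<p>The "look up a description, then call through <code>libffi</code>" step can be imitated with Python's <code>ctypes</code> module, which is itself built on <code>libffi</code>. This is only a sketch of the idea and not how <code>pygobject</code> is implemented: the <code>DESCRIPTION</code> dictionary is a hypothetical stand-in for what a typelib lookup would return, the example calls the C library's <code>strlen</code> rather than a GTK function so it runs without GTK installed, and it assumes a POSIX system.</p>

```python
import ctypes
import ctypes.util

# Hypothetical stand-in for the information a typelib would provide:
# which symbol to call, what it takes, and what it returns.
DESCRIPTION = {
    "symbol": "strlen",
    "args": [ctypes.c_char_p],
    "returns": ctypes.c_size_t,
}


def call_described(library, description, *args):
    """Look up a C symbol by name and call it according to its description."""
    func = getattr(library, description["symbol"])
    func.argtypes = description["args"]
    func.restype = description["returns"]
    return func(*args)


# If find_library() returns None, CDLL(None) still exposes libc's symbols on POSIX.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
length = call_described(libc, DESCRIPTION, b"GtkEntry")
print(length)
```

<p>Nothing in <code>call_described</code> knows about <code>strlen</code> ahead of time; everything it needs comes from the description, which is the property that lets a dynamic binding avoid pre-generated wrappers.</p>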
9544 <h1>Static languages</h1>
9545 <p>A static language like Rust prefers to have the whole language binding
9546 pre-generated. This is what the various crates in <a href="https://github.com/gtk-rs/">gtk-rs</a>
9547 do.</p>
9548 <p><a href="https://github.com/gtk-rs/gir/tree/master/src">The gir crate</a> takes a <code>.gir</code> file (i.e. the XML descriptions)
9549 and does two things:</p>
9550 <ul>
9551 <li>
9552 <p>Reconstructs the C function prototypes and C struct declarations,
9553 but in a way Rust can understand them. This gets output to the <a href="https://github.com/gtk-rs/sys">sys
9554 crate</a>.</p>
9555 </li>
9556 <li>
9557 <p>Creates idiomatic Rust code for the language binding. This gets
9558 output to the various crates; for example, <a href="https://github.com/gtk-rs/gtk">the gtk one</a>.</p>
9559 </li>
9560 </ul>
9561 <p>When reconstructing the C structs and prototypes, we get stuff like</p>
9562 <div class="highlight"><pre><span></span><code><span class="cp">#[repr(C)]</span><span class="w"></span>
9563 <span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">GtkWidget</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
9564 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">parent_instance</span>: <span class="nc">gobject</span>::<span class="n">GInitiallyUnowned</span><span class="p">,</span><span class="w"></span>
9565 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">priv_</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">GtkWidgetPrivate</span><span class="p">,</span><span class="w"></span>
9566 <span class="p">}</span><span class="w"></span>
9567
9568 <span class="k">extern</span><span class="w"> </span><span class="s">&quot;C&quot;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
9569 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">gtk_entry_new</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">GtkWidget</span><span class="p">;</span><span class="w"></span>
9570 <span class="p">}</span><span class="w"></span>
9571 </code></pre></div>
9572
9573 <p>And the idiomatic bindings? Stay tuned!</p></content><category term="misc"></category><category term="gnome"></category><category term="gobject-introspection"></category><category term="rust"></category></entry><entry><title>Librsvg's build infrastructure: Autotools and Rust</title><link href="https://people.gnome.org/~federico/blog/librsvg-build-infrastructure.html" rel="alternate"></link><published>2017-09-01T18:15:29-05:00</published><updated>2017-11-11T09:37:08-06:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-09-01:/~federico/blog/librsvg-build-infrastructure.html</id><summary type="html"><p>Today I released <a href="https://mail.gnome.org/archives/desktop-devel-list/2017-September/msg00008.html">librsvg 2.41.1</a>, and it's a big release!
9574 Apart from all the Rust goodness, and the large number of bug fixes, I
9575 am very happy with the way the build system works these days. I've
9576 found it invaluable to have good examples of Autotools incantations to …</p></summary><content type="html"><p>Today I released <a href="https://mail.gnome.org/archives/desktop-devel-list/2017-September/msg00008.html">librsvg 2.41.1</a>, and it's a big release!
9577 Apart from all the Rust goodness, and the large number of bug fixes, I
9578 am very happy with the way the build system works these days. I've
9579 found it invaluable to have good examples of Autotools incantations to
9580 copy&amp;paste, so hopefully this will be useful to someone else.</p>
9581 <p>There are some subtleties that a "good" autotools setup demands, and
9582 so far I think librsvg is doing well:</p>
9583 <ul>
9584 <li>
9585 <p>The <code>configure</code> script checks for <code>cargo</code> and <code>rustc</code>.</p>
9586 </li>
9587 <li>
9588 <p>"<code>make distcheck</code>" works. This means that the build can be
9589 performed with <code>builddir != srcdir</code>, and also that <code>make check</code> runs
9590 the available tests and they all pass.</p>
9591 </li>
9592 <li>
9593 <p>The <code>rsvg_internals</code> library is built with Rust, and our
9594 <code>Makefile.am</code> calls <code>cargo build</code> with the correct options. It is
9595 able to handle debug and release builds.</p>
9596 </li>
9597 <li>
9598 <p>"<code>make clean</code>" cleans up the Rust build directories as well.</p>
9599 </li>
9600 <li>
9601 <p>If you change a <code>.rs</code> file and type <code>make</code>, only the necessary stuff
9602 gets rebuilt.</p>
9603 </li>
9604 <li>
9605 <p>Etcetera. I think librsvg feels like a normal autotool'ed library.
9606 Let's see how this is done.</p>
9607 </li>
9608 </ul>
9609 <h1>Librsvg's basic autotools setup</h1>
9610 <p>Librsvg started out with a fairly traditional autotools setup with a
9611 <a href="https://git.gnome.org/browse/librsvg/tree/configure.ac?h=2.41.1"><code>configure.ac</code></a> and <a href="https://git.gnome.org/browse/librsvg/tree/Makefile.am?h=2.41.1"><code>Makefile.am</code></a>. For
9612 historical reasons the <code>.[ch]</code> source files live in the toplevel
9613 <code>librsvg/</code> directory, not in a <code>src</code> subdirectory or something like
9614 that.</p>
9615 <div class="highlight"><pre><span></span><code>librsvg
9616 ├ configure.ac
9617 ├ Makefile.am
9618 ├ *.[ch]
9619 ├ src/
9620 ├ doc/
9621 ├ tests/
9622 └ win32/
9623 </code></pre></div>
9624
9625 <h1>Adding Rust to the build</h1>
9626 <p>The Rust source code lives in <a href="https://git.gnome.org/browse/librsvg/tree/rust?h=2.41.1"><code>librsvg/rust</code></a>; that's
9627 where <a href="https://git.gnome.org/browse/librsvg/tree/rust/Cargo.toml?h=2.41.1"><code>Cargo.toml</code></a> lives, and of course there is the conventional
9628 <code>src</code> subdirectory with the <code>*.rs</code> files.</p>
9629 <div class="highlight"><pre><span></span><code>librsvg
9630 ├ configure.ac
9631 ├ Makefile.am
9632 ├ *.[ch]
9633 ├ src/
9634 ├ rust/ &lt;--- this is new!
9635 │ ├ Cargo.toml
9636 │ └ src/
9637 ├ doc/
9638 ├ tests/
9639 └ win32/
9640 </code></pre></div>
9641
9642 <h2>Detecting the presence of <code>cargo</code> and <code>rustc</code> in <code>configure.ac</code></h2>
9643 <p>This goes in <code>configure.ac</code>:</p>
9644 <div class="highlight"><pre><span></span><code>AC_CHECK_PROG<span class="o">(</span>CARGO, <span class="o">[</span>cargo<span class="o">]</span>, <span class="o">[</span>yes<span class="o">]</span>, <span class="o">[</span>no<span class="o">])</span>
9645 AS_IF<span class="o">(</span><span class="nb">test</span> x<span class="nv">$CARGO</span> <span class="o">=</span> xno,
9646 AC_MSG_ERROR<span class="o">([</span>cargo is required. Please install the Rust toolchain from https://www.rust-lang.org/<span class="o">])</span>
9647 <span class="o">)</span>
9648 AC_CHECK_PROG<span class="o">(</span>RUSTC, <span class="o">[</span>rustc<span class="o">]</span>, <span class="o">[</span>yes<span class="o">]</span>, <span class="o">[</span>no<span class="o">])</span>
9649 AS_IF<span class="o">(</span><span class="nb">test</span> x<span class="nv">$RUSTC</span> <span class="o">=</span> xno,
9650 AC_MSG_ERROR<span class="o">([</span>rustc is required. Please install the Rust toolchain from https://www.rust-lang.org/<span class="o">])</span>
9651 <span class="o">)</span>
9652 </code></pre></div>
9653
9654 <p>These two checks look for <code>cargo</code> and <code>rustc</code> in the <code>PATH</code>, respectively, and abort
9655 with an error message if they are not present.</p>
9656 <h2>Supporting debug or release mode for the Rust build</h2>
9657 <p>One can call cargo like "<code>cargo build --release</code>" to turn on expensive
9658 optimizations, or normally like just "<code>cargo build</code>" to build with
9659 debug information. That is, the latter is the default: if you don't
9660 pass any options, cargo does a debug build.</p>
9661 <p>Autotools and C compilers normally work a bit differently; one must
9662 call the configure script like "<code>CFLAGS='-g -O0' ./configure</code>" for a
9663 debug build, or "<code>CFLAGS='-O2 -fomit-frame-pointer' ./configure</code>" for
9664 a release build.</p>
9665 <p>Linux distros already have all the infrastructure to pass the
9666 appropriate <code>CFLAGS</code> to <code>configure</code>. We need to be able to pass the
9667 appropriate flag to Cargo. My main requirements for this were:</p>
9668 <ul>
9669 <li>Distros shouldn't have to substantially change their RPM specfiles
9670 (or whatever) to accommodate the Rust build.</li>
9671 <li>I assume that distros will want to make release builds by default.</li>
9672 <li>I as a developer am comfortable with passing extra options to make
9673 debug builds on my machine.</li>
9674 </ul>
9675 <p>The scheme in librsvg lets you run "<code>configure --enable-debug</code>" to
9676 make it call a plain <code>cargo build</code>, or a plain "<code>configure</code>" to make
9677 it use <code>cargo build --release</code> instead. The <code>CFLAGS</code> are passed as
9678 usual through an environment variable. This way, distros don't have
9679 to change their packaging to keep on making release builds as usual.</p>
9680 <p>This goes in <code>configure.ac</code>:</p>
9681 <div class="highlight"><pre><span></span><code>dnl Specify --enable-debug to make a development release. By default,
9682 dnl we build <span class="k">in</span> public release mode.
9683
9684 AC_ARG_ENABLE<span class="o">(</span>debug,
9685 AC_HELP_STRING<span class="o">([</span>--enable-debug<span class="o">]</span>,
9686 <span class="o">[</span>Build Rust code with debugging information <span class="o">[</span><span class="nv">default</span><span class="o">=</span>no<span class="o">]])</span>,
9687 <span class="o">[</span><span class="nv">debug_release</span><span class="o">=</span><span class="nv">$enableval</span><span class="o">]</span>,
9688 <span class="o">[</span><span class="nv">debug_release</span><span class="o">=</span>no<span class="o">])</span>
9689
9690 AC_MSG_CHECKING<span class="o">(</span>whether to build Rust code with debugging information<span class="o">)</span>
9691 <span class="k">if</span> <span class="nb">test</span> <span class="s2">&quot;x</span><span class="nv">$debug_release</span><span class="s2">&quot;</span> <span class="o">=</span> <span class="s2">&quot;xyes&quot;</span> <span class="p">;</span> <span class="k">then</span>
9692 AC_MSG_RESULT<span class="o">(</span>yes<span class="o">)</span>
9693 <span class="nv">RUST_TARGET_SUBDIR</span><span class="o">=</span>debug
9694 <span class="k">else</span>
9695 AC_MSG_RESULT<span class="o">(</span>no<span class="o">)</span>
9696 <span class="nv">RUST_TARGET_SUBDIR</span><span class="o">=</span>release
9697 <span class="k">fi</span>
9698 AM_CONDITIONAL<span class="o">([</span>DEBUG_RELEASE<span class="o">]</span>, <span class="o">[</span><span class="nb">test</span> <span class="s2">&quot;x</span><span class="nv">$debug_release</span><span class="s2">&quot;</span> <span class="o">=</span> <span class="s2">&quot;xyes&quot;</span><span class="o">])</span>
9699
9700 AC_SUBST<span class="o">([</span>RUST_TARGET_SUBDIR<span class="o">])</span>
9701 </code></pre></div>
9702
9703 <p>This defines an Automake conditional called <code>DEBUG_RELEASE</code>, which we
9704 will use in <code>Makefile.am</code> later.</p>
9705 <p>It also causes <code>@RUST_TARGET_SUBDIR@</code> to be substituted in Makefile.am
9706 with either <code>debug</code> or <code>release</code>; we will see what these are about.</p>
9707 <h2>Adding Rust source files</h2>
9708 <p>The <code>librsvg/rust/src</code> directory has all the <code>*.rs</code> files, and cargo
9709 tracks their dependencies and whether they need to be rebuilt if one changes.
9710 However, since that directory is not tracked by <code>make</code>, it won't
9711 rebuild things if a Rust source file changes! So, we need to tell our
9712 <code>Makefile.am</code> about those files:</p>
9713 <div class="highlight"><pre><span></span><code><span class="nv">RUST_SOURCES</span> <span class="o">=</span> <span class="se">\</span>
9714 rust/build.rs <span class="se">\</span>
9715 rust/Cargo.toml <span class="se">\</span>
9716 rust/src/aspect_ratio.rs <span class="se">\</span>
9717 rust/src/bbox.rs <span class="se">\</span>
9718 rust/src/cnode.rs <span class="se">\</span>
9719 rust/src/color.rs <span class="se">\</span>
9720 ...
9721
9722 <span class="nv">RUST_EXTRA</span> <span class="o">=</span> <span class="se">\</span>
9723 rust/Cargo.lock
9724
9725 <span class="nv">EXTRA_DIST</span> <span class="o">+=</span> <span class="k">$(</span>RUST_SOURCES<span class="k">)</span> <span class="k">$(</span>RUST_EXTRA<span class="k">)</span>
9726 </code></pre></div>
9727
9728 <p>It's a bit unfortunate that the change tracking is duplicated in the
9729 <code>Makefile</code>, but we are already used to listing all the C source files
9730 in there, anyway.</p>
9731 <p>Most notably, the <code>rust</code> subdirectory is <em>not</em> listed in the <code>SUBDIRS</code>
9732 in <code>Makefile.am</code>, since there is no <code>rust/Makefile</code> at all!</p>
9733 <h2>Cargo release or debug build?</h2>
9734 <div class="highlight"><pre><span></span><code><span class="cp">if DEBUG_RELEASE</span>
9735 <span class="nv">CARGO_RELEASE_ARGS</span><span class="o">=</span>
9736 <span class="cp">else</span>
9737 <span class="nv">CARGO_RELEASE_ARGS</span><span class="o">=</span>--release
9738 <span class="cp">endif</span>
9739 </code></pre></div>
9740
9741 <p>We will call <code>cargo build</code> with that argument later.</p>
9742 <h2>Verbose or quiet build?</h2>
9743 <p>Librsvg uses <code>AM_SILENT_RULES([yes])</code> in <code>configure.ac</code>. This lets
9744 you just run "<code>make</code>" for a quiet build, or "<code>make V=1</code>" to get the
9745 full command lines passed to the compiler. Cargo supports something
9746 similar, so let's add it to <code>Makefile.am</code>:</p>
9747 <div class="highlight"><pre><span></span><code><span class="nv">CARGO_VERBOSE</span> <span class="o">=</span> <span class="k">$(</span>cargo_verbose_<span class="k">$(</span>V<span class="k">))</span>
9748 <span class="nv">cargo_verbose_</span> <span class="o">=</span> <span class="k">$(</span>cargo_verbose_<span class="k">$(</span>AM_DEFAULT_VERBOSITY<span class="k">))</span>
9749 <span class="nv">cargo_verbose_0</span> <span class="o">=</span>
9750 <span class="nv">cargo_verbose_1</span> <span class="o">=</span> --verbose
9751 </code></pre></div>
9752
9753 <p>This expands the <code>V</code> variable to empty, <code>0</code>, or <code>1</code>. The result of
9754 expanding <em>that</em> gives us the final command-line argument in the
9755 <code>CARGO_VERBOSE</code> variable.</p>
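<p>The two-level variable lookup can be hard to follow in Makefile syntax, so here is the same logic modelled as a small Python function. It is only an illustration of how <code>make</code> resolves the names; the function and its parameter names are made up for this sketch.</p>

```python
def cargo_verbose(v="", am_default_verbosity="0"):
    """Model of how $(CARGO_VERBOSE) is resolved by make.

    v                     -- value of V on the make command line ("" if unset)
    am_default_verbosity  -- AM_DEFAULT_VERBOSITY; "0" with AM_SILENT_RULES([yes])
    """
    # cargo_verbose_0 and cargo_verbose_1 from Makefile.am
    table = {"0": "", "1": "--verbose"}
    if v == "":
        # $(cargo_verbose_) expands to $(cargo_verbose_$(AM_DEFAULT_VERBOSITY))
        v = am_default_verbosity
    # An undefined make variable expands to the empty string.
    return table.get(v, "")


print(repr(cargo_verbose()))         # plain "make"
print(repr(cargo_verbose(v="1")))    # "make V=1"
print(repr(cargo_verbose(v="0")))    # "make V=0"
```

<p>The intermediate <code>cargo_verbose_</code> variable (with nothing after the underscore) is what handles the case where <code>V</code> is not set at all.</p>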
9756 <h2>What's the filename of the library we are building?</h2>
9757 <div class="highlight"><pre><span></span><code><span class="nv">RUST_LIB</span><span class="o">=</span>@abs_top_builddir@/rust/target/@RUST_TARGET_SUBDIR@/librsvg_internals.a
9758 </code></pre></div>
9759
9760 <p>Remember our <code>@RUST_TARGET_SUBDIR@</code> from <code>configure.ac</code>? If you call
9761 plain "<code>cargo build</code>", it will put the binaries in
9762 <code>rust/target/debug</code>. But if you call "<code>cargo build --release</code>", it
9763 will put the binaries in <code>rust/target/release</code>.</p>
9764 <p>With the bit above, the <code>RUST_LIB</code> variable now has the correct path
9765 for the built library. The <code>@abs_top_builddir@</code> makes it work when
9766 the build directory is not the same as the source directory.</p>
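<p>Putting the pieces so far together, the <code>--enable-debug</code> switch decides both the argument passed to <code>cargo build</code> and the directory where the library ends up. A Python model of that relationship, with a made-up function name and an example build directory, might look like this:</p>

```python
import posixpath


def rust_build_settings(abs_top_builddir, enable_debug=False):
    """Model how --enable-debug selects the cargo arguments and the output path."""
    # RUST_TARGET_SUBDIR from configure.ac
    subdir = "debug" if enable_debug else "release"
    # CARGO_RELEASE_ARGS from Makefile.am
    release_args = [] if enable_debug else ["--release"]
    # RUST_LIB from Makefile.am
    rust_lib = posixpath.join(
        abs_top_builddir, "rust", "target", subdir, "librsvg_internals.a"
    )
    return {"cargo_args": ["build"] + release_args, "rust_lib": rust_lib}


# "/tmp/build" is an arbitrary example build directory.
default = rust_build_settings("/tmp/build")
debug = rust_build_settings("/tmp/build", enable_debug=True)
print(default["cargo_args"], default["rust_lib"])
print(debug["cargo_args"], debug["rust_lib"])
```

<p>If the two halves ever disagree, for example a release build while <code>RUST_LIB</code> points at <code>target/debug</code>, the link step would look for a library that was never built.</p>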
9767 <h2>Okay, so how do we call <code>cargo</code>?</h2>
9768 <div class="highlight"><pre><span></span><code><span class="nf">@abs_top_builddir@/rust/target/@RUST_TARGET_SUBDIR@/librsvg_internals.a</span><span class="o">:</span> <span class="k">$(</span><span class="nv">RUST_SOURCES</span><span class="k">)</span>
9769 <span class="nb">cd</span> <span class="k">$(</span>top_srcdir<span class="k">)</span>/rust <span class="o">&amp;&amp;</span> <span class="se">\</span>
9770 <span class="nv">CARGO_TARGET_DIR</span><span class="o">=</span>@abs_top_builddir@/rust/target cargo build <span class="k">$(</span>CARGO_VERBOSE<span class="k">)</span> <span class="k">$(</span>CARGO_RELEASE_ARGS<span class="k">)</span>
9771 </code></pre></div>
9772
9773 <p>We make the funky library filename depend on <code>$(RUST_SOURCES)</code>.
9774 That's what will cause <code>make</code> to rebuild the Rust library if one of
9775 the Rust source files changes.</p>
9776 <p>We override the <code>CARGO_TARGET_DIR</code> with Automake's preference, and
9777 call <code>cargo build</code> with the correct arguments.</p>
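<p>The reason listing <code>$(RUST_SOURCES)</code> as prerequisites works is that <code>make</code> compares modification times. Roughly, and simplifying away everything else <code>make</code> does, the decision looks like the sketch below; <code>needs_rebuild</code> is a name invented for this example.</p>

```python
import os
import tempfile


def needs_rebuild(target, sources):
    """Return True if the target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "bbox.rs")
    target = os.path.join(tmp, "librsvg_internals.a")

    open(source, "w").close()
    missing = needs_rebuild(target, [source])   # library not built yet

    open(target, "w").close()
    os.utime(source, (1000, 1000))              # source older than library
    os.utime(target, (2000, 2000))
    fresh = needs_rebuild(target, [source])

    os.utime(source, (3000, 3000))              # source edited after the build
    stale = needs_rebuild(target, [source])

print(missing, fresh, stale)
```

<p>Once the rule fires, <code>make</code> hands control to <code>cargo</code>, which does its own finer-grained tracking of what actually needs recompiling.</p>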
9778 <h2>Linking into the main C library</h2>
9779 <div class="highlight"><pre><span></span><code><span class="err">librsvg_@RSVG_API_MAJOR_VERSION@</span><span class="nv">_la_LIBADD</span> <span class="o">=</span> <span class="se">\</span>
9780 <span class="k">$(</span>LIBRSVG_LIBS<span class="k">)</span> <span class="se">\</span>
9781 <span class="k">$(</span>LIBM<span class="k">)</span> <span class="se">\</span>
9782 <span class="k">$(</span>RUST_LIB<span class="k">)</span>
9783 </code></pre></div>
9784
9785 <p>This expands our <code>$(RUST_LIB)</code> from above into our linker line, along
9786 with librsvg's other dependencies.</p>
9787 <h2><code>make check</code></h2>
9788 <p>This is our hook so that <code>make check</code> will cause <code>cargo test</code> to run:</p>
9789 <div class="highlight"><pre><span></span><code><span class="nf">check-local</span><span class="o">:</span>
9790 <span class="nb">cd</span> <span class="k">$(</span>srcdir<span class="k">)</span>/rust <span class="o">&amp;&amp;</span> <span class="se">\</span>
9791 <span class="nv">CARGO_TARGET_DIR</span><span class="o">=</span>@abs_top_builddir@/rust/target cargo <span class="nb">test</span>
9792 </code></pre></div>
9793
9794 <h2><code>make clean</code></h2>
9795 <p>Same thing for <code>make clean</code> and <code>cargo clean</code>:</p>
9796 <div class="highlight"><pre><span></span><code><span class="nf">clean-local</span><span class="o">:</span>
9797 <span class="nb">cd</span> <span class="k">$(</span>top_srcdir<span class="k">)</span>/rust <span class="o">&amp;&amp;</span> <span class="se">\</span>
9798 <span class="nv">CARGO_TARGET_DIR</span><span class="o">=</span>@abs_top_builddir@/rust/target cargo clean
9799 </code></pre></div>
9800
9801 <h1>Vendoring dependencies</h1>
9802 <p>Linux distros probably want Rust packages to come bundled with their
9803 dependencies, so that they can replace them later with newer/patched
9804 versions.</p>
9805 <p>Here is a hook so that <code>make dist</code> will cause <code>cargo vendor</code> to be
9806 run before making the tarball. That command will create a
9807 <code>rust/vendor</code> directory with a copy of all the Rust crates that
9808 librsvg depends on.</p>
9809 <div class="highlight"><pre><span></span><code><span class="nv">RUST_EXTRA</span> <span class="o">+=</span> rust/cargo-vendor-config
9810
9811 <span class="nf">dist-hook</span><span class="o">:</span>
9812 <span class="o">(</span><span class="nb">cd</span> <span class="k">$(</span>distdir<span class="k">)</span>/rust <span class="o">&amp;&amp;</span> <span class="se">\</span>
9813 cargo vendor -q <span class="o">&amp;&amp;</span> <span class="se">\</span>
9814 mkdir .cargo <span class="o">&amp;&amp;</span> <span class="se">\</span>
9815 cp cargo-vendor-config .cargo/config<span class="o">)</span>
9816 </code></pre></div>
9817
9818 <p>The tarball needs to have a <code>rust/.cargo/config</code> to know where to find
9819 the vendored sources (i.e. the embedded dependencies), but we don't
9820 want <em>that</em> in our development source tree. Instead, we generate it
9821 from a <a href="https://git.gnome.org/browse/librsvg/tree/rust/cargo-vendor-config?h=2.41.1"><code>rust/cargo-vendor-config</code></a> file in our
9822 source tree:</p>
9823 <div class="highlight"><pre><span></span><code><span class="c1"># This is used after `cargo vendor` is run from `make dist`.</span>
9824 <span class="c1">#</span>
9825 <span class="c1"># In the distributed tarball, this file should end up in</span>
9826 <span class="c1"># rust/.cargo/config</span>
9827
9828 <span class="k">[source.crates-io]</span>
9829 <span class="n">registry</span> <span class="o">=</span> <span class="s">&#39;https://github.com/rust-lang/crates.io-index&#39;</span>
9830 <span class="n">replace-with</span> <span class="o">=</span> <span class="s">&#39;vendored-sources&#39;</span>
9831
9832 <span class="k">[source.vendored-sources]</span>
9833 <span class="n">directory</span> <span class="o">=</span> <span class="s">&#39;./vendor&#39;</span>
9834 </code></pre></div>
9835
9836 <h1>One last thing</h1>
9837 <p>If you put this in your <code>Cargo.toml</code>, release binaries will be a lot
9838 smaller. This turns on link-time optimizations (LTO), which removes
9839 unused functions from the binary.</p>
9840 <div class="highlight"><pre><span></span><code><span class="k">[profile.release]</span>
9841 <span class="n">lto</span> <span class="o">=</span> <span class="kc">true</span>
9842 </code></pre></div>
9843
9844 <h1>Summary and thanks</h1>
9845 <p>I think the above is some good boilerplate that you can put in your
9846 <code>configure.ac</code> / <code>Makefile.am</code> to integrate a Rust sub-library into
9847 your C code. It handles <code>make</code>-y things like <code>make clean</code> and <code>make
9848 check</code>; debug and release builds; verbose and quiet builds;
9849 <code>builddir != srcdir</code>; all the goodies.</p>
9850 <p>I think the only thing I'm missing is to check for the <code>cargo-vendor</code>
9851 binary. I'm not sure how to only check for that if I'm the one making
9852 tarballs... maybe an <code>--enable-maintainer-mode</code> flag?</p>
9853 <p><em>Update 2017/Nov/11:</em> Fixed the initialization of <code>RUST_EXTRA</code>; thanks
9854 to Tobias Mueller for catching this.</p>
9855 <p>This would definitely not have been possible without prior work.
9856 Thanks to everyone who figured out Autotools before me, so I could
9857 cut&amp;paste your goodies:</p>
9858 <ul>
9859 <li><a href="https://www.figuiere.net/hub/blog/?2016/10/07/862-rust-and-automake">Hubert Figuière's "Rust and Automake"</a></li>
9860 <li><a href="http://lukenukem.co.nz/gsoc/2017/05/17/gso_2.html">Luke Nukem's "Autotools and Rust"</a></li>
9861 <li><a href="https://github.com/endlessm/ostree/commit/9169268c31df31cc09495e2a04c30cd251f22b5d">OStree's incantation for <code>cargo vendor</code></a></li>
9862 <li><a href="https://blog.ometer.com/2017/01/10/dear-package-managers-dependency-resolution-results-should-be-in-version-control/">Havoc's "Cargo.lock should be in version control"</a></li>
9863 </ul></content><category term="misc"></category><category term="librsvg"></category><category term="rust"></category><category term="gnome"></category><category term="autotools"></category></entry><entry><title>How Glib-rs works, part 2: Transferring lists and arrays</title><link href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-2.html" rel="alternate"></link><published>2017-08-28T20:26:47-05:00</published><updated>2017-08-28T20:26:47-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-08-28:/~federico/blog/how-glib-rs-works-part-2.html</id><summary type="html"><p>(<a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">First part</a> of the series, with index to all the articles)</p>
9864 <p>In the <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">first part</a>, we saw how <a href="http://gtk-rs.org/docs/glib/">glib-rs</a> provides
9865 the <a href="http://gtk-rs.org/docs/glib/translate/trait.FromGlib.html"><code>FromGlib</code></a> and <a href="http://gtk-rs.org/docs/glib/translate/trait.ToGlib.html"><code>ToGlib</code></a> traits to let Rust
9866 code convert from/to Glib's simple types, for example converting a Glib
9867 <code>gboolean</code> to a Rust <code>bool</code> and vice-versa. We also …</p></summary><content type="html"><p>(<a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">First part</a> of the series, with index to all the articles)</p>
9868 <p>In the <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">first part</a>, we saw how <a href="http://gtk-rs.org/docs/glib/">glib-rs</a> provides
9869 the <a href="http://gtk-rs.org/docs/glib/translate/trait.FromGlib.html"><code>FromGlib</code></a> and <a href="http://gtk-rs.org/docs/glib/translate/trait.ToGlib.html"><code>ToGlib</code></a> traits to let Rust
9870 code convert from/to Glib's simple types, for example converting a Glib
9871 <code>gboolean</code> to a Rust <code>bool</code> and vice-versa. We also saw the special
9872 needs of strings; since they are passed by reference and are not
9873 copied as simple values, we can use
9874 <a href="http://gtk-rs.org/docs/glib/translate/trait.FromGlibPtrNone.html"><code>FromGlibPtrNone</code></a> and
9875 <a href="http://gtk-rs.org/docs/glib/translate/trait.FromGlibPtrFull.html"><code>FromGlibPtrFull</code></a> depending on what kind of
9876 <em>ownership transfer</em> we want, none for "just make it look like we are
9877 using a borrowed reference", or full for "I'll take over the data and
9878 free it when I'm done". Going the other way around, we can use
9879 <a href="http://gtk-rs.org/docs/glib/translate/trait.ToGlibPtr.html"><code>ToGlibPtr</code></a> and its methods to pass things from Rust <em>to</em>
9880 Glib.</p>
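<p>As a reminder of the difference between the two kinds of ownership transfer, here is a minimal sketch that uses only the Rust standard library, not glib-rs. The function names are made up for illustration, and the "full" case assumes the string was allocated with <code>CString::into_raw()</code>; a real Glib string would be released with <code>g_free()</code> instead.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// "Transfer none": copy the C string into a Rust String.
/// The caller still owns `ptr` and remains responsible for freeing it.
unsafe fn string_from_borrowed(ptr: *const c_char) -> String {
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}

/// "Transfer full": copy the contents, then free the original, because
/// ownership was handed to us.
/// Assumption: `ptr` came from CString::into_raw(), so CString::from_raw()
/// is the matching way to release it.
unsafe fn string_from_owned(ptr: *mut c_char) -> String {
    let s = unsafe { string_from_borrowed(ptr) };
    drop(unsafe { CString::from_raw(ptr) });
    s
}

/// Exercises both helpers and returns the two converted strings.
fn example_transfers() -> (String, String) {
    // Borrowed case: `borrowed` stays alive and is dropped normally.
    let borrowed = CString::new("borrowed").unwrap();
    let none = unsafe { string_from_borrowed(borrowed.as_ptr()) };

    // Owned case: we give up the allocation, and the helper frees it.
    let raw = CString::new("owned").unwrap().into_raw();
    let full = unsafe { string_from_owned(raw) };

    (none, full)
}
```

<p>In both cases the result is an independent Rust <code>String</code>; the only difference is who frees the C-side memory afterwards.</p>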
9881 <p>In this part, we'll see the tools that glib-rs provides to do
9882 conversions of more complex data types. We'll look at two cases:</p>
9883 <ul>
9884 <li>
9885 <p><a href="#null-term-string-array">Passing null-terminated arrays of strings</a>
9886 from Glib to Rust</p>
9887 </li>
9888 <li>
9889 <p><a href="#passing-glists">Passing <code>GList</code>s to Rust</a></p>
9890 </li>
9891 </ul>
9892 <p>And one final case just in passing:</p>
9893 <ul>
9894 <li><a href="#passing-containers-to-glib">Passing containers from Rust to Glib</a></li>
9895 </ul>
9896 <h1>Passing arrays from Glib to Rust</h1>
9897 <p>We'll look at the case for transferring null-terminated arrays of
9898 strings, since it's an interesting one. There are other traits to
9899 convert from Glib arrays whose length is known, not implied with a
9900 NULL element, but for now we'll only look at arrays of strings.</p>
9901 <h2 id="null-term-string-array">Null-terminated arrays of strings</h2>
9902 <p>Look at this function for <code>GtkAboutDialog</code>:</p>
9903 <div class="highlight"><pre><span></span><code><span class="cm">/**</span>
9904 <span class="cm"> * gtk_about_dialog_add_credit_section:</span>
9905 <span class="cm"> * @about: A #GtkAboutDialog</span>
9906 <span class="cm"> * @section_name: The name of the section</span>
9907 <span class="cm"> * @people: (array zero-terminated=1): The people who belong to that section</span>
9908 <span class="cm"> * ...</span>
9909 <span class="cm"> */</span>
9910 <span class="kt">void</span>
9911 <span class="n">gtk_about_dialog_add_credit_section</span> <span class="p">(</span><span class="n">GtkAboutDialog</span> <span class="o">*</span><span class="n">about</span><span class="p">,</span>
9912 <span class="k">const</span> <span class="n">gchar</span> <span class="o">*</span><span class="n">section_name</span><span class="p">,</span>
9913 <span class="k">const</span> <span class="n">gchar</span> <span class="o">**</span><span class="n">people</span><span class="p">)</span>
9914 </code></pre></div>
9915
9916 <p>You would use this like</p>
9917 <div class="highlight"><pre><span></span><code><span class="k">const</span> <span class="n">gchar</span> <span class="o">*</span><span class="n">translators</span><span class="p">[]</span> <span class="o">=</span> <span class="p">{</span>
9918 <span class="s">&quot;Alice &lt;alice@example.com&gt;&quot;</span><span class="p">,</span>
9919 <span class="s">&quot;Bob &lt;bob@example.com&gt;&quot;</span><span class="p">,</span>
9920 <span class="s">&quot;Clara &lt;clara@example.com&gt;&quot;</span><span class="p">,</span>
9921 <span class="nb">NULL</span>
9922 <span class="p">};</span>
9923
9924 <span class="n">gtk_about_dialog_add_credit_section</span> <span class="p">(</span><span class="n">my_about_dialog</span><span class="p">,</span> <span class="n">_</span><span class="p">(</span><span class="s">&quot;Translators&quot;</span><span class="p">),</span> <span class="n">translators</span><span class="p">);</span>
9925 </code></pre></div>
9926
9927 <p>The function expects an array of <code>gchar *</code>, where the last element is
9928 a NULL. Instead of passing an explicit length for the array, it's
9929 done implicitly by requiring a NULL pointer after the last element.
9930 The gtk-doc annotation says <code>(array zero-terminated=1)</code>. When we
9931 generate information for the GObject-Introspection Repository (GIR),
9932 this is what comes out:</p>
9933 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
9934 <span class="normal"> 2</span>
9935 <span class="normal"> 3</span>
9936 <span class="normal"> 4</span>
9937 <span class="normal"> 5</span>
9938 <span class="normal"> 6</span>
9939 <span class="normal"> 7</span>
9940 <span class="normal"> 8</span>
9941 <span class="normal"> 9</span>
9942 <span class="normal">10</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="nt">&lt;method</span> <span class="na">name=</span><span class="s">&quot;add_credit_section&quot;</span>
9943 <span class="na">c:identifier=</span><span class="s">&quot;gtk_about_dialog_add_credit_section&quot;</span>
9944 <span class="na">version=</span><span class="s">&quot;3.4&quot;</span><span class="nt">&gt;</span>
9945 ..
9946 <span class="nt">&lt;parameter</span> <span class="na">name=</span><span class="s">&quot;people&quot;</span> <span class="na">transfer-ownership=</span><span class="s">&quot;none&quot;</span><span class="nt">&gt;</span>
9947 <span class="nt">&lt;doc</span> <span class="na">xml:space=</span><span class="s">&quot;preserve&quot;</span><span class="nt">&gt;</span>The people who belong to that section<span class="nt">&lt;/doc&gt;</span>
9948 <span class="nt">&lt;array</span> <span class="na">c:type=</span><span class="s">&quot;gchar**&quot;</span><span class="nt">&gt;</span>
9949 <span class="nt">&lt;type</span> <span class="na">name=</span><span class="s">&quot;utf8&quot;</span> <span class="na">c:type=</span><span class="s">&quot;gchar*&quot;</span><span class="nt">/&gt;</span>
9950 <span class="nt">&lt;/array&gt;</span>
9951 <span class="nt">&lt;/parameter&gt;</span>
9952 </code></pre></div>
9953 </td></tr></table>
9954 <p>You can see the <code>transfer-ownership="none"</code> in line 5. This means
9955 that the function will not take ownership of the passed array; it will
9956 make its own copy instead. By convention, GIR assumes that arrays of
9957 strings are NULL-terminated, so there is no special annotation for
9958 that here. If we were implementing this function in Rust, how would we
9959 read that C array of UTF-8 strings and turn it into a Rust
9960 <code>Vec&lt;String&gt;</code> or something? Easy:</p>
9961 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">c_char_array</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">c_char</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w"> </span><span class="c1">// comes from Glib</span>
9962 <span class="kd">let</span><span class="w"> </span><span class="n">rust_translators</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">FromGlibPtrContainer</span>::<span class="n">from_glib_none</span><span class="p">(</span><span class="n">c_char_array</span><span class="p">);</span><span class="w"></span>
9963 <span class="c1">// rust_translators is a Vec&lt;String&gt;</span>
9964 </code></pre></div>
9965
9966 <p>Let's look at how this bad boy is implemented.</p>
9967 <h3>First stage: <code>impl FromGlibPtrContainer for Vec&lt;T&gt;</code></h3>
9968 <p>We want to go from a "<code>*mut *mut c_char</code>" (in C parlance, a "<code>gchar **</code>")
9969 to a <code>Vec&lt;String&gt;</code>. Indeed, there is an implementation of the
9970 <code>FromGlibPtrContainer</code> trait for <code>Vec</code>s
9971 <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L1136">here</a>. These are the first few lines:</p>
9972 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="o">&lt;</span><span class="n">P</span>: <span class="nc">Ptr</span><span class="p">,</span><span class="w"> </span><span class="n">PP</span>: <span class="nc">Ptr</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="nc">FromGlibPtrArrayContainerAsVec</span><span class="o">&lt;</span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="n">PP</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="n">FromGlibPtrContainer</span><span class="o">&lt;</span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="n">PP</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
9973 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none</span><span class="p">(</span><span class="n">ptr</span>: <span class="nc">PP</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
9974 <span class="w"> </span><span class="n">FromGlibPtrArrayContainerAsVec</span>::<span class="n">from_glib_none_as_vec</span><span class="p">(</span><span class="n">ptr</span><span class="p">)</span><span class="w"></span>
9975 <span class="w"> </span><span class="p">}</span><span class="w"></span>
9976 </code></pre></div>
9977
9978 <p>So... that <code>from_glib_none()</code> will return a <code>Vec&lt;T&gt;</code>, which is what we
9979 want. Let's look at the first few lines of <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L1087"><code>FromGlibPtrArrayContainerAsVec</code></a>:</p>
9980 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
9981 <span class="normal">2</span>
9982 <span class="normal">3</span>
9983 <span class="normal">4</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">FromGlibPtrArrayContainerAsVec</span><span class="o">&lt;</span><span class="cp">$ffi_name</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="cp">$name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
9984 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none_as_vec</span><span class="p">(</span><span class="n">ptr</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
9985 <span class="w"> </span><span class="n">FromGlibContainerAsVec</span>::<span class="n">from_glib_none_num_as_vec</span><span class="p">(</span><span class="n">ptr</span><span class="p">,</span><span class="w"> </span><span class="n">c_ptr_array_len</span><span class="p">(</span><span class="n">ptr</span><span class="p">))</span><span class="w"></span>
9986 <span class="w"> </span><span class="p">}</span><span class="w"></span>
9987 </code></pre></div>
9988 </td></tr></table>
9989 <p>Aha! This is inside a <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L1117">macro</a>, thus the <code>$ffi_name</code> garbage.
9990 It's done like that so the same trait can be implemented for <code>const</code> and
9991 <code>mut</code> pointers to <code>c_char</code>.</p>
9992 <p>See the call to <code>c_ptr_array_len()</code> in line 3? That's what finds
9993 the NULL pointer at the end of the array, and from that it works out
9994 the array's length.</p>
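<p>To make the idea concrete, here is a rough sketch of how such a length computation could look. This is not the glib-rs source; <code>count_until_null</code> is a hypothetical stand-in that simply walks the array until it hits the NULL terminator.</p>

```rust
use std::os::raw::c_char;

/// Hypothetical helper: counts the entries that come before the first
/// NULL pointer in a NULL-terminated array of C strings.
/// A NULL array is treated as having length zero.
unsafe fn count_until_null(mut ptr: *const *const c_char) -> usize {
    let mut len = 0;
    if !ptr.is_null() {
        // Stop at the terminating NULL element.
        while !unsafe { *ptr }.is_null() {
            len += 1;
            ptr = unsafe { ptr.add(1) };
        }
    }
    len
}

/// Builds a small NULL-terminated array and measures it.
fn example_len() -> usize {
    let alice = b"Alice\0";
    let bob = b"Bob\0";
    let array: [*const c_char; 3] = [
        alice.as_ptr() as *const c_char,
        bob.as_ptr() as *const c_char,
        std::ptr::null(),
    ];
    unsafe { count_until_null(array.as_ptr()) }
}
```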
9995 <h3>Second stage: <code>impl FromGlibContainerAsVec::from_glib_none_num_as_vec()</code></h3>
9996 <p>Now that the length of the array is known, the implementation calls
9997 <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L1038"><code>FromGlibContainerAsVec::from_glib_none_num_as_vec()</code></a>:</p>
9998 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
9999 <span class="normal"> 2</span>
10000 <span class="normal"> 3</span>
10001 <span class="normal"> 4</span>
10002 <span class="normal"> 5</span>
10003 <span class="normal"> 6</span>
10004 <span class="normal"> 7</span>
10005 <span class="normal"> 8</span>
10006 <span class="normal"> 9</span>
10007 <span class="normal">10</span>
10008 <span class="normal">11</span>
10009 <span class="normal">12</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">FromGlibContainerAsVec</span><span class="o">&lt;</span><span class="cp">$ffi_name</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="cp">$name</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10010 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none_num_as_vec</span><span class="p">(</span><span class="n">ptr</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="p">,</span><span class="w"> </span><span class="n">num</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10011 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">num</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">ptr</span><span class="p">.</span><span class="n">is_null</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10012 <span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">new</span><span class="p">();</span><span class="w"></span>
10013 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10014
10015 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">with_capacity</span><span class="p">(</span><span class="n">num</span><span class="p">);</span><span class="w"></span>
10016 <span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="mi">0</span><span class="o">..</span><span class="n">num</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10017 <span class="w"> </span><span class="n">res</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">from_glib_none</span><span class="p">(</span><span class="n">ptr</span>::<span class="n">read</span><span class="p">(</span><span class="n">ptr</span><span class="p">.</span><span class="n">offset</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">isize</span><span class="p">))</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="cp">$ffi_name</span><span class="p">));</span><span class="w"></span>
10018 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10019 <span class="w"> </span><span class="n">res</span><span class="w"></span>
10020 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10021 </code></pre></div>
10022 </td></tr></table>
10023 <p>Lines 3/4: If the number of elements is zero, or the array is NULL,
10024 return an empty <code>Vec</code>.</p>
10025 <p>Line 7: Allocate a <code>Vec</code> of suitable size.</p>
10026 <p>Lines 8/9: For each of the pointers in the C array, call
10027 <code>from_glib_none()</code> to convert it from a <code>*const c_char</code> to a <code>String</code>,
10028 like we saw in <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">the first part</a>.</p>
10029 <p>Done! We started with a <code>*mut *mut c_char</code> or a <code>*const *const
10030 c_char</code> and ended up with a <code>Vec&lt;String&gt;</code>, which is what we wanted.</p>
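<p>If you want to play with the same two-stage shape without pulling in glib-rs, the following standalone sketch mirrors it using only the standard library. The names <code>strings_from_array</code> and <code>example_translators</code> are invented for this example, and <code>CStr</code> stands in for the per-string <code>from_glib_none()</code> conversion.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Second stage: given the array and its already-known length, copy each
/// C string into an owned Rust String ("transfer none": nothing is freed).
unsafe fn strings_from_array(ptr: *const *const c_char, num: usize) -> Vec<String> {
    if num == 0 || ptr.is_null() {
        return Vec::new();
    }

    let mut res = Vec::with_capacity(num);
    for i in 0..num {
        let item = unsafe { *ptr.add(i) };
        res.push(unsafe { CStr::from_ptr(item) }.to_string_lossy().into_owned());
    }
    res
}

/// Builds a NULL-terminated array like the C `translators[]` example,
/// then converts it to a Vec<String>.
fn example_translators() -> Vec<String> {
    let names = [
        CString::new("Alice <alice@example.com>").unwrap(),
        CString::new("Bob <bob@example.com>").unwrap(),
    ];
    let mut array: Vec<*const c_char> = names.iter().map(|s| s.as_ptr()).collect();
    array.push(std::ptr::null());

    // First stage: the length is the number of entries before the NULL.
    let num = array.iter().take_while(|p| !p.is_null()).count();

    unsafe { strings_from_array(array.as_ptr(), num) }
}
```

<p>The returned vector owns its strings, so it stays valid after the C-side array goes away.</p>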
10031 <h1 id="passing-glists">Passing <code>GList</code>s to Rust</h1>
10032 <p>Some functions don't give you an array; they give you a <code>GList</code> or
10033 <code>GSList</code>. There is an implementation of
10034 <code>FromGlibPtrArrayContainerAsVec</code> <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L1254">that understands
10035 <code>GList</code></a>:</p>
10036 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">FromGlibPtrArrayContainerAsVec</span><span class="o">&lt;&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">GlibPtrDefault</span><span class="o">&gt;</span>::<span class="n">GlibType</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">GList</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"></span>
10037 <span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">GlibPtrDefault</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">FromGlibPtrNone</span><span class="o">&lt;&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">GlibPtrDefault</span><span class="o">&gt;</span>::<span class="n">GlibType</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">FromGlibPtrFull</span><span class="o">&lt;&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">GlibPtrDefault</span><span class="o">&gt;</span>::<span class="n">GlibType</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10038
10039 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none_as_vec</span><span class="p">(</span><span class="n">ptr</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">GList</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10040 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">num</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">g_list_length</span><span class="p">(</span><span class="n">ptr</span><span class="p">)</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="p">;</span><span class="w"></span>
10041 <span class="w"> </span><span class="n">FromGlibContainer</span>::<span class="n">from_glib_none_num</span><span class="p">(</span><span class="n">ptr</span><span class="p">,</span><span class="w"> </span><span class="n">num</span><span class="p">)</span><span class="w"></span>
10042 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10043 </code></pre></div>
10044
10045 <p>The <code>impl</code> declaration is pretty horrible, so just look at the
10046 method: <code>from_glib_none_as_vec()</code> takes in a <code>GList</code>, then calls
10047 <code>g_list_length()</code> on it, and finally calls
10048 <code>FromGlibContainer::from_glib_none_num()</code> with the length it computed.</p>
10049 <h3>I have a Glib container and its length</h3>
10050 <p>In turn, that <code>from_glib_none_num()</code> goes <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L1122">here</a>:</p>
10051 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="o">&lt;</span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="n">PP</span>: <span class="nc">Ptr</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="nc">FromGlibContainerAsVec</span><span class="o">&lt;</span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="n">PP</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="n">FromGlibContainer</span><span class="o">&lt;</span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="n">PP</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10052 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none_num</span><span class="p">(</span><span class="n">ptr</span>: <span class="nc">PP</span><span class="p">,</span><span class="w"> </span><span class="n">num</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10053 <span class="w"> </span><span class="n">FromGlibContainerAsVec</span>::<span class="n">from_glib_none_num_as_vec</span><span class="p">(</span><span class="n">ptr</span><span class="p">,</span><span class="w"> </span><span class="n">num</span><span class="p">)</span><span class="w"></span>
10054 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10055 </code></pre></div>
10056
10057 <p>Okay, getting closer to the actual implementation.</p>
10058 <h3>Give me a vector already</h3>
10059 <p>Finally, we get to the function that <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L1211">walks the <code>GList</code></a>:</p>
10060 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
10061 <span class="normal"> 2</span>
10062 <span class="normal"> 3</span>
10063 <span class="normal"> 4</span>
10064 <span class="normal"> 5</span>
10065 <span class="normal"> 6</span>
10066 <span class="normal"> 7</span>
10067 <span class="normal"> 8</span>
10068 <span class="normal"> 9</span>
10069 <span class="normal">10</span>
10070 <span class="normal">11</span>
10071 <span class="normal">12</span>
10072 <span class="normal">13</span>
10073 <span class="normal">14</span>
10074 <span class="normal">15</span>
10075 <span class="normal">16</span>
10076 <span class="normal">17</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">FromGlibContainerAsVec</span><span class="o">&lt;&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">GlibPtrDefault</span><span class="o">&gt;</span>::<span class="n">GlibType</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">GList</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"></span>
10077 <span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">GlibPtrDefault</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">FromGlibPtrNone</span><span class="o">&lt;&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">GlibPtrDefault</span><span class="o">&gt;</span>::<span class="n">GlibType</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">FromGlibPtrFull</span><span class="o">&lt;&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">GlibPtrDefault</span><span class="o">&gt;</span>::<span class="n">GlibType</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10078
10079 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none_num_as_vec</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">ptr</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">GList</span><span class="p">,</span><span class="w"> </span><span class="n">num</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10080 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">num</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">ptr</span><span class="p">.</span><span class="n">is_null</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10081 <span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">new</span><span class="p">()</span><span class="w"></span>
10082 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10083 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">with_capacity</span><span class="p">(</span><span class="n">num</span><span class="p">);</span><span class="w"></span>
10084 <span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="mi">0</span><span class="o">..</span><span class="n">num</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10085 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">item_ptr</span>: <span class="o">&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">GlibPtrDefault</span><span class="o">&gt;</span>::<span class="n">GlibType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Ptr</span>::<span class="n">from</span><span class="p">((</span><span class="o">*</span><span class="n">ptr</span><span class="p">).</span><span class="n">data</span><span class="p">);</span><span class="w"></span>
10086 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="o">!</span><span class="n">item_ptr</span><span class="p">.</span><span class="n">is_null</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10087 <span class="w"> </span><span class="n">res</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">from_glib_none</span><span class="p">(</span><span class="n">item_ptr</span><span class="p">));</span><span class="w"></span>
10088 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10089 <span class="w"> </span><span class="n">ptr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="o">*</span><span class="n">ptr</span><span class="p">).</span><span class="n">next</span><span class="p">;</span><span class="w"></span>
10090 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10091 <span class="w"> </span><span class="n">res</span><span class="w"></span>
10092 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10093 </code></pre></div>
10094 </td></tr></table>
10095 <p>Again, ignore the horrible <code>impl</code> declaration and just look at
10096 <code>from_glib_none_num_as_vec()</code>.</p>
10097 <p>Line 4: that function takes in a <code>ptr</code> to a <code>GList</code>, and a <code>num</code> with
10098 the list's length, which we already computed above.</p>
10099 <p>Line 5: Return an empty vector if we have an empty list.</p>
10100 <p>Line 8: Allocate a vector of suitable capacity.</p>
10101 <p>Line 9: For each element, convert it with <code>from_glib_none()</code> and push
10102 it onto the vector, skipping elements whose data pointer is null.</p>
10103 <p>Line 14: Walk to the next element in the list.</p>
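<p>To make the walk concrete without pulling in glib-rs, here is a minimal, self-contained sketch of the same loop. Everything in it is an assumption for illustration: <code>FakeGList</code> is a stand-in for the C <code>GList</code> node, the elements are assumed to be C strings, and the extra null check inside the loop is my own defensive addition, not something the real implementation has.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_void};
use std::ptr;

/// Hypothetical stand-in for C's `GList` node; only the fields the walk needs.
#[repr(C)]
pub struct FakeGList {
    pub data: *mut c_void,
    pub next: *mut FakeGList,
}

/// Copy up to `num` strings out of the list without taking ownership
/// of the list or of the strings, i.e. (transfer none).
///
/// # Safety
/// `ptr` must be null or point to a valid chain of nodes, and every
/// non-null `data` must point to a NUL-terminated C string.
pub unsafe fn strings_from_list_none(mut ptr: *const FakeGList, num: usize) -> Vec<String> {
    if num == 0 || ptr.is_null() {
        return Vec::new();
    }
    let mut res = Vec::with_capacity(num);
    for _ in 0..num {
        // Defensive addition in this sketch: stop if the list is shorter than `num`.
        if ptr.is_null() {
            break;
        }
        let node = unsafe { &*ptr };
        let item = node.data as *const c_char;
        if !item.is_null() {
            // Copy the element into an owned Rust String.
            let s = unsafe { CStr::from_ptr(item) };
            res.push(s.to_string_lossy().into_owned());
        }
        // Walk to the next element in the list.
        ptr = node.next;
    }
    res
}

/// Build a two-element list on the stack and convert it.
pub fn demo_list() -> Vec<String> {
    let a = CString::new("one").unwrap();
    let b = CString::new("two").unwrap();
    let mut n2 = FakeGList { data: b.as_ptr() as *mut c_void, next: ptr::null_mut() };
    let n1 = FakeGList { data: a.as_ptr() as *mut c_void, next: &mut n2 };
    unsafe { strings_from_list_none(&n1, 2) }
}
```

<p>The list and its strings stay owned by the caller; the function only copies out of them, which is what "none" means here.</p>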
10104 <h1 id="passing-containers-to-glib">Passing containers from Rust to Glib</h1>
10105 <p>This post is getting a bit long, so I'll just mention this briefly.
10106 There is a trait <code>ToGlibContainerFromSlice</code> that takes a Rust slice,
10107 and can convert it to various Glib types.</p>
10108 <ul>
10109 <li>
10110 <p>To <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L616"><code>GSList</code></a> and <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L566"><code>GList</code></a>. These have
10111 methods like <code>to_glib_none_from_slice()</code> and
10112 <code>to_glib_full_from_slice()</code>.</p>
10113 </li>
10114 <li>
10115 <p>To an <a href="https://github.com/gtk-rs/glib/blob/e46aa7f27cc74f3cdcb54d94feb4a8861df21c7f/src/translate.rs#L471">array of fundamental types</a>. Here, you can choose
10116 among three functions. <code>to_glib_none_from_slice()</code> gives you a <code>Stash</code> like the one
10117 we saw <a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">the last time</a>. Alternatively,
10118 <code>to_glib_full_from_slice()</code> gives you back a <code>g_malloc()</code>ed
10119 array with copied items. Finally, <code>to_glib_container_from_slice()</code>
10120 gives you back a <code>g_malloc()</code>ed array of <em>pointers</em> to values rather
10121 than plain values themselves. Which function you choose depends on
10122 which C API you want to call.</p>
10123 </li>
10124 </ul>
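<p>The difference between lending an array and handing over a copy can be sketched in plain Rust. This is only an illustration under stated assumptions: the function names are hypothetical, a boxed slice stands in for <code>g_malloc()</code>ed memory, and <code>fake_free_array()</code> stands in for the <code>g_free()</code> that the C side would eventually call. The real glib-rs trait methods have different signatures.</p>

```rust
use std::ptr;
use std::slice;

/// (transfer none), sketched: lend a pointer to the slice's own storage.
/// The pointer is only valid while the slice is alive.
pub fn array_none_from_slice(items: &[u32]) -> (*const u32, usize) {
    (items.as_ptr(), items.len())
}

/// (transfer full), sketched: hand out a heap copy that the receiver must free.
/// A boxed slice stands in for `g_malloc()`ed memory here.
pub fn array_full_from_slice(items: &[u32]) -> *mut u32 {
    Box::into_raw(items.to_vec().into_boxed_slice()) as *mut u32
}

/// Hypothetical stand-in for the `g_free()` the receiver would call.
///
/// # Safety
/// `ptr` and `len` must come from `array_full_from_slice()` and must
/// not have been freed already.
pub unsafe fn fake_free_array(ptr: *mut u32, len: usize) {
    drop(unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(ptr, len)) });
}

/// Returns whether the two pointers behaved as expected, plus the copied items.
pub fn demo_array() -> (bool, Vec<u32>) {
    let items = [1u32, 2, 3];

    // Borrowed view: same address as the Rust slice.
    let (borrowed, len) = array_none_from_slice(&items);
    let same = borrowed == items.as_ptr();

    // Owned copy: different address, same contents.
    let owned = array_full_from_slice(&items);
    let distinct = owned as *const u32 != borrowed;
    let copy = unsafe { slice::from_raw_parts(owned, len).to_vec() };

    // The receiver is responsible for freeing the copy.
    unsafe { fake_free_array(owned, len) };
    (same && distinct, copy)
}
```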
10125 <p>I hope this post gives you enough practice to be able to "follow the
10126 traits" for each of those if you want to look at the implementations.</p>
10127 <h1>Next up</h1>
10128 <p>Passing boxed types, like public structs.</p>
10129 <p>Passing reference-counted types.</p>
10130 <p>How glib-rs wraps GObjects.</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>How Glib-rs works, part 1: Type conversions</title><link href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html" rel="alternate"></link><published>2017-08-25T16:07:47-05:00</published><updated>2017-08-25T16:07:47-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-08-25:/~federico/blog/how-glib-rs-works-part-1.html</id><summary type="html"><ul>
10131 <li><a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">How Glib-rs works, part 1: Type conversions</a></li>
10132 <li><a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-2.html">How Glib-rs works, part 2: Transferring lists and arrays</a></li>
10133 <li><a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-3.html">How Glib-rs works, part 3: Boxed types</a></li>
10134 </ul>
10135 <p>During the <a href="https://wiki.gnome.org/Hackfests/Rust2017">GNOME+Rust hackfest in Mexico City</a>, <a href="http://smallcultfollowing.com/babysteps/">Niko Matsakis</a>
10136 started the implementation of <a href="http://smallcultfollowing.com/babysteps/blog/2017/05/02/gnome-class-integrating-rust-and-the-gnome-object-system/">gnome-class</a>, a procedural macro
10137 that will let people implement new GObject classes in …</p></summary><content type="html"><ul>
10138 <li><a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-1.html">How Glib-rs works, part 1: Type conversions</a></li>
10139 <li><a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-2.html">How Glib-rs works, part 2: Transferring lists and arrays</a></li>
10140 <li><a href="https://people.gnome.org/~federico/blog/how-glib-rs-works-part-3.html">How Glib-rs works, part 3: Boxed types</a></li>
10141 </ul>
10142 <p>During the <a href="https://wiki.gnome.org/Hackfests/Rust2017">GNOME+Rust hackfest in Mexico City</a>, <a href="http://smallcultfollowing.com/babysteps/">Niko Matsakis</a>
10143 started the implementation of <a href="http://smallcultfollowing.com/babysteps/blog/2017/05/02/gnome-class-integrating-rust-and-the-gnome-object-system/">gnome-class</a>, a procedural macro
10144 that will let people implement new GObject classes in Rust and export
10145 them to the world. Currently, if you want to write a new GObject
10146 (e.g. a new widget) and put it in a library so that it can be used
10147 from language bindings via GObject-Introspection, you have to do it in
10148 C. It would be nice to be able to do this in a safe language like
10149 Rust.</p>
10150 <h1>How would it be done by hand?</h1>
10151 <p>In a C implementation of a new GObject subclass, one calls things like
10152 <code>g_type_register_static()</code> and <code>g_signal_new()</code> by hand, while being
10153 careful to specify the correct <code>GType</code> for each value, and being
10154 super-careful about everything, as C demands.</p>
10155 <p>In Rust, one <em>can</em> in fact do exactly the same thing. You can call the
10156 same, low-level GObject and GType functions. You can use
10157 <code>#[repr(C)]</code> for the instance and class structs that GObject will
10158 allocate for you, and which you then fill in.</p>
10159 <p>You can see an example of this in gst-plugins-rs. This is where it implements a <code>Sink</code>
10160 GObject, in Rust, by calling Glib functions by
10161 hand: <a href="https://github.com/sdroege/gst-plugin-rs/blob/782fe5dcc93dbd1c5501821659c8ae1e4331e494/gst-plugin/src/sink.rs#L297">struct declarations</a>, <a href="https://github.com/sdroege/gst-plugin-rs/blob/782fe5dcc93dbd1c5501821659c8ae1e4331e494/gst-plugin/src/sink.rs#L356"><code>class_init()</code> function</a>,
10162 <a href="https://github.com/sdroege/gst-plugin-rs/blob/782fe5dcc93dbd1c5501821659c8ae1e4331e494/gst-plugin/src/sink.rs#L479">registration of type and interfaces</a>.</p>
10163 <h1>How would it be done by a machine?</h1>
10164 <p>That's what Niko's gnome-class is about. During the hackfest it
10165 got to the point of being able to generate the code to create a new
10166 GObject subclass, register it, and export functions for methods. The
10167 syntax is not finalized yet, but it looks something like this:</p>
10168 <div class="highlight"><pre><span></span><code><span class="n">gobject_gen</span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10169 <span class="w"> </span><span class="n">class</span><span class="w"> </span><span class="n">Counter</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10170 <span class="w"> </span><span class="k">struct</span> <span class="nc">CounterPrivate</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10171 <span class="w"> </span><span class="n">val</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"></span>
10172 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10173
10174 <span class="w"> </span><span class="n">signal</span><span class="w"> </span><span class="n">value_changed</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w"></span>
10175
10176 <span class="w"> </span><span class="k">fn</span> <span class="nf">set_value</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">v</span>: <span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10177 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">private</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">private</span><span class="p">();</span><span class="w"></span>
10178 <span class="w"> </span><span class="n">private</span><span class="p">.</span><span class="n">val</span><span class="p">.</span><span class="n">set</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w"></span>
10179 <span class="w"> </span><span class="c1">// private.emit_value_changed();</span>
10180 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10181
10182 <span class="w"> </span><span class="k">fn</span> <span class="nf">get_value</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w"></span>
10183 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">private</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">private</span><span class="p">();</span><span class="w"></span>
10184 <span class="w"> </span><span class="n">private</span><span class="p">.</span><span class="n">val</span><span class="p">.</span><span class="n">get</span><span class="p">()</span><span class="w"></span>
10185 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10186 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10187 <span class="p">}</span><span class="w"></span>
10188 </code></pre></div>
10189
10190 <p>I started adding support for declaring GObject signals — mainly being
10191 able to parse them from what goes inside <code>gobject_gen!()</code> — and then
10192 being able to call <code>g_signal_newv()</code> at the appropriate time during
10193 the <code>class_init()</code> implementation.</p>
10194 <h1>Types in signals</h1>
10195 <p>Creating a signal for a GObject class is basically like specifying a
10196 function prototype: the object will invoke a callback function with
10197 certain arguments and return value when the signal is emitted. For
10198 example, this is how <code>GtkWidget</code> registers its <a href="https://git.gnome.org/browse/gtk+/tree/gtk/gtkwidget.c?id=e26b60d48cc29a319b3ccc6c92157d9da1f9ceba#n2000"><code>button-press-event</code>
10199 signal</a>:</p>
10200 <div class="highlight"><pre><span></span><code> <span class="n">button_press_event_id</span> <span class="o">=</span>
10201 <span class="n">g_signal_new</span> <span class="p">(</span><span class="n">I_</span><span class="p">(</span><span class="s">&quot;button-press-event&quot;</span><span class="p">),</span>
10202 <span class="p">...</span>
10203 <span class="n">G_TYPE_BOOLEAN</span><span class="p">,</span> <span class="cm">/* type of return value */</span>
10204 <span class="mi">1</span><span class="p">,</span> <span class="cm">/* how many arguments? */</span>
10205 <span class="n">GDK_TYPE_EVENT</span><span class="p">);</span> <span class="cm">/* type of first and only argument */</span>
10206 </code></pre></div>
10207
10208 <p><code>g_signal_new()</code> creates the signal and returns a <em>signal id</em>, an
10209 integer. Later, when the object wants to emit the signal, it uses
10210 that signal id like this:</p>
10211 <div class="highlight"><pre><span></span><code><span class="n">GdkEventButton</span> <span class="o">*</span><span class="n">event</span> <span class="o">=</span> <span class="p">...;</span>
10212 <span class="n">gboolean</span> <span class="n">return_val</span><span class="p">;</span>
10213
10214 <span class="n">g_signal_emit</span> <span class="p">(</span><span class="n">widget</span><span class="p">,</span> <span class="n">button_press_event_id</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">event</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">return_val</span><span class="p">);</span>
10215 </code></pre></div>
10216
10217 <p>In the nice <code>gobject_gen!()</code> macro, if I am going to have a signal
10218 declaration like</p>
10219 <div class="highlight"><pre><span></span><code><span class="n">signal</span><span class="w"> </span><span class="n">button_press_event</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">event</span>: <span class="kp">&amp;</span><span class="nc">ButtonPressEvent</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">;</span><span class="w"></span>
10220 </code></pre></div>
10221
10222 <p>then I will need to be able to translate the type names for
10223 <code>ButtonPressEvent</code> and <code>bool</code> into something that <code>g_signal_newv()</code> will
10224 understand: I need the <a href="https://developer.gnome.org/gobject/stable/gobject-Type-Information.html#GType">GType</a> values for those. Fundamental
10225 types like <code>gboolean</code> get constants like <a href="https://developer.gnome.org/gobject/stable/gobject-Type-Information.html#G-TYPE-BOOLEAN:CAPS">G_TYPE_BOOLEAN</a>. Types
10226 that are defined at runtime, like <code>GDK_TYPE_EVENT</code>, get GType values
10227 generated at runtime, too, when one registers the type with
10228 <code>g_type_register_*()</code>.</p>
10229 <table>
10230 <thead>
10231 <tr>
10232 <th align="left">Rust type</th>
10233 <th align="left">GType</th>
10234 </tr>
10235 </thead>
10236 <tbody>
10237 <tr>
10238 <td align="left">i32</td>
10239 <td align="left">G_TYPE_INT</td>
10240 </tr>
10241 <tr>
10242 <td align="left">u32</td>
10243 <td align="left">G_TYPE_UINT</td>
10244 </tr>
10245 <tr>
10246 <td align="left">bool</td>
10247 <td align="left">G_TYPE_BOOLEAN</td>
10248 </tr>
10249 <tr>
10250 <td align="left">etc.</td>
10251 <td align="left">etc.</td>
10252 </tr>
10253 </tbody>
10254 </table>
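<p>One way to express that table in Rust is a trait with one implementation per type. The sketch below is an assumption for illustration, not the glib-rs API: <code>HasGType</code> and <code>signal_types()</code> are hypothetical names, and the numeric values follow the fundamental type numbering in GLib's <code>gtype.h</code> as I remember it. Real code should take the constants from the FFI bindings instead of hard-coding them.</p>

```rust
/// Stand-in for C's `GType`, which is an integer the size of a pointer.
pub type GType = usize;

/// Fundamental types are small numbers shifted left by two bits.
const G_TYPE_FUNDAMENTAL_SHIFT: u32 = 2;

const fn make_fundamental(n: usize) -> GType {
    n << G_TYPE_FUNDAMENTAL_SHIFT
}

// Values assumed from GLib's gtype.h; check them against the FFI bindings.
pub const G_TYPE_BOOLEAN: GType = make_fundamental(5);
pub const G_TYPE_INT: GType = make_fundamental(6);
pub const G_TYPE_UINT: GType = make_fundamental(7);

/// Hypothetical trait: "this Rust type has a corresponding GType".
pub trait HasGType {
    fn gtype() -> GType;
}

impl HasGType for bool {
    fn gtype() -> GType { G_TYPE_BOOLEAN }
}

impl HasGType for i32 {
    fn gtype() -> GType { G_TYPE_INT }
}

impl HasGType for u32 {
    fn gtype() -> GType { G_TYPE_UINT }
}

/// Collect the return type and the single argument type of a signal,
/// in the shape one would later pass on to `g_signal_newv()`.
pub fn signal_types<R: HasGType, A: HasGType>() -> (GType, [GType; 1]) {
    (R::gtype(), [A::gtype()])
}
```

<p>With this in place, a macro that sees <code>signal foo(&amp;self, x: u32) -&gt; bool</code> could ask the compiler for <code>signal_types::&lt;bool, u32&gt;()</code> instead of keeping its own table of type names.</p>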
10255 <h1>Glib types in Rust</h1>
10256 <p>How does <a href="http://gtk-rs.org/docs/glib/">glib-rs</a>, the Rust binding to Glib and GObject, handle
10257 types?</p>
10258 <h2>Going from Glib to Rust</h2>
10259 <p>First we need a way to convert Glib's types to Rust, and vice-versa.
10260 There is a trait to convert simple Glib types into Rust types:</p>
10261 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">FromGlib</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"></span>
10262 <span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib</span><span class="p">(</span><span class="n">val</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w"></span>
10263 <span class="p">}</span><span class="w"></span>
10264 </code></pre></div>
10265
10266 <p>This means that if you have a <code>T</code> which is a Glib type, this trait will give
10267 you a <code>from_glib()</code> function which will convert it to a Rust type
10268 which is <code>Sized</code>, i.e. a type whose size is known at compilation time.</p>
10269 <p>For example, this is how it is implemented for booleans:</p>
10270 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">FromGlib</span><span class="o">&lt;</span><span class="n">glib_ffi</span>::<span class="n">gboolean</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">bool</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10271 <span class="w"> </span><span class="cp">#[inline]</span><span class="w"></span>
10272 <span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib</span><span class="p">(</span><span class="n">val</span>: <span class="nc">glib_ffi</span>::<span class="n">gboolean</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w"></span>
10273 <span class="w"> </span><span class="o">!</span><span class="p">(</span><span class="n">val</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">GFALSE</span><span class="p">)</span><span class="w"></span>
10274 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10275 <span class="p">}</span><span class="w"></span>
10276 </code></pre></div>
10277
10278 <p>and you use it like this:</p>
10279 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">my_gboolean</span>: <span class="nc">glib_ffi</span>::<span class="n">gboolean</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">g_some_function_that_returns_gboolean</span><span class="w"> </span><span class="p">();</span><span class="w"></span>
10280
10281 <span class="kd">let</span><span class="w"> </span><span class="n">my_rust_bool</span>: <span class="kt">bool</span> <span class="o">=</span><span class="w"> </span><span class="n">from_glib</span><span class="w"> </span><span class="p">(</span><span class="n">my_gboolean</span><span class="p">);</span><span class="w"></span>
10282 </code></pre></div>
10283
10284 <p>Booleans in Glib and Rust have <a href="https://people.gnome.org/~federico/news-2017-04.html#gboolean-is-not-rust-bool">different sizes</a>, and also
10285 different values. Glib's booleans use the C convention: 0 is false
10286 and anything else is true, while in Rust booleans are strictly <code>false</code>
10287 or <code>true</code>, and a <code>bool</code> occupies one byte, whereas <code>gboolean</code> is a
10288 C <code>int</code>.</p>
10289 <h2>Going from Rust to Glib</h2>
10290 <p>And to go the other way around, from a Rust <code>bool</code> to a <code>gboolean</code>?
10291 There is this trait:</p>
10292 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">ToGlib</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10293 <span class="w"> </span><span class="k">type</span> <span class="nc">GlibType</span><span class="p">;</span><span class="w"></span>
10294
10295 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_glib</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">GlibType</span><span class="p">;</span><span class="w"></span>
10296 <span class="p">}</span><span class="w"></span>
10297 </code></pre></div>
10298
10299 <p>This means, if you have a Rust type that maps to a corresponding
10300 <code>GlibType</code>, this will give you a <code>to_glib()</code> function to do the
10301 conversion.</p>
10302 <p>This is the implementation for booleans:</p>
10303 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">ToGlib</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">bool</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10304 <span class="w"> </span><span class="k">type</span> <span class="nc">GlibType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">gboolean</span><span class="p">;</span><span class="w"></span>
10305
10306 <span class="w"> </span><span class="cp">#[inline]</span><span class="w"></span>
10307 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_glib</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">glib_ffi</span>::<span class="n">gboolean</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10308 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">GTRUE</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">GFALSE</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
10309 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10310 <span class="p">}</span><span class="w"></span>
10311 </code></pre></div>
10312
10313 <p>And it is used like this:</p>
10314 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">my_rust_bool</span>: <span class="kt">bool</span> <span class="o">=</span><span class="w"> </span><span class="kc">true</span><span class="p">;</span><span class="w"></span>
10315
10316 <span class="n">g_some_function_that_takes_gboolean</span><span class="w"> </span><span class="p">(</span><span class="n">my_rust_bool</span><span class="p">.</span><span class="n">to_glib</span><span class="w"> </span><span class="p">());</span><span class="w"></span>
10317 </code></pre></div>
10318
10319 <p>(If you are thinking "a function call to marshal a boolean" — note how
10320 the functions are inlined, and the optimizer basically compiles them
10321 down to nothing.)</p>
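<p>Here is a self-contained sketch of the round trip that you can compile without glib-rs. The two traits are copied from above; the <code>gboolean</code> alias, the two constants, and the free function <code>from_glib()</code> are local stand-ins that I am assuming mirror what <code>glib_ffi</code> and glib-rs provide.</p>

```rust
// Local stand-ins for the `glib_ffi` definitions (assumed: gboolean is a C int).
#[allow(non_camel_case_types)]
pub type gboolean = i32;
pub const GFALSE: gboolean = 0;
pub const GTRUE: gboolean = 1;

pub trait FromGlib<T>: Sized {
    fn from_glib(val: T) -> Self;
}

pub trait ToGlib {
    type GlibType;

    fn to_glib(&self) -> Self::GlibType;
}

impl FromGlib<gboolean> for bool {
    #[inline]
    fn from_glib(val: gboolean) -> bool {
        // C convention: anything that is not 0 counts as true.
        val != GFALSE
    }
}

impl ToGlib for bool {
    type GlibType = gboolean;

    #[inline]
    fn to_glib(&self) -> gboolean {
        if *self { GTRUE } else { GFALSE }
    }
}

/// Free function so callers can write `from_glib(x)`; the target
/// type is inferred from the context.
#[inline]
pub fn from_glib<G, T: FromGlib<G>>(val: G) -> T {
    FromGlib::from_glib(val)
}

/// A non-canonical C "true", a Rust `true` sent to C, and a full round trip.
pub fn demo_bool() -> (bool, gboolean, bool) {
    let from_c: bool = from_glib(42 as gboolean);
    let to_c: gboolean = true.to_glib();
    let round_trip: bool = from_glib(false.to_glib());
    (from_c, to_c, round_trip)
}
```

<p>Note that the conversion normalizes values: a C value of 42 comes in as <code>true</code>, and goes back out as exactly <code>GTRUE</code>.</p>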
10322 <h2 id="pointer-types">Pointer types - from Glib to Rust</h2>
10323 <p>That's all very nice for simple types like booleans and ints.
10324 Pointers to other objects are slightly more complicated.</p>
10325 <p>GObject-Introspection allows one to specify how pointer arguments to
10326 functions are handled by using a <a href="https://developer.gnome.org/gi/stable/gi-GIArgInfo.html#GITransfer"><em>transfer</em></a> specifier. </p>
10327 <h3><code>(transfer none)</code></h3>
10328 <p>For example, if you call <code>gtk_window_set_title(window, "Hello")</code>, you
10329 would expect the function to make its own copy of the <code>"Hello"</code>
10330 string. In Rust terms, you would be passing it a simple <em>borrowed
10331 reference</em>. GObject-Introspection (we'll abbreviate it as GI) calls
10332 this <code>GI_TRANSFER_NOTHING</code>, and it's specified by using
10333 <code>(transfer none)</code> in the documentation strings for function arguments
10334 or return values.</p>
10335 <p>The corresponding trait to bring in pointers from Glib to Rust,
10336 without taking ownership, is the following. Its method is <code>unsafe</code> because it will be
10337 used to de-reference pointers that come from the wild west:</p>
10338 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">FromGlibPtrNone</span><span class="o">&lt;</span><span class="n">P</span>: <span class="nc">Ptr</span><span class="o">&gt;</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"></span>
10339 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none</span><span class="p">(</span><span class="n">ptr</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w"></span>
10340 <span class="p">}</span><span class="w"></span>
10341 </code></pre></div>
10342
10343 <p>And you use it via this generic function:</p>
10344 <div class="highlight"><pre><span></span><code><span class="cp">#[inline]</span><span class="w"></span>
10345 <span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none</span><span class="o">&lt;</span><span class="n">P</span>: <span class="nc">Ptr</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="nc">FromGlibPtrNone</span><span class="o">&lt;</span><span class="n">P</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">ptr</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10346 <span class="w"> </span><span class="n">FromGlibPtrNone</span>::<span class="n">from_glib_none</span><span class="p">(</span><span class="n">ptr</span><span class="p">)</span><span class="w"></span>
10347 <span class="p">}</span><span class="w"></span>
10348 </code></pre></div>
10349
10350 <p>Let's look at how this works. Here is the <code>FromGlibPtrNone</code> trait
10351 implemented for strings.</p>
10352 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
10353 <span class="normal">2</span>
10354 <span class="normal">3</span>
10355 <span class="normal">4</span>
10356 <span class="normal">5</span>
10357 <span class="normal">6</span>
10358 <span class="normal">7</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">FromGlibPtrNone</span><span class="o">&lt;*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">String</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10359 <span class="w"> </span><span class="cp">#[inline]</span><span class="w"></span>
10360 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_none</span><span class="p">(</span><span class="n">ptr</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10361 <span class="w"> </span><span class="fm">assert!</span><span class="p">(</span><span class="o">!</span><span class="n">ptr</span><span class="p">.</span><span class="n">is_null</span><span class="p">());</span><span class="w"></span>
10362 <span class="w"> </span><span class="nb">String</span>::<span class="n">from_utf8_lossy</span><span class="p">(</span><span class="n">CStr</span>::<span class="n">from_ptr</span><span class="p">(</span><span class="n">ptr</span><span class="p">).</span><span class="n">to_bytes</span><span class="p">()).</span><span class="n">into_owned</span><span class="p">()</span><span class="w"></span>
10363 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10364 <span class="p">}</span><span class="w"></span>
10365 </code></pre></div>
10366 </td></tr></table>
10367 <p>Line 1: given a pointer to a <code>c_char</code>, the conversion to <code>String</code>...</p>
10368 <p>Line 4: Check for NULL pointers.</p>
10369 <p>Line 5: Use <code>CStr</code> to wrap the C
10370 <code>ptr</code>, <a href="https://people.gnome.org/~federico/blog/correctness-in-rust-reading-strings.html">like we looked at last time</a>, validate it as UTF-8
10371 (replacing any invalid sequences), and copy the string for us.</p>
10372 <p>Unfortunately, there's a copy involved in the last step. It may be
10373 possible to use <a href="https://doc.rust-lang.org/std/borrow/enum.Cow.html"><code>Cow&lt;str&gt;</code></a> there instead to avoid a copy if
10374 the <code>char*</code> from Glib is indeed valid UTF-8.</p>
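<p>As a rough idea of what the copy-avoiding version could look like, the standard library's <code>CStr::to_string_lossy()</code> already returns a <code>Cow</code> that borrows when the bytes are valid UTF-8 and allocates only when it has to substitute replacement characters. The function name <code>str_from_glib_none()</code> below is hypothetical; this is a sketch, not what glib-rs does.</p>

```rust
use std::borrow::Cow;
use std::ffi::CStr;
use std::os::raw::c_char;

/// Borrow the C string if it is valid UTF-8; otherwise allocate a
/// repaired copy.
///
/// # Safety
/// `ptr` must point to a NUL-terminated string that outlives `'a`.
pub unsafe fn str_from_glib_none<'a>(ptr: *const c_char) -> Cow<'a, str> {
    assert!(!ptr.is_null());
    let cstr: &'a CStr = unsafe { CStr::from_ptr(ptr) };
    cstr.to_string_lossy()
}

/// Returns (valid input was borrowed, invalid input was copied).
pub fn demo_cow() -> (bool, bool) {
    let valid = b"hello\0";
    // 0xE9 on its own is Latin-1 for "e with acute", not valid UTF-8.
    let invalid = b"caf\xe9\0";

    let a = unsafe { str_from_glib_none(valid.as_ptr() as *const c_char) };
    let b = unsafe { str_from_glib_none(invalid.as_ptr() as *const c_char) };

    (matches!(a, Cow::Borrowed(_)), matches!(b, Cow::Owned(_)))
}
```

<p>The catch is the lifetime: a borrowed result is only usable while the C side keeps the string alive, which is exactly the guarantee that <code>(transfer none)</code> does not spell out. That may be one reason to prefer the simple copy.</p>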
10375 <h3><code>(transfer full)</code></h3>
10376 <p>And how about transferring ownership of the pointed-to value? There
10377 is this trait:</p>
10378 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">FromGlibPtrFull</span><span class="o">&lt;</span><span class="n">P</span>: <span class="nc">Ptr</span><span class="o">&gt;</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"></span>
10379 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_full</span><span class="p">(</span><span class="n">ptr</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w"></span>
10380 <span class="p">}</span><span class="w"></span>
10381 </code></pre></div>
10382
10383 <p>And the implementation for strings is as follows. In Glib's scheme of
10384 things, "transferring ownership of a string" means that the recipient
10385 of the string must eventually <code>g_free()</code> it.</p>
10386 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
10387 <span class="normal">2</span>
10388 <span class="normal">3</span>
10389 <span class="normal">4</span>
10390 <span class="normal">5</span>
10391 <span class="normal">6</span>
10392 <span class="normal">7</span>
10393 <span class="normal">8</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">FromGlibPtrFull</span><span class="o">&lt;*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">String</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10394 <span class="w"> </span><span class="cp">#[inline]</span><span class="w"></span>
10395 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">from_glib_full</span><span class="p">(</span><span class="n">ptr</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10396 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_glib_none</span><span class="p">(</span><span class="n">ptr</span><span class="p">);</span><span class="w"></span>
10397 <span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">g_free</span><span class="p">(</span><span class="n">ptr</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">_</span><span class="p">);</span><span class="w"></span>
10398 <span class="w"> </span><span class="n">res</span><span class="w"></span>
10399 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10400 <span class="p">}</span><span class="w"></span>
10401 </code></pre></div>
10402 </td></tr></table>
10403 <p>Line 1: given a pointer to a <code>c_char</code>, the conversion to <code>String</code>...</p>
10404 <p>Line 4: Do the conversion with <code>from_glib_none()</code> with the trait we
10405 saw before, put it in <code>res</code>.</p>
10406 <p>Line 5: Call <code>g_free()</code> on the original C string.</p>
10407 <p>Line 6: Return the <code>res</code>, a Rust string which we own.</p>
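<p>To watch the "copy, then free" pattern run without Glib, here is a minimal sketch that uses only the standard library. Everything in it is an assumption for illustration: <code>fake_c_get_title()</code> plays the role of a C function that returns a newly allocated string, and <code>CString::from_raw()</code> stands in for <code>g_free()</code>, which is what a real binding would call.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function that returns a newly allocated
/// string and transfers ownership of it to the caller.
pub fn fake_c_get_title() -> *mut c_char {
    CString::new("window title").unwrap().into_raw()
}

/// Hypothetical analogue of `from_glib_full()` for strings.
///
/// Safety: `ptr` must come from `fake_c_get_title()` and must not be used
/// again after this call.
pub unsafe fn string_from_c_full(ptr: *mut c_char) -> String {
    // Copy the bytes into a String that Rust owns.
    let res = unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned();
    // Release the original allocation. A real Glib binding would call
    // g_free() here; from_raw() is only the stand-in for this sketch.
    drop(unsafe { CString::from_raw(ptr) });
    res
}
```

<p>Calling <code>unsafe { string_from_c_full(fake_c_get_title()) }</code> hands back an ordinary <code>String</code>, and the buffer that "C" allocated is already gone by the time the function returns.</p>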
10408 <h2>Pointer types - from Rust to Glib</h2>
10409 <p>Consider the case where you want to pass a <code>String</code> from Rust to a Glib function
10410 that takes a <code>*const c_char</code> (in C parlance, a <code>const char *</code>), without the
10411 Glib function acquiring ownership of the string. For example, assume
10412 that the C version of <code>gtk_window_set_title()</code> is in the <code>gtk_ffi</code>
10413 module. You may want to call it like this:</p>
10414 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">rust_binding_to_window_set_title</span><span class="p">(</span><span class="n">window</span>: <span class="kp">&amp;</span><span class="nc">Gtk</span>::<span class="n">Window</span><span class="p">,</span><span class="w"> </span><span class="n">title</span>: <span class="kp">&amp;</span><span class="nb">String</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10415 <span class="w"> </span><span class="n">gtk_ffi</span>::<span class="n">gtk_window_set_title</span><span class="p">(</span><span class="o">..</span><span class="p">.,</span><span class="w"> </span><span class="n">make_c_string_from_rust_string</span><span class="p">(</span><span class="n">title</span><span class="p">));</span><span class="w"></span>
10416 <span class="p">}</span><span class="w"></span>
10417 </code></pre></div>
10418
10419 <p>Now, what would that <code>make_c_string_from_rust_string()</code> look like?</p>
10420 <ul>
10421 <li>
10422 <p><strong>We have:</strong> a Rust <code>String</code> (UTF-8, known length, no nul terminator)</p>
10423 </li>
10424 <li>
10425 <p><strong>We want:</strong> a <code>*const c_char</code> (nul-terminated UTF-8)</p>
10426 </li>
10427 </ul>
10428 <p>So, let's write this:</p>
10429 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
10430 <span class="normal">2</span>
10431 <span class="normal">3</span>
10432 <span class="normal">4</span>
10433 <span class="normal">5</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">make_c_string_from_rust_string</span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="nb">String</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10434 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">cstr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">CString</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="p">[</span><span class="o">..</span><span class="p">]).</span><span class="n">unwrap</span><span class="p">();</span><span class="w"></span>
10435 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">ptr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cstr</span><span class="p">.</span><span class="n">into_raw</span><span class="p">()</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="p">;</span><span class="w"></span>
10436 <span class="w"> </span><span class="n">ptr</span><span class="w"></span>
10437 <span class="p">}</span><span class="w"></span>
10438 </code></pre></div>
10439 </td></tr></table>
10440 <p>Line 1: Take in a <code>&amp;String</code>; return a <code>*const c_char</code>.</p>
10441 <p>Line 2: Build a <a href="https://people.gnome.org/~federico/blog/correctness-in-rust-reading-strings.html"><code>CString</code></a> like we saw a few days ago: this
10442 allocates a byte buffer with space for a nul terminator, and copies
10443 the string's bytes. We <code>unwrap()</code> for this simple example, because
10444 <code>CString::new()</code> will return an error if the <code>String</code> contained nul
10445 characters in the middle of the string, which C doesn't understand.</p>
10446 <p>Line 3: Call <code>into_raw()</code> to get a pointer to the byte buffer, and
10447 cast it to a <code>*const c_char</code>. <em>We'll need to free this value later.</em></p>
10448 <p>But this kind of sucks, because we then have to use this function, pass
10449 the pointer to a C function, and then reconstitute the <code>CString</code> so it
10450 can free the byte buffer:</p>
10451 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">make_c_string_from_rust_string</span><span class="p">(</span><span class="n">my_string</span><span class="p">);</span><span class="w"></span>
10452 <span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">c_function_that_takes_a_string</span><span class="p">(</span><span class="n">buf</span><span class="p">);</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
10453 <span class="kd">let</span><span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">CString</span>::<span class="n">from_raw</span><span class="p">(</span><span class="n">buf</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">c_char</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w"> </span><span class="c1">// from_raw() is unsafe; it reclaims the buffer so it gets freed</span>
10454 </code></pre></div>
10455
10456 <p>The solution that Glib-rs provides for this is very Rusty, and rather
10457 elegant.</p>
10458 <h3>Stashes</h3>
10459 <p>We want:</p>
10460 <ul>
10461 <li>A temporary place to put a piece of data</li>
10462 <li>A pointer to that buffer</li>
10463 <li>Automatic memory management for both of those</li>
10464 </ul>
10465 <p>Glib-rs defines a <code>Stash</code> for this:</p>
10466 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
10467 <span class="normal">2</span>
10468 <span class="normal">3</span>
10469 <span class="normal">4</span>
10470 <span class="normal">5</span>
10471 <span class="normal">6</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Stash</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="c1">// we have a lifetime</span>
10472 <span class="w"> </span><span class="n">P</span>: <span class="nb">Copy</span><span class="p">,</span><span class="w"> </span><span class="c1">// the pointer must be copy-able</span>
10473 <span class="w"> </span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">ToGlibPtr</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">P</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">(</span><span class="w"> </span><span class="c1">// Type for the temporary place</span>
10474 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="c1">// We store a pointer...</span>
10475 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="o">&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">ToGlibPtr</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">P</span><span class="o">&gt;&gt;</span>::<span class="n">Storage</span><span class="w"> </span><span class="c1">// ... to a piece of data with that lifetime ...</span>
10476 <span class="p">);</span><span class="w"></span>
10477 </code></pre></div>
10478 </td></tr></table>
10479 <p>... and the piece of data must be of the <em>associated type</em>
10480 <code>ToGlibPtr::Storage</code>, which we will see shortly.</p>
10481 <p>This struct <code>Stash</code> goes along with the <code>ToGlibPtr</code> trait:</p>
10482 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">ToGlibPtr</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">P</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10483 <span class="w"> </span><span class="k">type</span> <span class="nc">Storage</span><span class="p">;</span><span class="w"></span>
10484
10485 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_glib_none</span><span class="p">(</span><span class="o">&amp;&#39;</span><span class="na">a</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Stash</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">;</span><span class="w"> </span><span class="c1">// returns a Stash whose temporary storage</span>
10486 <span class="w"> </span><span class="c1">// has the lifetime of our original data</span>
10487 <span class="p">}</span><span class="w"></span>
10488 </code></pre></div>
10489
10490 <p>Let's unpack this by looking at the implementation of the "transfer a
10491 String to a C function while keeping ownership":</p>
10492 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
10493 <span class="normal">2</span>
10494 <span class="normal">3</span>
10495 <span class="normal">4</span>
10496 <span class="normal">5</span>
10497 <span class="normal">6</span>
10498 <span class="normal">7</span>
10499 <span class="normal">8</span>
10500 <span class="normal">9</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">ToGlibPtr</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">String</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10501 <span class="w"> </span><span class="k">type</span> <span class="nc">Storage</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">CString</span><span class="p">;</span><span class="w"></span>
10502
10503 <span class="w"> </span><span class="cp">#[inline]</span><span class="w"></span>
10504 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_glib_none</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Stash</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="p">,</span><span class="w"> </span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10505 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">CString</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">[</span><span class="o">..</span><span class="p">]).</span><span class="n">unwrap</span><span class="p">();</span><span class="w"></span>
10506 <span class="w"> </span><span class="n">Stash</span><span class="p">(</span><span class="n">tmp</span><span class="p">.</span><span class="n">as_ptr</span><span class="p">(),</span><span class="w"> </span><span class="n">tmp</span><span class="p">)</span><span class="w"></span>
10507 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10508 <span class="p">}</span><span class="w"></span>
10509 </code></pre></div>
10510 </td></tr></table>
10511 <p>Line 1: We implement <code>ToGlibPtr&lt;'a, *const c_char&gt;</code> for <code>String</code>,
10512 declaring the lifetime <code>'a</code> for the <code>Stash</code>.</p>
10513 <p>Line 2: Our temporary storage is a <code>CString</code>.</p>
10514 <p>Line 6: Make a CString like before.</p>
10515 <p>Line 7: Create the <code>Stash</code> with a pointer to the CString's contents,
10516 and the CString itself.</p>
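<p>If the generics get in the way, the same idea can be sketched with only the standard library. This is not Glib-rs code: <code>MiniStash</code>, <code>to_c_none()</code> and <code>borrowed_len()</code> are hypothetical names, and <code>borrowed_len()</code> merely plays the part of a C function that borrows a string.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical, stripped-down stand-in for glib-rs's `Stash`: a raw pointer
/// plus the storage that keeps the pointed-to bytes alive.
pub struct MiniStash(pub *const c_char, pub CString);

/// Hypothetical analogue of `to_glib_none()` for string slices.
pub fn to_c_none(s: &str) -> MiniStash {
    // Panics on interior nul bytes, as in the simple examples above.
    let tmp = CString::new(s).unwrap();
    // Moving a CString moves only its handle, not its heap buffer, so the
    // pointer taken here stays valid for as long as the MiniStash lives.
    MiniStash(tmp.as_ptr(), tmp)
}

/// Stand-in for a C function that only borrows the string.
///
/// Safety: `ptr` must point to a live, nul-terminated buffer.
pub unsafe fn borrowed_len(ptr: *const c_char) -> usize {
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}
```

<p>With this, <code>unsafe { borrowed_len(to_c_none("hello").0) }</code> lends the pointer for the duration of the call, and the temporary <code>CString</code> is freed when the <code>MiniStash</code> is dropped.</p>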
10517 <h3><code>(transfer none)</code></h3>
10518 <p>Now, we can use "<code>.0</code>" to extract the first field from our <code>Stash</code>,
10519 which is precisely the pointer we want to a byte buffer:</p>
10520 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">my_string</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w"></span>
10521 <span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">c_function_which_takes_a_string</span><span class="p">(</span><span class="n">my_string</span><span class="p">.</span><span class="n">to_glib_none</span><span class="p">().</span><span class="mi">0</span><span class="p">);</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
10522 </code></pre></div>
10523
10524 <p>Now Rust knows that the temporary buffer inside the <code>Stash</code> cannot outlive
10525 <code>my_string</code>, and it will free the buffer automatically when the <code>Stash</code> is
10526 dropped, which here happens at the end of the statement. If we can accept the <code>.to_glib_none().0</code> incantation
10527 for "lending" pointers to C, this works perfectly.</p>
10528 <h3 id="ptr-transfer-full"><code>(transfer full)</code></h3>
10529 <p>And for transferring ownership to the C function? The <code>ToGlibPtr</code>
10530 trait has another method:</p>
10531 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">ToGlibPtr</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="n">P</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10532 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
10533
10534 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_glib_full</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">P</span><span class="p">;</span><span class="w"></span>
10535 <span class="p">}</span><span class="w"></span>
10536 </code></pre></div>
10537
10538 <p>And here is the implementation for strings:</p>
10539 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">ToGlibPtr</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">String</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10540 <span class="w"> </span><span class="k">fn</span> <span class="nf">to_glib_full</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10541 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10542 <span class="w"> </span><span class="n">glib_ffi</span>::<span class="n">g_strndup</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">as_ptr</span><span class="p">()</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="p">,</span><span class="w"> </span>
10543 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">size_t</span><span class="p">)</span><span class="w"></span>
10544 <span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">c_char</span><span class="w"></span>
10545 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10546 <span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
10547 </code></pre></div>
10548
10549 <p>We basically <code>g_strndup()</code> the Rust string's contents from its byte
10550 buffer <em>and</em> its <code>len()</code>, and we can then pass this on to C. <em>That</em>
10551 code will be responsible for <code>g_free()</code>ing the C-side string.</p>
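<p>The ownership hand-off can be tried out with the standard library alone. This is a rough sketch under loud assumptions: <code>CString::into_raw()</code> stands in for <code>g_strndup()</code>, and <code>consume_c_string()</code> is a hypothetical "C side" that frees the string with <code>CString::from_raw()</code> instead of <code>g_free()</code>. Memory that really came from <code>g_strndup()</code> must be released with <code>g_free()</code>, never with <code>from_raw()</code>.</p>

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical analogue of `to_glib_full()`: make a nul-terminated copy and
/// give up ownership of it. The caller's string is left untouched.
pub fn to_c_full(s: &str) -> *mut c_char {
    CString::new(s).unwrap().into_raw()
}

/// Stand-in for the C function that receives ownership. It uses the string
/// and then frees it, which is the recipient's job under (transfer full).
///
/// Safety: `ptr` must come from `to_c_full()` and must not be used afterwards.
pub unsafe fn consume_c_string(ptr: *mut c_char) -> usize {
    let owned = unsafe { CString::from_raw(ptr) };
    owned.as_bytes().len()
    // `owned` is dropped here, which frees the copy.
}
```

<p>After <code>let p = to_c_full(&amp;my_string);</code> the Rust side still owns <code>my_string</code>, while the copy behind <code>p</code> belongs to whoever receives it.</p>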
10552 <h1>Next up</h1>
10553 <p>Transferring lists and arrays. Stay tuned!</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category></entry><entry><title>Correctness in Rust: building strings</title><link href="https://people.gnome.org/~federico/blog/correctness-in-rust-reading-strings.html" rel="alternate"></link><published>2017-08-16T20:26:39-05:00</published><updated>2017-08-16T20:26:39-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-08-16:/~federico/blog/correctness-in-rust-reading-strings.html</id><summary type="html"><p>Rust tries to follow the "make illegal states unrepresentable" mantra
10554 in several ways. In this post I'll show several things related to the
10555 process of building strings, from bytes in memory, or from a file, or
10556 from <code>char *</code> things passed from C.</p>
10557 <h1>Strings in Rust</h1>
10558 <p>The easiest way to build …</p></summary><content type="html"><p>Rust tries to follow the "make illegal states unrepresentable" mantra
10559 in several ways. In this post I'll show several things related to the
10560 process of building strings, from bytes in memory, or from a file, or
10561 from <code>char *</code> things passed from C.</p>
10562 <h1>Strings in Rust</h1>
10563 <p>The easiest way to build a string is to do it directly at compile
10564 time:</p>
10565 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">my_string</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&quot;Hello, world!&quot;</span><span class="p">;</span><span class="w"></span>
10566 </code></pre></div>
10567
10568 <p>In Rust, strings are UTF-8. Here, the compiler checks our string
10569 literal is valid UTF-8. If we try to be sneaky and insert an
10570 invalid character...</p>
10571 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">my_string</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&quot;Hello \xf0&quot;</span><span class="p">;</span><span class="w"></span>
10572 </code></pre></div>
10573
10574 <p>We get a compiler error:</p>
10575 <div class="highlight"><pre><span></span><code>error: this form of character escape may only be used with characters in the range [\x00-\x7f]
10576 --&gt; foo.rs:2:30
10577 |
10578 2 | let my_string = &quot;Hello \xf0&quot;;
10579 | ^^
10580 </code></pre></div>
10581
10582 <p>Rust strings know their length, unlike C strings. They <em>can</em> contain
10583 a nul character in the middle, because they don't need a nul
10584 terminator at the end.</p>
10585 <div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">my_string</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&quot;Hello </span><span class="se">\x00</span><span class="s"> zero&quot;</span><span class="p">;</span><span class="w"></span>
10586 <span class="fm">println!</span><span class="p">(</span><span class="s">&quot;{}&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">my_string</span><span class="p">);</span><span class="w"></span>
10587 </code></pre></div>
10588
10589 <p>The output is what you expect:</p>
10590 <div class="highlight"><pre><span></span><code>$ ./foo | hexdump -C
10591 00000000 48 65 6c 6c 6f 20 00 20 7a 65 72 6f 0a |Hello . zero.|
10592 0000000d ^ note the nul char here
10593 $
10594 </code></pre></div>
10595
10596 <p>So, to summarize, in Rust:</p>
10597 <ul>
10598 <li>Strings are encoded in UTF-8</li>
10599 <li>Strings know their length</li>
10600 <li>Strings can have nul chars in the middle</li>
10601 </ul>
10602 <p>This is a bit different from C:</p>
10603 <ul>
10604 <li>Strings don't exist!</li>
10605 </ul>
10606 <p>Okay, just kidding. In C:</p>
10607 <ul>
10608 <li>A lot of software has standardized on UTF-8.</li>
10609 <li>Strings don't know their length - a <code>char *</code> is a raw pointer to the
10610 beginning of the string.</li>
10611 <li>Strings conventionally have a nul terminator, that is, a zero byte
10612 that marks the end of the string. Therefore, you can't have nul
10613 characters in the middle of strings.</li>
10614 </ul>
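<p>The difference between the two worlds is easy to check in code. In this small sketch, <code>byte_len()</code> and <code>fits_in_c_string()</code> are hypothetical helper names; the second one just asks <code>CString::new()</code> whether the text could be handed to C as-is.</p>

```rust
use std::ffi::CString;

/// A Rust string knows its length in bytes, embedded nul included.
pub fn byte_len(s: &str) -> usize {
    s.len()
}

/// A C string cannot contain a nul in the middle; CString::new() returns an
/// error in that case instead of silently truncating.
pub fn fits_in_c_string(s: &str) -> bool {
    CString::new(s).is_ok()
}
```

<p>For the <code>"Hello \x00 zero"</code> literal from above, <code>byte_len()</code> counts all 12 bytes, while <code>fits_in_c_string()</code> says no.</p>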
10615 <h1>Building a string from bytes</h1>
10616 <p>Let's say you have an array of bytes and want to make a string from
10617 them. Rust won't let you just cast the array, like C would. First
10618 you need to do UTF-8 validation. For example:</p>
10619 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
10620 <span class="normal"> 2</span>
10621 <span class="normal"> 3</span>
10622 <span class="normal"> 4</span>
10623 <span class="normal"> 5</span>
10624 <span class="normal"> 6</span>
10625 <span class="normal"> 7</span>
10626 <span class="normal"> 8</span>
10627 <span class="normal"> 9</span>
10628 <span class="normal">10</span>
10629 <span class="normal">11</span>
10630 <span class="normal">12</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">convert_and_print</span><span class="p">(</span><span class="n">bytes</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10631 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">from_utf8</span><span class="p">(</span><span class="n">bytes</span><span class="p">);</span><span class="w"></span>
10632 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10633 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">string</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="s">&quot;{}&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="p">),</span><span class="w"></span>
10634 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="s">&quot;{:?}&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">e</span><span class="p">)</span><span class="w"></span>
10635 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10636 <span class="p">}</span><span class="w"></span>
10637
10638 <span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10639 <span class="w"> </span><span class="n">convert_and_print</span><span class="p">(</span><span class="fm">vec!</span><span class="p">[</span><span class="mh">0x48</span><span class="p">,</span><span class="w"> </span><span class="mh">0x65</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6f</span><span class="p">]);</span><span class="w"></span>
10640 <span class="w"> </span><span class="n">convert_and_print</span><span class="p">(</span><span class="fm">vec!</span><span class="p">[</span><span class="mh">0x48</span><span class="p">,</span><span class="w"> </span><span class="mh">0x65</span><span class="p">,</span><span class="w"> </span><span class="mh">0xf0</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6f</span><span class="p">]);</span><span class="w"></span>
10641 <span class="p">}</span><span class="w"></span>
10642 </code></pre></div>
10643 </td></tr></table>
10644 <p>In lines 10 and 11, we call <code>convert_and_print()</code> with different
10645 arrays of bytes; the first one is valid UTF-8, and the second one
10646 isn't.</p>
10647 <p>Line 2 calls <a href="https://doc.rust-lang.org/std/string/struct.String.html#method.from_utf8"><code>String::from_utf8()</code></a>, which returns a <code>Result</code>,
10648 i.e. something with a success value or an error. In lines 3-5 we
10649 unpack this <code>Result</code>. If it's <code>Ok</code>, we print the converted string,
10650 which has been validated for UTF-8. Otherwise, we print the debug
10651 representation of the error.</p>
10652 <p>The program prints the following:</p>
10653 <div class="highlight"><pre><span></span><code>$ ~/foo
10654 Hello
10655 FromUtf8Error { bytes: [72, 101, 240, 108, 108, 111], error: Utf8Error { valid_up_to: 2, error_len: Some(1) } }
10656 </code></pre></div>
10657
10658 <p>Here, in the error case, the <a href="https://doc.rust-lang.org/std/str/struct.Utf8Error.html"><code>Utf8Error</code></a> tells us that the bytes
10659 are valid UTF-8 only up to index 2 (<code>valid_up_to</code>); that is the first problematic
10660 index. The <code>error_len</code> field lets the program know
10661 if the problematic sequence was incomplete and truncated at the end of
10662 the byte array, or if it's complete and in the middle.</p>
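<p>A program can read those two fields instead of printing the debug representation. Here is a minimal sketch; <code>check_utf8()</code> is a hypothetical helper, and the accessors it calls are the standard library's <code>FromUtf8Error::utf8_error()</code>, <code>Utf8Error::valid_up_to()</code> and <code>Utf8Error::error_len()</code>.</p>

```rust
/// Returns the validated string, or (valid_up_to, error_len) on failure.
/// error_len is None when the input ended in the middle of a sequence.
pub fn check_utf8(bytes: Vec<u8>) -> Result<String, (usize, Option<usize>)> {
    String::from_utf8(bytes).map_err(|e| {
        let inner = e.utf8_error();
        (inner.valid_up_to(), inner.error_len())
    })
}
```

<p>For the second byte array above this gives <code>Err((2, Some(1)))</code>, matching the output we saw. An input that stops right after a lone <code>0xf0</code> reports <code>None</code> for the length, which is how you tell "truncated" from "garbage in the middle".</p>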
10663 <p>And for a "just make this printable, pls" API? We can
10664 use <a href="https://doc.rust-lang.org/std/string/struct.String.html#method.from_utf8_lossy"><code>String::from_utf8_lossy()</code></a>, which replaces invalid UTF-8
10665 sequences with <code>U+FFFD REPLACEMENT CHARACTER</code>:</p>
10666 <div class="highlight"><pre><span></span><code><span class="k">fn</span> <span class="nf">convert_and_print</span><span class="p">(</span><span class="n">bytes</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10667 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">from_utf8_lossy</span><span class="p">(</span><span class="o">&amp;</span><span class="n">bytes</span><span class="p">);</span><span class="w"></span>
10668 <span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="s">&quot;{}&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="p">);</span><span class="w"></span>
10669 <span class="p">}</span><span class="w"></span>
10670
10671 <span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10672 <span class="w"> </span><span class="n">convert_and_print</span><span class="p">(</span><span class="fm">vec!</span><span class="p">[</span><span class="mh">0x48</span><span class="p">,</span><span class="w"> </span><span class="mh">0x65</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6f</span><span class="p">]);</span><span class="w"></span>
10673 <span class="w"> </span><span class="n">convert_and_print</span><span class="p">(</span><span class="fm">vec!</span><span class="p">[</span><span class="mh">0x48</span><span class="p">,</span><span class="w"> </span><span class="mh">0x65</span><span class="p">,</span><span class="w"> </span><span class="mh">0xf0</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6c</span><span class="p">,</span><span class="w"> </span><span class="mh">0x6f</span><span class="p">]);</span><span class="w"></span>
10674 <span class="p">}</span><span class="w"></span>
10675 </code></pre></div>
10676
10677 <p>This prints the following:</p>
10678 <div class="highlight"><pre><span></span><code>$ ~/foo
10679 Hello
10680 He�llo
10681 </code></pre></div>
10682
10683 <h1>Reading from files into strings</h1>
10684 <p>Now, let's assume you want to read chunks of a file and put them into
10685 strings. Let's go from the low-level parts up to the high level "read
10686 a line" API.</p>
10687 <h2>Single bytes and single UTF-8 characters</h2>
10688 <p>When you open a <a href="https://doc.rust-lang.org/std/fs/struct.File.html"><code>File</code></a>, you get an object that implements the
10689 <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a> trait. In addition to the usual "read me some bytes" method,
10690 it can also give you back an iterator over <em>bytes</em>, or an iterator
10691 over UTF-8 <em>characters</em>.</p>
10692 <p>The <a href="https://doc.rust-lang.org/std/io/trait.Read.html#method.bytes"><code>Read.bytes()</code></a> method gives you back a <a href="https://doc.rust-lang.org/std/io/struct.Bytes.html"><code>Bytes</code></a> iterator,
10693 whose <code>next()</code> method returns <code>Result&lt;u8, io::Error&gt;</code>. When you ask
10694 the iterator for its next item, that <code>Result</code> means you'll get a byte
10695 out of it successfully, or an I/O error.</p>
10696 <p>In contrast, the <a href="https://doc.rust-lang.org/std/io/trait.Read.html#method.chars"><code>Read.chars()</code></a> method gives you back
10697 a <a href="https://doc.rust-lang.org/std/io/struct.Chars.html"><code>Chars</code></a> iterator, and its <code>next()</code> method returns
10698 <code>Result&lt;char, CharsError&gt;</code>, not <code>io::Error</code>. This
10699 extended <a href="https://doc.rust-lang.org/std/io/enum.CharsError.html"><code>CharsError</code></a> has a <code>NotUtf8</code> case, which you get back
10700 when <code>next()</code> tries to read the next UTF-8 sequence from the file and
10701 the file has invalid data. <code>CharsError</code> also has a case for normal
10702 I/O errors.</p>
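<p>As a minimal sketch of the byte iterator described above, the following collects every item from <code>bytes()</code> and stops at the first I/O error. The <code>collect_bytes</code> helper is a hypothetical name, and a byte slice stands in for an open <code>File</code>, since slices also implement <code>Read</code>. The sketch covers only <code>bytes()</code>.</p>

```rust
use std::io::{self, Read};

/// Hypothetical helper: collects every byte from a reader,
/// stopping at the first I/O error.
fn collect_bytes<R: Read>(reader: R) -> io::Result<Vec<u8>> {
    // Each item from `bytes()` is an `io::Result<u8>`; collecting into
    // `io::Result<Vec<u8>>` short-circuits on the first error.
    reader.bytes().collect()
}

fn main() -> io::Result<()> {
    // A byte slice implements `Read`, so it stands in for a file here.
    let data: &[u8] = b"Hello";
    let bytes = collect_bytes(data)?;
    println!("{:?}", bytes);
    Ok(())
}
```

<p>Reading a real file one byte at a time like this would be slow without buffering; the point here is only the shape of the <code>Result</code> values.</p>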
10703 <h2>Reading lines</h2>
10704 <p>While you could build a UTF-8 string one character at a time, there
10705 are more efficient ways to do it.</p>
10706 <p>You can create a <a href="https://doc.rust-lang.org/std/io/struct.BufReader.html"><code>BufReader</code></a>, a buffered reader, out of anything
10707 that implements the <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a> trait. <code>BufReader</code> has a
10708 convenient <a href="https://doc.rust-lang.org/std/io/trait.BufRead.html#method.read_line"><code>read_line()</code></a> method, to which you pass a mutable
10709 String and it returns a <code>Result&lt;usize, io::Error&gt;</code> with either the
10710 number of bytes read, or an error.</p>
10711 <p>That method is declared in the <a href="https://doc.rust-lang.org/std/io/trait.BufRead.html"><code>BufRead</code></a> trait, which <code>BufReader</code>
10712 implements. Why the separation? Because other concrete structs also
10713 implement <code>BufRead</code>, such as <a href="https://doc.rust-lang.org/std/io/struct.Cursor.html"><code>Cursor</code></a> — a nice wrapper that lets
10714 you use a vector of bytes like an I/O <code>Read</code> or <code>Write</code>
10715 implementation, similar to <a href="https://developer.gnome.org/gio/stable/GMemoryInputStream.html"><code>GMemoryInputStream</code></a>.</p>
10716 <p>If you prefer an iterator rather than the <code>read_line()</code> function,
10717 <code>BufRead</code> also gives you a <a href="https://doc.rust-lang.org/std/io/trait.BufRead.html#method.lines"><code>lines()</code></a> method, which gives you back
10718 a <a href="https://doc.rust-lang.org/std/io/struct.Lines.html"><code>Lines</code></a> iterator.</p>
10719 <p>In both cases, the <code>read_line()</code> method or the <code>Lines</code> iterator, the
10720 error that you can get back can be of <a href="https://doc.rust-lang.org/std/io/enum.ErrorKind.html"><code>ErrorKind</code></a><code>::InvalidData</code>,
10721 which indicates that there was an invalid UTF-8 sequence in the line
10722 to be read. It can also be a normal I/O error, of course.</p>
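<p>A small sketch of how that error shows up in practice, assuming an in-memory byte slice instead of a real file. The <code>first_line</code> helper is a name made up for this example.</p>

```rust
use std::io::{self, BufRead, BufReader, ErrorKind};

/// Hypothetical helper: reads the first line from `bytes`,
/// returning the line (newline included) or the error.
fn first_line(bytes: &[u8]) -> io::Result<String> {
    let mut reader = BufReader::new(bytes);
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line)
}

fn main() {
    // Valid UTF-8: the line comes back with its trailing newline.
    match first_line(b"Hello\nworld\n") {
        Ok(line) => println!("read {:?}", line),
        Err(e) => println!("error: {}", e),
    }

    // 0xf0 followed by an ASCII byte is not a valid UTF-8 sequence.
    match first_line(&[0x48, 0x65, 0xf0, 0x6c, 0x6c, 0x6f, b'\n']) {
        Ok(line) => println!("read {:?}", line),
        Err(e) if e.kind() == ErrorKind::InvalidData => {
            println!("invalid UTF-8 in line")
        }
        Err(e) => println!("I/O error: {}", e),
    }
}
```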
10723 <h1>Summary so far</h1>
10724 <p>There is no way to build a <code>String</code>, or a <code>&amp;str</code> slice, from invalid
10725 UTF-8 data. All the methods that let you turn bytes into string-like
10726 things perform validation, and return a <code>Result</code> to let you know if
10727 your bytes validated correctly.</p>
10728 <p>The exceptions are in the <code>unsafe</code> methods,
10729 like <a href="https://doc.rust-lang.org/std/string/struct.String.html#method.from_utf8_unchecked"><code>String::from_utf8_unchecked()</code></a>. You should really only use
10730 them if you are <em>absolutely sure</em> that your bytes were validated as
10731 UTF-8 beforehand.</p>
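<p>To make the contrast concrete, here is a sketch with two hypothetical helpers. <code>checked</code> goes through the validating constructor; <code>unchecked_after_validation</code> only calls the <code>unsafe</code> one after validating the bytes itself, which is the precondition described above.</p>

```rust
/// Hypothetical helper: validates `bytes` while building the `String`.
fn checked(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

/// Hypothetical helper: uses the unchecked constructor, but only after
/// validating the bytes first.
fn unchecked_after_validation(bytes: Vec<u8>) -> Option<String> {
    if std::str::from_utf8(&bytes).is_ok() {
        // SAFETY: the bytes were validated as UTF-8 on the line above.
        Some(unsafe { String::from_utf8_unchecked(bytes) })
    } else {
        None
    }
}

fn main() {
    let good = vec![0x48, 0x65, 0x6c, 0x6c, 0x6f];
    let bad = vec![0x48, 0x65, 0xf0, 0x6c, 0x6c, 0x6f];

    println!("{:?}", checked(good.clone()));
    println!("{:?}", checked(bad.clone()));
    println!("{:?}", unchecked_after_validation(good));
    println!("{:?}", unchecked_after_validation(bad));
}
```

<p>The second helper gains nothing over the first; it is only there to show what "validated beforehand" has to mean for the <code>unsafe</code> call to be sound.</p>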
10732 <p>There is no way to bring in data from a file (or anything file-like,
10733 that implements the <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a> trait) and turn it into a <code>String</code>
10734 without going through functions that do UTF-8 validation. There is
10735 not an unsafe "read a line" API without validation — you would have to
10736 build one yourself, but the I/O hit is probably going to be slower than
10737 validating data in memory, anyway, so you may as well validate.</p>
10738 <h1>C strings and Rust</h1>
10739 <p>For unfortunate historical reasons, C flings around <code>char *</code> to mean
10740 different things. In the context of GLib, it can mean:</p>
10741 <ul>
10742 <li>A valid, nul-terminated UTF-8 sequence of bytes (a "normal string")</li>
10743 <li>A nul-terminated file path, which has no meaningful encoding</li>
10744 <li>A nul-terminated sequence of bytes, not validated as UTF-8</li>
10745 </ul>
10746 <p>What a particular <code>char *</code> means depends on which API you got it from.</p>
10747 <h2>Bringing a string from C to Rust</h2>
10748 <p>From Rust's viewpoint, getting a raw <code>char *</code> from C (a "<code>*const
10749 c_char</code>" in Rust parlance) means that it gets a pointer to a buffer of
10750 unknown length.</p>
10751 <p>Now, that may not be entirely accurate:</p>
10752 <ul>
10753 <li>You may indeed only have a pointer to a buffer of unknown length</li>
10754 <li>You may have a pointer to a buffer, and also know its length
10755 (i.e. the offset at which the nul terminator is)</li>
10756 </ul>
10757 <p>The Rust standard library provides a <a href="https://doc.rust-lang.org/std/ffi/struct.CStr.html"><code>CStr</code></a> object, which means,
10758 "I have a pointer to an array of bytes, and I know its length, and I
10759 know the last byte is a nul".</p>
10760 <p><code>CStr</code> provides an <a href="https://doc.rust-lang.org/std/ffi/struct.CStr.html#method.from_ptr"><code>unsafe from_ptr()</code></a> constructor which takes a
10761 raw pointer, and walks the memory to which it points until it finds a
10762 nul byte. You <em>must</em> give it a valid pointer, and you had better
10763 guarantee that there is a nul terminator, or <code>CStr</code> will walk until
10764 the end of your process' address space looking for one.</p>
10765 <p>Alternatively, if you know the length of your byte array, and you know
10766 that it has a nul byte at the end, you can
10767 call <a href="https://doc.rust-lang.org/std/ffi/struct.CStr.html#method.from_bytes_with_nul"><code>CStr::from_bytes_with_nul()</code></a>. You pass it a <code>&amp;[u8]</code> slice;
10768 the function will check that a) the last byte in that slice is indeed
10769 a nul, and b) there are no nul bytes in the middle.</p>
10770 <p>The unsafe version of this last function
10771 is <a href="https://doc.rust-lang.org/std/ffi/struct.CStr.html#method.from_bytes_with_nul_unchecked"><code>unsafe CStr::from_bytes_with_nul_unchecked()</code></a>: it also takes
10772 a <code>&amp;[u8]</code> slice, but <em>you</em> must guarantee that the last byte is a nul
10773 and that there are no nul bytes in the middle.</p>
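<p>The constructors above can be sketched as follows. The helper names <code>is_valid_c_bytes</code> and <code>c_len</code> are made up for this example, and a <code>CString</code> stands in for a pointer that would normally come from C.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical helper: true if `bytes` ends in a nul and has no nul
/// bytes in the middle, as checked by `from_bytes_with_nul()`.
fn is_valid_c_bytes(bytes: &[u8]) -> bool {
    CStr::from_bytes_with_nul(bytes).is_ok()
}

/// Hypothetical helper: length in bytes of a C string, without the nul.
///
/// # Safety
/// `ptr` must be non-null and point to a nul-terminated buffer.
unsafe fn c_len(ptr: *const c_char) -> usize {
    // `from_ptr()` walks the memory until it finds the nul terminator.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

fn main() {
    println!("{}", is_valid_c_bytes(b"hello\0")); // terminator present
    println!("{}", is_valid_c_bytes(b"hello")); // terminator missing
    println!("{}", is_valid_c_bytes(b"he\0llo\0")); // nul in the middle

    // `CString` owns a nul-terminated buffer; `as_ptr()` plays the role
    // of the raw `char *` we would get from C.
    let owned = CString::new("hello").unwrap();
    let len = unsafe { c_len(owned.as_ptr()) };
    println!("{}", len);
}
```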
10774 <p><em>I really like that the Rust documentation tells you when functions
10775 are not "instantaneous" and must instead walk arrays, for example to do
10776 validation or to look for the nul terminator as above.</em></p>
10777 <h2>Turning a CStr into a string-like</h2>
10778 <p>Now, the above indicates that a <code>CStr</code> is a nul-terminated array of
10779 bytes. We have no idea what the bytes inside look like; we just know
10780 that they don't contain any other nul bytes.</p>
10781 <p>There is a <a href="https://doc.rust-lang.org/std/ffi/struct.CStr.html#method.to_str"><code>CStr::to_str()</code></a> method, which returns a
10782 <code>Result&lt;&amp;str, Utf8Error&gt;</code>. It performs UTF-8 validation on the array
10783 of bytes. If the array is valid, the function just returns a slice of
10784 the validated bytes minus the nul terminator (i.e. just what you
10785 expect for a Rust string slice). Otherwise, it returns a <code>Utf8Error</code>
10786 with the details we discussed before.</p>
10787 <p>There is also <a href="https://doc.rust-lang.org/std/ffi/struct.CStr.html#method.to_string_lossy"><code>CStr::to_string_lossy()</code></a> which does the
10788 replacement of invalid UTF-8 sequences like we discussed before.</p>
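<p>Both conversions side by side, as a sketch. The <code>strict</code> and <code>lossy</code> helpers are hypothetical names; the invalid input reuses the stray <code>0xf0</code> byte from the earlier example.</p>

```rust
use std::ffi::CStr;

/// Hypothetical helper: the validated slice, or `None` on invalid UTF-8.
fn strict(c: &CStr) -> Option<&str> {
    c.to_str().ok()
}

/// Hypothetical helper: always yields a string, replacing invalid
/// sequences with U+FFFD.
fn lossy(c: &CStr) -> String {
    c.to_string_lossy().into_owned()
}

fn main() {
    let valid = CStr::from_bytes_with_nul(b"Hello\0").unwrap();
    // A valid C string (one trailing nul), but not valid UTF-8.
    let invalid = CStr::from_bytes_with_nul(b"He\xf0llo\0").unwrap();

    println!("{:?}", strict(valid));
    println!("{:?}", strict(invalid));
    println!("{}", lossy(valid));
    println!("{}", lossy(invalid));
}
```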
10789 <h1>Conclusion</h1>
10790 <p>Strings in Rust are UTF-8 encoded, they know their length, and they
10791 can have nul bytes in the middle.</p>
10792 <p>To build a string from raw bytes, you must go through functions that
10793 do UTF-8 validation and tell you if it failed. There are unsafe
10794 functions that let you skip validation, but then of course you are on
10795 your own.</p>
10796 <p>The low-level functions which read data from files operate on bytes.
10797 On top of those, there are convenience functions to read validated
10798 UTF-8 characters, lines, etc. All of these tell you when there was
10799 invalid UTF-8 or an I/O error.</p>
10800 <p>Rust lets you wrap a raw <code>char *</code> that you got from C into something
10801 that can later be validated and turned into a string. Anything that
10802 manipulates a raw pointer is <code>unsafe</code>; this includes the "wrap me this
10803 pointer into a C string abstraction" API, and the "build me an array
10804 of bytes from this raw pointer" API. Later, you can validate <em>those</em>
10805 as UTF-8 and build real Rust strings — or know if the validation
10806 failed.</p>
10807 <p>Rust builds these little "corridors" through the API so that illegal
10808 states are unrepresentable.</p></content><category term="misc"></category><category term="rust"></category></entry><entry><title>GUADEC 2017 presentation</title><link href="https://people.gnome.org/~federico/blog/guadec-2017.html" rel="alternate"></link><published>2017-08-09T20:20:49-05:00</published><updated>2017-08-09T20:20:52-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-08-09:/~federico/blog/guadec-2017.html</id><summary type="html"><p>During GUADEC this year I gave a presentation
10809 called
10810 <a href="https://people.gnome.org/~federico/blog/docs/fmq-porting-c-to-rust.pdf">Replacing C library code with Rust: what I learned with librsvg</a>.
10811 This is the PDF file; be sure to scroll past the full-page
10812 presentation pages until you reach the speaker's notes, especially for
10813 the code sections!</p>
10814 <p><a href="https://people.gnome.org/~federico/blog/docs/fmq-porting-c-to-rust.pdf"><img alt="Replacing C library code with Rust - link to PDF" src="https://people.gnome.org/~federico/blog/images/fmq-porting-c-to-rust.png"></a></p>
10815 <p>You can also get the …</p></summary><content type="html"><p>During GUADEC this year I gave a presentation
10816 called
10817 <a href="https://people.gnome.org/~federico/blog/docs/fmq-porting-c-to-rust.pdf">Replacing C library code with Rust: what I learned with librsvg</a>.
10818 This is the PDF file; be sure to scroll past the full-page
10819 presentation pages until you reach the speaker's notes, especially for
10820 the code sections!</p>
10821 <p><a href="https://people.gnome.org/~federico/blog/docs/fmq-porting-c-to-rust.pdf"><img alt="Replacing C library code with Rust - link to PDF" src="https://people.gnome.org/~federico/blog/images/fmq-porting-c-to-rust.png"></a></p>
10822 <p>You can also get the <a href="https://people.gnome.org/~federico/blog/docs/fmq-porting-c-to-rust.odp">ODP file</a> for the presentation. This is
10823 released under a <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC-BY-SA license</a>.</p>
10824 <p>For the presentation, my daughter Luciana made some drawings of
10825 Ferris, the Rust mascot, also released under the same license:</p>
10826 <p><a href="https://people.gnome.org/~federico/blog/docs/ferris-1.png"><img alt="Ferris says hi" src="https://people.gnome.org/~federico/blog/docs/ferris-1-thumb.png"></a>
10827 <a href="https://people.gnome.org/~federico/blog/docs/ferris-2.png"><img alt="Ferris busy at work" src="https://people.gnome.org/~federico/blog/docs/ferris-2-thumb.png"></a>
10828 <a href="https://people.gnome.org/~federico/blog/docs/ferris-3.png"><img alt="Ferris makes a mess" src="https://people.gnome.org/~federico/blog/docs/ferris-3-thumb.png"></a>
10829 <a href="https://people.gnome.org/~federico/blog/docs/ferris-4.png"><img alt="Ferris presents her work" src="https://people.gnome.org/~federico/blog/docs/ferris-4-thumb.png"></a></p></content><category term="misc"></category><category term="gnome"></category><category term="guadec"></category><category term="librsvg"></category><category term="rust"></category><category term="talks"></category></entry><entry><title>Surviving a rust-cssparser API break</title><link href="https://people.gnome.org/~federico/blog/surviving-rust-cssparser-api-break.html" rel="alternate"></link><published>2017-08-01T06:02:30-05:00</published><updated>2017-08-01T06:02:37-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-08-01:/~federico/blog/surviving-rust-cssparser-api-break.html</id><summary type="html"><p>Yesterday I looked into updating librsvg's Rust dependencies. There have
10830 been some API breaks (!!!) in the unstable libraries that it uses
10831 since the last time I locked them. This post is about an interesting
10832 case of API breakage.</p>
10833 <p><a href="https://github.com/servo/rust-cssparser">rust-cssparser</a> is the crate that <a href="https://servo.org/">Servo</a> uses for parsing
10834 CSS. Well, more …</p></summary><content type="html"><p>Yesterday I looked into updating librsvg's Rust dependencies. There have
10835 been some API breaks (!!!) in the unstable libraries that it uses
10836 since the last time I locked them. This post is about an interesting
10837 case of API breakage.</p>
10838 <p><a href="https://github.com/servo/rust-cssparser">rust-cssparser</a> is the crate that <a href="https://servo.org/">Servo</a> uses for parsing
10839 CSS. Well, more like <em>tokenizing</em> CSS: you give it a string, it
10840 gives you back tokens, and you are supposed to compose CSS selector
10841 information or other CSS values from the tokens.</p>
10842 <p>Librsvg uses rust-cssparser now for most of the micro-languages in
10843 SVG's attribute values, instead of its old, fragile C parsers. I hope
10844 to be able to use it in conjunction with Servo's <a href="https://github.com/servo/servo/tree/master/components/selectors">rust-selectors</a>
10845 crate to fully parse CSS data and replace <a href="https://git.gnome.org/browse/libcroco">libcroco</a>.</p>
10846 <p>A few months ago, rust-cssparser's API looked more or less like the
10847 following. This is the old representation of a <code>Token</code>:</p>
10848 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Token</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10849 <span class="w"> </span><span class="c1">// an identifier</span>
10850 <span class="w"> </span><span class="n">Ident</span><span class="p">(</span><span class="n">Cow</span><span class="o">&lt;&#39;</span><span class="na">a</span><span class="p">,</span><span class="w"> </span><span class="kt">str</span><span class="o">&gt;</span><span class="p">),</span><span class="w"></span>
10851
10852 <span class="w"> </span><span class="c1">// a plain number</span>
10853 <span class="w"> </span><span class="n">Number</span><span class="p">(</span><span class="n">NumericValue</span><span class="p">),</span><span class="w"></span>
10854
10855 <span class="w"> </span><span class="c1">// a percentage value normalized to [0.0, 1.0]</span>
10856 <span class="w"> </span><span class="n">Percentage</span><span class="p">(</span><span class="n">PercentageValue</span><span class="p">),</span><span class="w"></span>
10857
10858 <span class="w"> </span><span class="n">WhiteSpace</span><span class="p">(</span><span class="o">&amp;&#39;</span><span class="na">a</span><span class="w"> </span><span class="kt">str</span><span class="p">),</span><span class="w"></span>
10859 <span class="w"> </span><span class="n">Comma</span><span class="p">,</span><span class="w"></span>
10860
10861 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
10862 <span class="p">}</span><span class="w"></span>
10863 </code></pre></div>
10864
10865 <p>That is, a <code>Token</code> can be an <code>Ident</code>ifier with a string name, or a
10866 <code>Number</code>, a <code>Percentage</code>, whitespace, a comma, and many others.</p>
10867 <p>On top of that is the old API for a <code>Parser</code>, which you construct with
10868 a string and then it gives you back tokens:</p>
10869 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Parser</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10870 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">new</span><span class="p">(</span><span class="n">input</span>: <span class="kp">&amp;</span><span class="o">&#39;</span><span class="na">i</span> <span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Parser</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="p">,</span><span class="w"> </span><span class="o">&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10871
10872 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">Token</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
10873
10874 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
10875 <span class="p">}</span><span class="w"></span>
10876 </code></pre></div>
10877
10878 <p>This means the following. You create the parser out of a string slice
10879 with <code>new()</code>. Each call to <code>next()</code> then returns a <code>Result</code> holding either a <code>Token</code>
10880 on success or an empty error value on failure. The parser uses a lifetime
10881 <code>'i</code> on the string from which it is constructed: the <code>Token</code>s that
10882 return identifiers, for example, could return sub-string slices that
10883 come from the original string, and the parser has to be marked with a
10884 lifetime so that it does not outlive its underlying string.</p>
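<p>To illustrate the lifetime relationship, here is a toy tokenizer. It is <em>not</em> rust-cssparser's code: the <code>Token</code> and <code>Parser</code> types below are simplified stand-ins written for this example, and they handle only ASCII identifiers and commas. Identifier tokens are slices of the original input, so they carry the same <code>'i</code> lifetime as the parser's input.</p>

```rust
/// Toy token type (not rust-cssparser's): identifiers borrow from the input.
#[derive(Debug, PartialEq)]
enum Token<'i> {
    Ident(&'i str),
    Comma,
}

/// Toy parser that borrows its input string for the lifetime `'i`.
struct Parser<'i> {
    rest: &'i str,
}

impl<'i> Parser<'i> {
    fn new(input: &'i str) -> Parser<'i> {
        Parser { rest: input }
    }

    /// Returns the next token, or `Err(())` at end of input or on an
    /// unexpected character.
    fn next(&mut self) -> Result<Token<'i>, ()> {
        self.rest = self.rest.trim_start();
        let first = self.rest.chars().next().ok_or(())?;
        if first == ',' {
            self.rest = &self.rest[1..];
            return Ok(Token::Comma);
        }
        // An identifier is a run of ASCII letters.
        let end = self
            .rest
            .find(|c: char| !c.is_ascii_alphabetic())
            .unwrap_or(self.rest.len());
        if end == 0 {
            return Err(());
        }
        let (ident, rest) = self.rest.split_at(end);
        self.rest = rest;
        Ok(Token::Ident(ident))
    }
}

fn main() {
    let input = String::from("deg, rad");
    let first = {
        let mut parser = Parser::new(&input);
        parser.next()
    }; // the parser is gone, but the token still borrows from `input`
    println!("{:?}", first);
}
```

<p>Dropping <code>input</code> while <code>first</code> is still in use would be rejected by the compiler, which is the guarantee the lifetime annotation buys.</p>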
10885 <p>A few commits later, rust-cssparser got changed to return detailed
10886 error values, so that instead of <code>()</code> you get a <code>BasicParseError</code>
10887 with sub-cases like <code>UnexpectedToken</code> or <code>EndOfInput</code>.</p>
10888 <p>After the changes to the error values for results, I didn't pay much
10889 attention to rust-cssparser for a while. Yesterday I wanted to update
10890 librsvg to use the newest rust-cssparser, and had some interesting
10891 problems.</p>
10892 <p>First, <code>Parser::new()</code> was changed from taking just a <code>&amp;str</code> slice to
10893 taking a <code>ParserInput</code> struct. This is an implementation detail which
10894 lets the parser cache the last token it saw. Not a big deal:</p>
10895 <div class="highlight"><pre><span></span><code><span class="c1">// instead of constructing a parser like</span>
10896 <span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">parser</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Parser</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="n">my_string</span><span class="p">);</span><span class="w"></span>
10897
10898 <span class="c1">// you now construct it like</span>
10899 <span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">input</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ParserInput</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="n">my_string</span><span class="p">);</span><span class="w"></span>
10900 <span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">parser</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Parser</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">input</span><span class="p">);</span><span class="w"></span>
10901 </code></pre></div>
10902
10903 <p>I am not completely sure why this is exposed to the public API, since
10904 Rust won't allow you to have two mutable references to a
10905 <code>ParserInput</code>, and the only consumer of a (mutable) <code>ParserInput</code> is
10906 the <code>Parser</code>, anyway.</p>
10907 <p>However, the <code>parser.next()</code> function changed:</p>
10908 <div class="highlight"><pre><span></span><code><span class="c1">// old version</span>
10909 <span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">Token</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
10910
10911 <span class="c1">// new version</span>
10912 <span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;&amp;</span><span class="n">Token</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">BasicParseError</span><span class="o">&lt;&#39;</span><span class="na">i</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"></span>
10913 <span class="c1">// note this bad boy here -------^</span>
10914 </code></pre></div>
10915
10916 <p>The successful <code>Result</code> from <code>next()</code> is now a <em>reference</em> to a
10917 <code>Token</code>, not a plain <code>Token</code> value which you now own. The parser is
10918 giving you a borrowed reference to its internally-cached token.</p>
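<p>What a borrowed token means for calling code can be sketched with a toy parser. Again, this is not rust-cssparser itself: <code>Token</code>, <code>Parser</code>, and <code>parse_single_number</code> are simplified, hypothetical stand-ins. While the reference returned by <code>next()</code> is alive, the parser stays mutably borrowed, so the caller copies out what it needs before touching the parser again.</p>

```rust
/// Toy token type, not rust-cssparser's.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Number(f64),
    Comma,
}

/// Toy parser that caches the last token and hands out references to it.
struct Parser {
    pending: Vec<Token>,   // tokens not yet handed out, stored in reverse
    cached: Option<Token>, // the token most recently returned
}

impl Parser {
    fn new(mut tokens: Vec<Token>) -> Parser {
        tokens.reverse(); // so that `pop()` yields tokens in order
        Parser { pending: tokens, cached: None }
    }

    /// Returns a reference to the internally cached token.
    fn next(&mut self) -> Result<&Token, ()> {
        let token = self.pending.pop().ok_or(())?;
        self.cached = Some(token);
        self.cached.as_ref().ok_or(())
    }

    fn expect_exhausted(&mut self) -> Result<(), ()> {
        if self.pending.is_empty() { Ok(()) } else { Err(()) }
    }
}

/// Parses exactly one number token and nothing else.
fn parse_single_number(parser: &mut Parser) -> Result<f64, ()> {
    // Copy the `f64` out of the borrowed token; the borrow of `parser`
    // ends with the `match`, so the parser can be used again below.
    let value = match parser.next()? {
        Token::Number(v) => *v,
        _ => return Err(()),
    };
    parser.expect_exhausted()?;
    Ok(value)
}

fn main() {
    let mut ok = Parser::new(vec![Token::Number(45.0)]);
    println!("{:?}", parse_single_number(&mut ok));

    let mut trailing = Parser::new(vec![Token::Number(1.5), Token::Comma]);
    println!("{:?}", parse_single_number(&mut trailing));
}
```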
10919 <p>My parsing functions for the old API looked similar to the
10920 following. This is a function that parses a string into an angle; it
10921 can look like <code>"45deg"</code> or <code>"1.5rad"</code>, for example.</p>
10922 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
10923 <span class="normal"> 2</span>
10924 <span class="normal"> 3</span>
10925 <span class="normal"> 4</span>
10926 <span class="normal"> 5</span>
10927 <span class="normal"> 6</span>
10928 <span class="normal"> 7</span>
10929 <span class="normal"> 8</span>
10930 <span class="normal"> 9</span>
10931 <span class="normal">10</span>
10932 <span class="normal">11</span>
10933 <span class="normal">12</span>
10934 <span class="normal">13</span>
10935 <span class="normal">14</span>
10936 <span class="normal">15</span>
10937 <span class="normal">16</span>
10938 <span class="normal">17</span>
10939 <span class="normal">18</span>
10940 <span class="normal">19</span>
10941 <span class="normal">20</span>
10942 <span class="normal">21</span>
10943 <span class="normal">22</span>
10944 <span class="normal">23</span>
10945 <span class="normal">24</span>
10946 <span class="normal">25</span>
10947 <span class="normal">26</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">parse_angle_degrees</span><span class="w"> </span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span> <span class="o">&lt;</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">ParseError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10948 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">parser</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Parser</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="n">s</span><span class="p">);</span><span class="w"></span>
10949
10950 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">token</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parser</span><span class="p">.</span><span class="n">next</span><span class="w"> </span><span class="p">()</span><span class="w"></span>
10951 <span class="w"> </span><span class="p">.</span><span class="n">map_err</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
10952
10953 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">token</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10954 <span class="w"> </span><span class="n">Token</span>::<span class="n">Number</span><span class="w"> </span><span class="p">(</span><span class="n">NumericValue</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">})</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="w"> </span><span class="p">(</span><span class="n">value</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">),</span><span class="w"></span>
10955
10956 <span class="w"> </span><span class="n">Token</span>::<span class="n">Dimension</span><span class="w"> </span><span class="p">(</span><span class="n">NumericValue</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">unit</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10957 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">;</span><span class="w"></span>
10958
10959 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">unit</span><span class="p">.</span><span class="n">as_ref</span><span class="w"> </span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10960 <span class="w"> </span><span class="s">&quot;deg&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="w"> </span><span class="p">(</span><span class="n">value</span><span class="p">),</span><span class="w"></span>
10961 <span class="w"> </span><span class="s">&quot;grad&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="w"> </span><span class="p">(</span><span class="n">value</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mf">360.0</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mf">400.0</span><span class="p">),</span><span class="w"></span>
10962 <span class="w"> </span><span class="s">&quot;rad&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Ok</span><span class="w"> </span><span class="p">(</span><span class="n">value</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mf">180.0</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">PI</span><span class="p">),</span><span class="w"></span>
10963 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Err</span><span class="w"> </span><span class="p">(</span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="w"></span>
10964 <span class="w"> </span><span class="p">}</span><span class="w"></span>
10965 <span class="w"> </span><span class="p">},</span><span class="w"></span>
10966
10967 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Err</span><span class="w"> </span><span class="p">(</span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="w"></span>
10968 <span class="w"> </span><span class="p">}.</span><span class="n">and_then</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">r</span><span class="o">|</span><span class="w"></span>
10969 <span class="w"> </span><span class="n">parser</span><span class="p">.</span><span class="n">expect_exhausted</span><span class="w"> </span><span class="p">()</span><span class="w"></span>
10970 <span class="w"> </span><span class="p">.</span><span class="n">map</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">r</span><span class="p">)</span><span class="w"></span>
10971 <span class="w"> </span><span class="p">.</span><span class="n">map_err</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">)))</span><span class="w"></span>
10972 <span class="p">}</span><span class="w"></span>
10973 </code></pre></div>
10974 </td></tr></table>
10975 <p>This is a bit ugly, but it was the first version that passed the
10976 tests. Lines 4 and 5 mean, "get the first token or return an error".
10977 Line 17 means, "anything except <code>deg</code>, <code>grad</code>, or <code>rad</code> for the units
10978 causes the <code>match</code> expression to generate an error". Finally, I was
10979 feeling very proud of using <code>and_then()</code> in line 22, with
10980 <code>parser.expect_exhausted()</code>, to ensure that the parser would not find
10981 any more tokens after the angle/units.</p>
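<p>As a rough, self-contained illustration of that chaining pattern, here
is a sketch that does not use rust-cssparser at all. The <code>ParseError</code>
struct, the free function <code>expect_exhausted()</code>, and <code>parse_degrees()</code>
are hypothetical stand-ins invented for this example; only the shape of the
<code>and_then()</code> / <code>map()</code> / <code>map_err()</code> chain mirrors the code above.</p>

```rust
// Toy stand-ins (hypothetical; these are NOT rust-cssparser types).
#[derive(Debug, PartialEq)]
struct ParseError(&'static str);

// Stands in for Parser::expect_exhausted(): succeeds only if nothing
// but whitespace is left over.
fn expect_exhausted(rest: &str) -> Result<(), ()> {
    if rest.trim().is_empty() {
        Ok(())
    } else {
        Err(())
    }
}

// Parse a leading number, then require that nothing else follows it.
fn parse_degrees(s: &str) -> Result<f64, ParseError> {
    let s = s.trim_start();

    // Split at the first whitespace: "45 junk" -> ("45", " junk").
    let (head, rest) = match s.find(char::is_whitespace) {
        Some(i) => s.split_at(i),
        None => (s, ""),
    };

    head.parse::<f64>()
        .map_err(|_| ParseError("expected angle"))
        // and_then() runs only if the first step succeeded.  If the
        // exhaustion check passes, map() hands back the parsed value `r`;
        // otherwise map_err() converts the failure into our error type.
        .and_then(|r| {
            expect_exhausted(rest)
                .map(|_| r)
                .map_err(|_| ParseError("expected angle"))
        })
}
```

<p>With these toy definitions, <code>parse_degrees("45")</code> succeeds, while
<code>parse_degrees("45 junk")</code> fails in the second step of the chain and
<code>parse_degrees("abc")</code> fails in the first.</p>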
10982 <p>However, in the new version of rust-cssparser, <code>Parser.next()</code> gives
10983 back a <code>Result</code> with a <code>&amp;Token</code> success value (a <em>reference</em> to a
10984 token), while the old version returned a plain <code>Token</code>. No problem,
10985 I thought, I'm just going to dereference the value in the <code>match</code> and
10986 be done with it:</p>
10987 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">token</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parser</span><span class="p">.</span><span class="n">next</span><span class="w"> </span><span class="p">()</span><span class="w"></span>
10988 <span class="w"> </span><span class="p">.</span><span class="n">map_err</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
10989
10990 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="o">*</span><span class="n">token</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10991 <span class="w"> </span><span class="c1">// ^ dereference here...</span>
10992 <span class="w"> </span><span class="n">Token</span>::<span class="n">Number</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">,</span><span class="w"></span>
10993
10994 <span class="w"> </span><span class="n">Token</span>::<span class="n">Dimension</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="k">ref</span><span class="w"> </span><span class="n">unit</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
10995 <span class="w"> </span><span class="c1">// ^ avoid moving the unit value</span>
10996 </code></pre></div>
10997
10998 <p>The compiler complained elsewhere. The whole function now looked like
10999 this:</p>
11000 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
11001 <span class="normal"> 2</span>
11002 <span class="normal"> 3</span>
11003 <span class="normal"> 4</span>
11004 <span class="normal"> 5</span>
11005 <span class="normal"> 6</span>
11006 <span class="normal"> 7</span>
11007 <span class="normal"> 8</span>
11008 <span class="normal"> 9</span>
11009 <span class="normal">10</span>
11010 <span class="normal">11</span>
11011 <span class="normal">12</span>
11012 <span class="normal">13</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">parse_angle_degrees</span><span class="w"> </span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span> <span class="o">&lt;</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">ParseError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11013 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">parser</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Parser</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="n">s</span><span class="p">);</span><span class="w"></span>
11014
11015 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">token</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parser</span><span class="p">.</span><span class="n">next</span><span class="w"> </span><span class="p">()</span><span class="w"></span>
11016 <span class="w"> </span><span class="p">.</span><span class="n">map_err</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
11017
11018 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">token</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11019 <span class="w"> </span><span class="c1">// ...</span>
11020 <span class="w"> </span><span class="p">}.</span><span class="n">and_then</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">r</span><span class="o">|</span><span class="w"></span>
11021 <span class="w"> </span><span class="n">parser</span><span class="p">.</span><span class="n">expect_exhausted</span><span class="w"> </span><span class="p">()</span><span class="w"></span>
11022 <span class="w"> </span><span class="p">.</span><span class="n">map</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">r</span><span class="p">)</span><span class="w"></span>
11023 <span class="w"> </span><span class="p">.</span><span class="n">map_err</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">)))</span><span class="w"></span>
11024 <span class="p">}</span><span class="w"></span>
11025 </code></pre></div>
11026 </td></tr></table>
11027 <p>But in line 4, <code>token</code> is now a reference to something that lives
11028 inside <code>parser</code>, and <code>parser</code> is therefore borrowed mutably. The
11029 compiler didn't like that line 10 (the call to
11030 <code>parser.expect_exhausted()</code>) was trying to borrow <code>parser</code> mutably
11031 again.</p>
11032 <p>I played a bit with creating a temporary scope around the assignment
11033 to <code>token</code> so that it would only borrow <code>parser</code> mutably inside that
11034 scope. Things ended up like this, without the call to <code>and_then()</code>
11035 after the <code>match</code>:</p>
11036 <table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
11037 <span class="normal"> 2</span>
11038 <span class="normal"> 3</span>
11039 <span class="normal"> 4</span>
11040 <span class="normal"> 5</span>
11041 <span class="normal"> 6</span>
11042 <span class="normal"> 7</span>
11043 <span class="normal"> 8</span>
11044 <span class="normal"> 9</span>
11045 <span class="normal">10</span>
11046 <span class="normal">11</span>
11047 <span class="normal">12</span>
11048 <span class="normal">13</span>
11049 <span class="normal">14</span>
11050 <span class="normal">15</span>
11051 <span class="normal">16</span>
11052 <span class="normal">17</span>
11053 <span class="normal">18</span>
11054 <span class="normal">19</span>
11055 <span class="normal">20</span>
11056 <span class="normal">21</span>
11057 <span class="normal">22</span>
11058 <span class="normal">23</span>
11059 <span class="normal">24</span>
11060 <span class="normal">25</span>
11061 <span class="normal">26</span>
11062 <span class="normal">27</span>
11063 <span class="normal">28</span>
11064 <span class="normal">29</span>
11065 <span class="normal">30</span></pre></div></td><td class="code"><div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">angle_degrees</span><span class="w"> </span><span class="p">(</span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span> <span class="o">&lt;</span><span class="kt">f64</span><span class="p">,</span><span class="w"> </span><span class="n">ParseError</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11066 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">input</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ParserInput</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="n">s</span><span class="p">);</span><span class="w"></span>
11067 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">parser</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Parser</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">input</span><span class="p">);</span><span class="w"></span>
11068
11069 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">angle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11070 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">token</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parser</span><span class="p">.</span><span class="n">next</span><span class="w"> </span><span class="p">()</span><span class="w"></span>
11071 <span class="w"> </span><span class="p">.</span><span class="n">map_err</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
11072
11073 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="o">*</span><span class="n">token</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11074 <span class="w"> </span><span class="n">Token</span>::<span class="n">Number</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">,</span><span class="w"></span>
11075
11076 <span class="w"> </span><span class="n">Token</span>::<span class="n">Dimension</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="k">ref</span><span class="w"> </span><span class="n">unit</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11077 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">;</span><span class="w"></span>
11078
11079 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">unit</span><span class="p">.</span><span class="n">as_ref</span><span class="w"> </span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11080 <span class="w"> </span><span class="s">&quot;deg&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"></span>
11081 <span class="w"> </span><span class="s">&quot;grad&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mf">360.0</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mf">400.0</span><span class="p">,</span><span class="w"></span>
11082 <span class="w"> </span><span class="s">&quot;rad&quot;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mf">180.0</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">PI</span><span class="p">,</span><span class="w"></span>
11083 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">Err</span><span class="w"> </span><span class="p">(</span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected &#39;deg&#39; | &#39;grad&#39; | &#39;rad&#39;&quot;</span><span class="p">))</span><span class="w"></span>
11084 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11085 <span class="w"> </span><span class="p">},</span><span class="w"></span>
11086
11087 <span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">Err</span><span class="w"> </span><span class="p">(</span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="w"></span>
11088 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11089 <span class="w"> </span><span class="p">};</span><span class="w"></span>
11090
11091 <span class="w"> </span><span class="n">parser</span><span class="p">.</span><span class="n">expect_exhausted</span><span class="w"> </span><span class="p">().</span><span class="n">map_err</span><span class="w"> </span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="n">ParseError</span>::<span class="n">new</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;expected angle&quot;</span><span class="p">))</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
11092
11093 <span class="w"> </span><span class="nb">Ok</span><span class="w"> </span><span class="p">(</span><span class="n">angle</span><span class="p">)</span><span class="w"></span>
11094 <span class="p">}</span><span class="w"></span>
11095 </code></pre></div>
11096 </td></tr></table>
11097 <p>Lines 5 through 25 are basically</p>
11098 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">angle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11099 <span class="w"> </span><span class="c1">// parse out the angle; return if error</span>
11100 <span class="w"> </span><span class="p">};</span><span class="w"></span>
11101 </code></pre></div>
11102
11103 <p>And after <em>that</em> is done, I test for <code>parser.expect_exhausted()</code>.
11104 There is no chaining of results with helper functions; instead it's
11105 just going through each token linearly.</p>
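<p>The same linear shape can be tried out without rust-cssparser. The
sketch below is only an approximation under stated assumptions:
<code>ToyParser</code> and this cut-down <code>Token</code> enum are hypothetical
types made up for the example, with a <code>next()</code> that hands out a
reference into the parser and an <code>expect_exhausted()</code> that needs
<code>&amp;mut self</code>. It does not try to reproduce the compiler error itself,
only the "scoped borrow, then check for leftovers" structure.</p>

```rust
use std::f64::consts::PI;

// Hypothetical toy types standing in for rust-cssparser's Token and Parser.
#[derive(Debug)]
enum Token {
    Number { value: f32 },
    Dimension { value: f32, unit: String },
}

#[derive(Debug, PartialEq)]
struct ParseError(&'static str);

struct ToyParser {
    tokens: Vec<Token>,
    pos: usize,
}

impl ToyParser {
    // Returns a reference that lives inside the parser, so the parser
    // stays mutably borrowed while the returned token is in use.
    fn next(&mut self) -> Result<&Token, ()> {
        let i = self.pos;
        if i < self.tokens.len() {
            self.pos += 1;
            Ok(&self.tokens[i])
        } else {
            Err(())
        }
    }

    fn expect_exhausted(&mut self) -> Result<(), ()> {
        if self.pos == self.tokens.len() {
            Ok(())
        } else {
            Err(())
        }
    }
}

fn angle_degrees(tokens: Vec<Token>) -> Result<f64, ParseError> {
    let mut parser = ToyParser { tokens, pos: 0 };

    // The borrow taken by next() is confined to this block; only a plain
    // f64 leaves it.
    let angle = {
        let token = parser.next().map_err(|_| ParseError("expected angle"))?;

        match *token {
            Token::Number { value } => f64::from(value),
            Token::Dimension { value, ref unit } => {
                let value = f64::from(value);
                match unit.as_str() {
                    "deg" => value,
                    "grad" => value * 360.0 / 400.0,
                    "rad" => value * 180.0 / PI,
                    _ => return Err(ParseError("expected 'deg' | 'grad' | 'rad'")),
                }
            }
        }
    };

    // The parser is free again here, so borrowing it mutably is fine.
    parser
        .expect_exhausted()
        .map_err(|_| ParseError("expected angle"))?;

    Ok(angle)
}
```

<p>Feeding it a single <code>Dimension</code> token of 200 <code>grad</code> yields 180
degrees; feeding it two tokens makes the final exhaustion check fail.</p>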
11106 <p>The API break was annoying to deal with, but fortunately the calling
11107 code ended up cleaner, and I didn't have to change anything in the
11108 tests. I hope rust-cssparser can stabilize its API for consumers that
11109 are not Servo.</p></content><category term="misc"></category><category term="rust"></category><category term="gnome"></category><category term="librsvg"></category></entry><entry><title>Legacy Systems as Old Cities</title><link href="https://people.gnome.org/~federico/blog/legacy-systems-as-old-cities.html" rel="alternate"></link><published>2017-06-28T21:26:00-05:00</published><updated>2017-07-10T22:40:24-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-06-28:/~federico/blog/legacy-systems-as-old-cities.html</id><summary type="html"><p><em>I just realized that I only tweeted about this a couple of months ago,
11110 but never blogged about it. Shame on me!</em></p>
11111 <p>I wrote an article, <a href="https://recompilermag.com/issues/issue-4/legacy-systems-as-old-cities/">Legacy Systems as Old Cities</a>, for The
11112 Recompiler magazine. Is GNOME, now at 20 years old, legacy software?
11113 Is it different from mainframe software …</p></summary><content type="html"><p><em>I just realized that I only tweeted about this a couple of months ago,
11114 but never blogged about it. Shame on me!</em></p>
11115 <p>I wrote an article, <a href="https://recompilermag.com/issues/issue-4/legacy-systems-as-old-cities/">Legacy Systems as Old Cities</a>, for The
11116 Recompiler magazine. Is GNOME, now at 20 years old, legacy software?
11117 Is it different from mainframe software because "everyone" can change
11118 it? Does long-lived software have the same patterns of change as
11119 cities and physical artifacts? Can we learn from the building trades
11120 and urbanism for maintaining software in the long term? <em>Could we
11121 turn legacy software into a good legacy?</em></p>
11122 <p>You can read the article <a href="https://recompilermag.com/issues/issue-4/legacy-systems-as-old-cities/">here</a>.</p>
11123 <p>Also, let me take this opportunity to recommend <a href="https://recompilermag.com">The Recompiler</a>
11124 magazine. It is the most enjoyable technical publication I read.
11125 Their <a href="https://recompilermag.com/podcast/">podcast</a> is also excellent!</p>
11126 <p><strong>Update 2017/07/10</strong> - Spanish version of the article, <a href="legacy-systems-as-old-cities-es.html">Los Sistemas Heredados como Ciudades Viejas</a></p></content><category term="misc"></category><category term="recompiler"></category><category term="gnome"></category><category term="urbanism"></category></entry><entry><title>Setting Alt-Tab behavior in gnome-shell</title><link href="https://people.gnome.org/~federico/blog/alt-tab.html" rel="alternate"></link><published>2017-06-22T10:25:02-05:00</published><updated>2017-06-22T10:25:06-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-06-22:/~federico/blog/alt-tab.html</id><summary type="html"><p>After updating my distro a few months ago, I somehow lost my tweaks to
11127 the Alt-Tab behavior in gnome-shell.</p>
11128 <p>The default is to have <code>Alt-Tab</code> switch you between applications in the
11129 current workspace. One can use <code>Alt-backtick</code> (or whatever key you
11130 have above Tab) to switch between windows in the …</p></summary><content type="html"><p>After updating my distro a few months ago, I somehow lost my tweaks to
11131 the Alt-Tab behavior in gnome-shell.</p>
11132 <p>The default is to have <code>Alt-Tab</code> switch you between applications in the
11133 current workspace. One can use <code>Alt-backtick</code> (or whatever key you
11134 have above Tab) to switch between windows in the current application.</p>
11135 <p>I prefer a Windows-like setup, where <code>Alt-Tab</code> switches between
11136 windows in the current workspace, regardless of the application to
11137 which they belong.</p>
11138 <p>Many moons ago there was a gnome-shell extension to change this
11139 behavior, but these days (GNOME 3.24) it can be done without
11140 extensions. It is a bit convoluted.</p>
11141 <h1>With the GUI</h1>
11142 <p>If you are using X instead of Wayland, this works:</p>
11143 <ol>
11144 <li>
11145 <p>Unset the <strong>Switch applications</strong> command. To do this, run
11146 <code>gnome-control-center</code>, go to <em>Keyboard</em>, and find the <em>Switch
11147 applications</em> command. Click on it, and hit <code>Backspace</code> in the
11148 dialog that prompts you for the keyboard shortcut. Click on the
11149 <em>Set</em> button.</p>
11150 </li>
11151 <li>
11152 <p>Set the <strong>Switch windows</strong> command. While still in the
11153 <em>Keyboard</em> settings, find the <em>Switch windows</em> command. Click on
11154 it, and hit <code>Alt-Tab</code>. Click <em>Set</em>.</p>
11155 </li>
11156 </ol>
11157 <p>That should be all you need, unless you are in Wayland. In that case,
11158 you need to do it on the command line.</p>
11159 <h1>With the command line, or in Wayland</h1>
11160 <p>The kind people on <a href="irc://irc.gnome.org/#gnome-hackers"><code>#gnome-hackers</code></a> tell me that as of GNOME
11161 3.24, changing <code>Alt-Tab</code> doesn't work on Wayland as in (2) above,
11162 because the compositor captures the <code>Alt-Tab</code> key when you type it
11163 inside the dialog that prompts you for a keyboard shortcut. In that
11164 case, you have to change the configuration keys directly instead of
11165 using the GUI:</p>
11166 <div class="highlight"><pre><span></span><code>gsettings <span class="nb">set</span> org.gnome.desktop.wm.keybindings switch-applications <span class="s2">&quot;[]&quot;</span>
11167 gsettings <span class="nb">set</span> org.gnome.desktop.wm.keybindings switch-applications-backward <span class="s2">&quot;[]&quot;</span>
11168 gsettings <span class="nb">set</span> org.gnome.desktop.wm.keybindings switch-windows <span class="s2">&quot;[&#39;&lt;Alt&gt;Tab&#39;, &#39;&lt;Super&gt;Tab&#39;]&quot;</span>
11169 gsettings <span class="nb">set</span> org.gnome.desktop.wm.keybindings switch-windows-backward <span class="s2">&quot;[&#39;&lt;Alt&gt;&lt;Shift&gt;Tab&#39;, &#39;&lt;Super&gt;&lt;Shift&gt;Tab&#39;]&quot;</span>
11170 </code></pre></div>
11171
11172 <p>Of course, the above works in X, too.</p>
11173 <h1>Changing windows across all workspaces</h1>
11174 <p>If you'd like to switch between windows in all workspaces, rather than
11175 in the current workspace, find the <code>org.gnome.shell.window-switcher
11176 current-workspace-only</code> GSettings key and set it to <code>false</code>. You can do this
11177 in <code>dconf-editor</code>, or on the command line with</p>
11178 <div class="highlight"><pre><span></span><code>gsettings <span class="nb">set</span> org.gnome.shell.window-switcher current-workspace-only <span class="nb">false</span>
11179 </code></pre></div></content><category term="misc"></category><category term="gnome"></category><category term="gnome-shell"></category></entry><entry><title>Exploring Rust's standard library: system calls and errors</title><link href="https://people.gnome.org/~federico/blog/rust-libstd-syscalls-and-errors.html" rel="alternate"></link><published>2017-06-12T10:55:26-05:00</published><updated>2017-06-12T17:11:52-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-06-12:/~federico/blog/rust-libstd-syscalls-and-errors.html</id><summary type="html"><p>In this post I'll show you the code path that Rust takes inside its
11180 standard library when you open a file. I wanted to learn how Rust
11181 handles system calls and <code>errno</code>, and all the little subtleties of the
11182 POSIX API. This is what I learned!</p>
11183 <h1>The C side of …</h1></summary><content type="html"><p>In this post I'll show you the code path that Rust takes inside its
11184 standard library when you open a file. I wanted to learn how Rust
11185 handles system calls and <code>errno</code>, and all the little subtleties of the
11186 POSIX API. This is what I learned!</p>
11187 <h1>The C side of things</h1>
11188 <p>When you open a file, or create a socket, or do anything else that
11189 returns an object that can be accessed like a file, you get a <em>file
11190 descriptor</em> in the form of an <code>int</code>.</p>
11191 <div class="highlight"><pre><span></span><code><span class="cm">/* All of these return an int with a file descriptor, or</span>
11192 <span class="cm"> * -1 in case of error.</span>
11193 <span class="cm"> */</span>
11194 <span class="kt">int</span> <span class="nf">open</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">pathname</span><span class="p">,</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">,</span> <span class="p">...);</span>
11195 <span class="kt">int</span> <span class="nf">socket</span><span class="p">(</span><span class="kt">int</span> <span class="n">domain</span><span class="p">,</span> <span class="kt">int</span> <span class="n">type</span><span class="p">,</span> <span class="kt">int</span> <span class="n">protocol</span><span class="p">);</span>
11196 </code></pre></div>
11197
11198 <p>You get a nonnegative integer in case of success, or -1 in case of an
11199 error. If there's an error, you look at <code>errno</code>, which gives you an
11200 integer error code. </p>
11201 <div class="highlight"><pre><span></span><code><span class="kt">int</span> <span class="n">fd</span><span class="p">;</span>
11202
11203 <span class="nl">retry_open</span><span class="p">:</span>
11204 <span class="n">fd</span> <span class="o">=</span> <span class="n">open</span> <span class="p">(</span><span class="s">&quot;/foo/bar/baz.txt&quot;</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
11205 <span class="k">if</span> <span class="p">(</span><span class="n">fd</span> <span class="o">==</span> <span class="mi">-1</span><span class="p">)</span> <span class="p">{</span>
11206 <span class="k">if</span> <span class="p">(</span><span class="n">errno</span> <span class="o">==</span> <span class="n">ENOENT</span><span class="p">)</span> <span class="p">{</span>
11207 <span class="cm">/* File doesn&#39;t exist */</span>
11208 <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">errno</span> <span class="o">==</span> <span class="p">...)</span> <span class="p">{</span>
11209 <span class="p">...</span>
11210 <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">errno</span> <span class="o">==</span> <span class="n">EINTR</span><span class="p">)</span> <span class="p">{</span>
11211 <span class="k">goto</span> <span class="n">retry_open</span><span class="p">;</span> <span class="cm">/* interrupted system call; let&#39;s retry */</span>
11212 <span class="p">}</span>
11213 <span class="p">}</span>
11214 </code></pre></div>
11215
11216 <p>Many system calls can fail with <code>EINTR</code>, "interrupted system
11217 call": <em>something</em> interrupted the kernel while it
11218 was doing your system call, and it returned control to userspace with
11219 the syscall unfinished. For example, your process may have received a
11220 Unix signal (e.g. you send it <code>SIGTSTP</code> by pressing Ctrl-Z on a
11221 terminal, or you resized the terminal and your process got a
11222 <code>SIGWINCH</code>). Most of the time <code>EINTR</code> simply means that you must
11223 retry the operation: if you Ctrl-Z a program to suspend it, then
11224 <code>fg</code> to continue it, and the program was in the middle
11225 of <code>open()</code>ing a file, you would expect it to continue at that exact
11226 point and to actually open the file. Software that doesn't check for
11227 <code>EINTR</code> can fail in very subtle ways!</p>
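<p>For comparison, here is roughly what the same retry-on-<code>EINTR</code> idea
looks like in Rust, where an interrupted call shows up as an
<code>io::Error</code> whose kind is <code>ErrorKind::Interrupted</code>. This is a
minimal sketch, not the standard library's own code:
<code>read_retrying()</code> and the <code>InterruptedOnce</code> reader are hypothetical
names made up here, and the reader only simulates a single interruption so
the loop has something to retry.</p>

```rust
use std::io::{self, ErrorKind, Read};

// Keep calling read() until it either succeeds or fails with something
// other than ErrorKind::Interrupted (the kind that corresponds to EINTR).
fn read_retrying<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    loop {
        match reader.read(buf) {
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            other => return other,
        }
    }
}

// Hypothetical reader that reports one simulated interruption before
// delivering any data.
struct InterruptedOnce<'a> {
    data: &'a [u8],
    interrupted: bool,
}

impl<'a> Read for InterruptedOnce<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.interrupted {
            self.interrupted = true;
            return Err(io::Error::new(
                ErrorKind::Interrupted,
                "simulated interrupted system call",
            ));
        }
        // &[u8] implements Read and advances itself past the bytes read.
        self.data.read(buf)
    }
}
```

<p>Calling <code>read_retrying()</code> on an <code>InterruptedOnce</code> wrapping
<code>b"hello"</code> swallows the first, interrupted attempt and returns the five
bytes from the second one.</p>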
11228 <p>Once you have an open file descriptor, you can read from it:</p>
11229 <div class="highlight"><pre><span></span><code><span class="kt">ssize_t</span>
11230 <span class="nf">read_five_bytes</span> <span class="p">(</span><span class="kt">int</span> <span class="n">fd</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">buf</span><span class="p">)</span>
11231 <span class="p">{</span>
11232 <span class="kt">ssize_t</span> <span class="n">result</span><span class="p">;</span>
11233
11234 <span class="nl">retry</span><span class="p">:</span>
11235 <span class="n">result</span> <span class="o">=</span> <span class="n">read</span> <span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="mi">5</span><span class="p">);</span>
11236 <span class="k">if</span> <span class="p">(</span><span class="n">result</span> <span class="o">==</span> <span class="mi">-1</span><span class="p">)</span> <span class="p">{</span>
11237 <span class="k">if</span> <span class="p">(</span><span class="n">errno</span> <span class="o">==</span> <span class="n">EINTR</span><span class="p">)</span> <span class="p">{</span>
11238 <span class="k">goto</span> <span class="n">retry</span><span class="p">;</span>
11239 <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">-1</span><span class="p">;</span> <span class="cm">/* the caller should check errno */</span>
11241 <span class="p">}</span>
11242 <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
11243 <span class="k">return</span> <span class="n">result</span><span class="p">;</span> <span class="cm">/* success */</span>
11244 <span class="p">}</span>
11245 <span class="p">}</span>
11246 </code></pre></div>
11247
<p>... and one has to remember that if <code>read()</code> returns 0, it means we
were at the end-of-file; if it returns fewer bytes than the number
requested, it may mean we were close to the end of the file; and if this is a
nonblocking socket and the call fails with <code>EWOULDBLOCK</code> or <code>EAGAIN</code>, then one
must decide whether to retry the operation immediately or to wait and try again
later.</p>
11254 <p>There is a lot of buggy software written in C that tries to use the
11255 POSIX API directly, and gets these subtleties wrong. Most programs
11256 written in high-level languages use the I/O facilities provided by
11257 their language, which hopefully make things easier.</p>
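<p>As a rough illustration of what such facilities have to do, here is a
minimal sketch in Rust (the language the rest of this post looks at) of the
bookkeeping described above: treat a return of 0 as end-of-file, accept short
reads, and retry when the call was interrupted. The name <code>read_up_to</code>
is a hypothetical helper invented for this example; it is not part of any
library.</p>

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper: fill `buf` as far as possible.
/// Returns the number of bytes actually read, which is smaller than
/// `buf.len()` only if end-of-file was reached first.
fn read_up_to<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            // 0 bytes means end-of-file: stop and report what we have.
            Ok(0) => break,
            // A short read is not an error: keep going from where we are.
            Ok(n) => filled += n,
            // The Rust view of EINTR: just retry the call.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            // Anything else is a real error for the caller to handle.
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

<p>Anything that implements <code>Read</code> works as the source here,
including a <code>File</code> or a plain byte slice.</p>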
11258 <h1>I/O in Rust</h1>
11259 <p>Rust makes <a href="https://doc.rust-lang.org/book/first-edition/error-handling.html">error handling</a> convenient and safe. If you decide to
11260 ignore an error, the code <em>looks</em> like it is ignoring the error
11261 (e.g. you can grep for <code>unwrap()</code> and find lazy code). The
11262 code actually <em>looks better</em> if it doesn't ignore the error and
11263 properly propagates it upstream (e.g. you can use the <code>?</code> shortcut to
11264 propagate errors to the calling function).</p>
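<p>To make the contrast concrete, here is a small sketch of both styles.
The function names are hypothetical and exist only for this example; the
calls they use (<code>File::open()</code> and <code>read_to_string()</code>) are
from the standard library.</p>

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper: read a whole file into a `String`.
/// Each `?` returns early with the `io::Error` if that step failed.
fn read_file_propagating(path: &Path) -> io::Result<String> {
    let mut file = File::open(path)?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    Ok(contents)
}

/// The "lazy" variant: it compiles, but it panics on any I/O error,
/// and every `unwrap()` is easy to find with grep.
fn read_file_or_panic(path: &Path) -> String {
    let mut file = File::open(path).unwrap();
    let mut contents = String::new();
    file.read_to_string(&mut contents).unwrap();
    contents
}
```

<p>The first version leaves the decision about what to do with a failure
to the caller; the second one makes that decision, badly, on the caller's
behalf.</p>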
11265 <p>I keep recommending <a href="http://joeduffyblog.com/2016/02/07/the-error-model/">this article on error models</a> to people; it
11266 discusses POSIX-like error codes vs. exceptions vs. more modern
11267 approaches like Haskell's and Rust's - definitely worth studying over
a few days (also, see Miguel's valiant effort to <a href="https://github.com/migueldeicaza/NStack">move C# I/O away
11269 from exceptions for I/O errors</a>).</p>
11270 <p>So, what happens when one opens a file in Rust, from the toplevel API
11271 down to the system calls? Let's go down the rabbit hole.</p>
11272 <p>You can open a file like this:</p>
11273 <div class="highlight"><pre><span></span><code><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">fs</span>::<span class="n">File</span><span class="p">;</span><span class="w"></span>
11274
11275 <span class="k">fn</span> <span class="nf">main</span><span class="w"> </span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11276 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">File</span>::<span class="n">open</span><span class="w"> </span><span class="p">(</span><span class="s">&quot;foo.txt&quot;</span><span class="p">);</span><span class="w"></span>
11277 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
11278 <span class="p">}</span><span class="w"></span>
11279 </code></pre></div>
11280
<p>This does <em>not</em> give you a raw file descriptor; it gives you an
<code>io::Result&lt;fs::File&gt;</code> (shorthand for <code>Result&lt;fs::File, io::Error&gt;</code>), which you must pick apart to see if
you actually got back a <code>File</code> that you can operate on, or an error.</p>
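<p>One way to pick the result apart is a <code>match</code>. The sketch below
is only an illustration: <code>describe_open</code> is a hypothetical function
made up for this example, and the messages it builds are arbitrary.</p>

```rust
use std::fs::File;
use std::io::ErrorKind;

/// Hypothetical helper: try to open `path` and describe what happened.
fn describe_open(path: &str) -> String {
    match File::open(path) {
        // Success: we own a `File`, which is closed when it goes out of scope.
        Ok(_file) => format!("opened {}", path),
        // A specific kind of failure can be singled out with `kind()`.
        Err(e) if e.kind() == ErrorKind::NotFound => format!("{} does not exist", path),
        // Everything else: report the error's own message.
        Err(e) => format!("could not open {}: {}", path, e),
    }
}
```

<p>The compiler will not let you use the <code>File</code> without going
through one of these arms (or an equivalent such as <code>?</code>), so the
error case cannot be forgotten by accident.</p>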
11284 <p>Let's look at the <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/fs.rs#L235">implementation of <code>File::open()</code> and <code>File::create()</code></a>.</p>
11285 <div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">File</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11286 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">open</span><span class="o">&lt;</span><span class="n">P</span>: <span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">Path</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">path</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">File</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11287 <span class="w"> </span><span class="n">OpenOptions</span>::<span class="n">new</span><span class="p">().</span><span class="n">read</span><span class="p">(</span><span class="kc">true</span><span class="p">).</span><span class="n">open</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">as_ref</span><span class="p">())</span><span class="w"></span>
11288 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11289
11290 <span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">create</span><span class="o">&lt;</span><span class="n">P</span>: <span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">Path</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">path</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">File</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11291 <span class="w"> </span><span class="n">OpenOptions</span>::<span class="n">new</span><span class="p">().</span><span class="n">write</span><span class="p">(</span><span class="kc">true</span><span class="p">).</span><span class="n">create</span><span class="p">(</span><span class="kc">true</span><span class="p">).</span><span class="n">truncate</span><span class="p">(</span><span class="kc">true</span><span class="p">).</span><span class="n">open</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">as_ref</span><span class="p">())</span><span class="w"></span>
11292 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11293 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
11294 <span class="p">}</span><span class="w"></span>
11295 </code></pre></div>
11296
11297 <p>Here, <code>OpenOptions</code> is an auxiliary struct that implements a "builder"
11298 pattern. Instead of passing bitflags for the various
<code>O_CREAT/O_APPEND/etc.</code> flags from the <code>open(2)</code> system call, one
11300 builds a struct with the desired options, and finally calls <code>.open()</code>
11301 on it.</p>
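<p>For example, the builder can express "open for appending, creating the
file if needed", which <code>File::open()</code> and <code>File::create()</code> do
not cover. This is a sketch; <code>append_line</code> is a hypothetical helper
written for this example.</p>

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: append one line of text to `path`,
/// creating the file first if it does not exist yet.
fn append_line(path: &Path, line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .append(true) // roughly O_APPEND on Unix
        .create(true) // roughly O_CREAT on Unix
        .open(path)?;
    writeln!(file, "{}", line)
}
```

<p>Each option is a named method call instead of a bit in an integer, so
a misspelled or nonsensical flag is a compile-time error rather than a silent
surprise at run time.</p>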
11302 <p>So, let's look at the <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/fs.rs#L670">implementation of <code>OpenOptions.open()</code></a>:</p>
11303 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">open</span><span class="o">&lt;</span><span class="n">P</span>: <span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">Path</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">path</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">File</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11304 <span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">_open</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">as_ref</span><span class="p">())</span><span class="w"></span>
11305 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11306
11307 <span class="w"> </span><span class="k">fn</span> <span class="nf">_open</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">path</span>: <span class="kp">&amp;</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">File</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11308 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">inner</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">fs_imp</span>::<span class="n">File</span>::<span class="n">open</span><span class="p">(</span><span class="n">path</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="mi">0</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
11309 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">File</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">inner</span>: <span class="nc">inner</span><span class="w"> </span><span class="p">})</span><span class="w"></span>
11310 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11311 </code></pre></div>
11312
11313 <p>See that <code>fs_imp::File::open()</code>? That's what we want: it's the
11314 platform-specific wrapper for opening files. Let's look
11315 at <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/sys/unix/fs.rs#L422">its implementation for Unix</a>:</p>
11316 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">open</span><span class="p">(</span><span class="n">path</span>: <span class="kp">&amp;</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">opts</span>: <span class="kp">&amp;</span><span class="nc">OpenOptions</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">File</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11317 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cstr</span><span class="p">(</span><span class="n">path</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
11318 <span class="w"> </span><span class="n">File</span>::<span class="n">open_c</span><span class="p">(</span><span class="o">&amp;</span><span class="n">path</span><span class="p">,</span><span class="w"> </span><span class="n">opts</span><span class="p">)</span><span class="w"></span>
11319 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11320 </code></pre></div>
11321
11322 <p>The first line, <code>let path = cstr(path)?</code> tries to convert a <code>Path</code>
11323 into a nul-terminated C string. The second line calls the following:</p>
11324 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">open_c</span><span class="p">(</span><span class="n">path</span>: <span class="kp">&amp;</span><span class="nc">CStr</span><span class="p">,</span><span class="w"> </span><span class="n">opts</span>: <span class="kp">&amp;</span><span class="nc">OpenOptions</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">File</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11325 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">flags</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">libc</span>::<span class="n">O_CLOEXEC</span><span class="w"> </span><span class="o">|</span><span class="w"></span>
11326 <span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">get_access_mode</span><span class="p">()</span><span class="o">?</span><span class="w"> </span><span class="o">|</span><span class="w"></span>
11327 <span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">get_creation_mode</span><span class="p">()</span><span class="o">?</span><span class="w"> </span><span class="o">|</span><span class="w"></span>
11328 <span class="w"> </span><span class="p">(</span><span class="n">opts</span><span class="p">.</span><span class="n">custom_flags</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">c_int</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="o">!</span><span class="n">libc</span>::<span class="n">O_ACCMODE</span><span class="p">);</span><span class="w"></span>
11329 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">fd</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cvt_r</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11330 <span class="w"> </span><span class="n">open64</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">as_ptr</span><span class="p">(),</span><span class="w"> </span><span class="n">flags</span><span class="p">,</span><span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">mode</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">c_int</span><span class="p">)</span><span class="w"></span>
11331 <span class="w"> </span><span class="p">})</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
11332 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">fd</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">FileDesc</span>::<span class="n">new</span><span class="p">(</span><span class="n">fd</span><span class="p">);</span><span class="w"></span>
11333
11334 <span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"></span>
11335
11336 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">File</span><span class="p">(</span><span class="n">fd</span><span class="p">))</span><span class="w"></span>
11337 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11338 </code></pre></div>
11339
11340 <p>Here, <code>let flags = ...</code> converts the <code>OpenOptions</code> we had in the
11341 beginning to an int with bit flags.</p>
11342 <p>Then, it does <code>let fd = cvt_r (LAMBDA)</code>, and that lambda function
11343 calls the actual <code>open64()</code> from libc (a Rust wrapper for the system's
11344 libc): it returns a file descriptor, or -1 on error. Why is this
11345 done in a lambda? Let's look at <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/sys/unix/mod.rs#L155"><code>cvt_r()</code></a>:</p>
11346 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">cvt_r</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">f</span>: <span class="nc">F</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"></span>
11347 <span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">IsMinusOne</span><span class="p">,</span><span class="w"></span>
11348 <span class="w"> </span><span class="n">F</span>: <span class="nb">FnMut</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"></span>
11349 <span class="p">{</span><span class="w"></span>
11350 <span class="w"> </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11351 <span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">cvt</span><span class="p">(</span><span class="n">f</span><span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11352 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="k">ref</span><span class="w"> </span><span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">e</span><span class="p">.</span><span class="n">kind</span><span class="p">()</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">ErrorKind</span>::<span class="n">Interrupted</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{}</span><span class="w"></span>
11353 <span class="w"> </span><span class="n">other</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">other</span><span class="p">,</span><span class="w"></span>
11354 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11355 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11356 <span class="p">}</span><span class="w"></span>
11357 </code></pre></div>
11358
11359 <p>Okay! Here <code>f</code> is the lambda that calls <code>open64()</code>; <code>cvt_r()</code> calls
11360 it in a loop and translates the POSIX-like result into something
11361 friendly to Rust. This loop is where it handles <code>EINTR</code>, which gets
11362 translated into <code>ErrorKind::Interrupted</code>. I suppose <code>cvt_r()</code> stands
11363 for <code>convert_retry()</code>? Let's look at
11364 the <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/sys/unix/mod.rs#L147">implementation of <code>cvt()</code></a>, which fetches the error code:</p>
11365 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">cvt</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">IsMinusOne</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11366 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">is_minus_one</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11367 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">io</span>::<span class="n">Error</span>::<span class="n">last_os_error</span><span class="p">())</span><span class="w"></span>
11368 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11369 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">t</span><span class="p">)</span><span class="w"></span>
11370 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11371 <span class="p">}</span><span class="w"></span>
11372 </code></pre></div>
11373
11374 <p>(The <code>IsMinusOne</code> shenanigans are just a Rust-ism to help convert
11375 multiple integer types without a lot of <code>as</code> casts.)</p>
11376 <p>The above means, if the POSIX-like result was -1, return an <code>Err()</code> from
11377 the last error returned by the operating system. That should surely
11378 be <code>errno</code> internally, correct? Let's look at
11379 the <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/io/error.rs#L268">implementation for <code>io::Error::last_os_error()</code></a>:</p>
11380 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">last_os_error</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Error</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11381 <span class="w"> </span><span class="n">Error</span>::<span class="n">from_raw_os_error</span><span class="p">(</span><span class="n">sys</span>::<span class="n">os</span>::<span class="n">errno</span><span class="p">()</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">i32</span><span class="p">)</span><span class="w"></span>
11382 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11383 </code></pre></div>
11384
11385 <p>We don't need to look at <code>Error::from_raw_os_error()</code>; it's just a
11386 conversion function from an <code>errno</code> value into a Rust enum value.
11387 However, let's look at <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/sys/unix/os.rs#L60"><code>sys::os::errno()</code></a>:</p>
11388 <div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">errno</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"></span>
11389 <span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11390 <span class="w"> </span><span class="p">(</span><span class="o">*</span><span class="n">errno_location</span><span class="p">())</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">i32</span><span class="w"></span>
11391 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11392 <span class="p">}</span><span class="w"></span>
11393 </code></pre></div>
11394
<p>Here, <code>errno_location()</code> is an <code>extern</code> function provided by the C library
(with GNU libc it is bound to <code>__errno_location()</code>; other C libraries use other names). It returns a pointer to the
actual int which is the <code>errno</code> thread-local variable. Since non-C
code can't use libc's <code>errno</code> directly (in C it is usually a macro), there needs to be a
way to get its address via a function call - that's what
<code>errno_location()</code> is for.</p>
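<p>The <code>errno</code>-to-enum conversion is also reachable from ordinary
code through the public <code>io::Error::from_raw_os_error()</code>. The sketch
below assumes Linux-style <code>errno</code> numbers (2 for <code>ENOENT</code>, 4 for
<code>EINTR</code>); other platforms may number things differently, and
<code>kind_for_errno</code> is a hypothetical helper for this example.</p>

```rust
use std::io::{self, ErrorKind};

/// Hypothetical helper: ask the standard library which `ErrorKind`
/// it assigns to a raw OS error code.
fn kind_for_errno(code: i32) -> ErrorKind {
    io::Error::from_raw_os_error(code).kind()
}

/// The raw code is kept inside the `io::Error`, so it can be recovered.
fn round_trip(code: i32) -> Option<i32> {
    io::Error::from_raw_os_error(code).raw_os_error()
}
```

<p>On Linux, <code>kind_for_errno(4)</code> is <code>ErrorKind::Interrupted</code>,
which is exactly the value that the retry loop in <code>cvt_r()</code> checks
for.</p>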
11401 <h2>And on Windows?</h2>
11402 <p>Remember the internal <code>File.open()</code>? This is what it looks
11403 like <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/sys/windows/fs.rs#L257">on Windows</a>:</p>
11404 <div class="highlight"><pre><span></span><code><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">open</span><span class="p">(</span><span class="n">path</span>: <span class="kp">&amp;</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">opts</span>: <span class="kp">&amp;</span><span class="nc">OpenOptions</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="n">File</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11405 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">to_u16s</span><span class="p">(</span><span class="n">path</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w"></span>
11406 <span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">handle</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11407 <span class="w"> </span><span class="n">c</span>::<span class="n">CreateFileW</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">as_ptr</span><span class="p">(),</span><span class="w"></span>
11408 <span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">get_access_mode</span><span class="p">()</span><span class="o">?</span><span class="p">,</span><span class="w"></span>
11409 <span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">share_mode</span><span class="p">,</span><span class="w"></span>
11410 <span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">security_attributes</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">_</span><span class="p">,</span><span class="w"></span>
11411 <span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">get_creation_mode</span><span class="p">()</span><span class="o">?</span><span class="p">,</span><span class="w"></span>
11412 <span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="n">get_flags_and_attributes</span><span class="p">(),</span><span class="w"></span>
11413 <span class="w"> </span><span class="n">ptr</span>::<span class="n">null_mut</span><span class="p">())</span><span class="w"></span>
11414 <span class="w"> </span><span class="p">};</span><span class="w"></span>
11415 <span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">handle</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">c</span>::<span class="n">INVALID_HANDLE_VALUE</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11416 <span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">Error</span>::<span class="n">last_os_error</span><span class="p">())</span><span class="w"></span>
11417 <span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
11418 <span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">File</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">handle</span>: <span class="nc">Handle</span>::<span class="n">new</span><span class="p">(</span><span class="n">handle</span><span class="p">)</span><span class="w"> </span><span class="p">})</span><span class="w"></span>
11419 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11420 <span class="w"> </span><span class="p">}</span><span class="w"></span>
11421 </code></pre></div>
11422
11423 <p><code>CreateFileW()</code> is the Windows API function to open files. The
11424 conversion of error codes inside <code>Error::last_os_error()</code> happens
11425 analogously - it calls <code>GetLastError()</code> from the Windows API and
11426 converts it.</p>
11427 <h2>Can we not call C libraries?</h2>
11428 <p>The Rust/Unix code above depends on the system's libc for <code>open()</code> and
11429 <code>errno</code>, which are entirely C constructs. Libc is what actually does
11430 the system calls. There are efforts to make the Rust standard library
11431 <em>not</em> use libc and use syscalls directly.</p>
11432 <p>As an example, you can look at
11433 the <a href="https://github.com/rust-lang/rust/blob/3f8b93693da78c2cfe1d7f1dc6834c5ba61e0cc0/src/libstd/sys/redox/syscall/call.rs">Rust standard library for Redox</a>. Redox is a new operating
11434 system kernel entirely written in Rust. Fun times!</p>
11435 <p><strong>Update:</strong> If you want to see what a C-less libstd would look
11436 like, <a href="https://github.com/japaric/steed">take a look at steed</a>, an effort to reimplement Rust's libstd
11437 without C dependencies.</p>
11438 <h1>Conclusion</h1>
11439 <p>Rust is very meticulous about error handling, but it succeeds in
11440 making it pleasant to read. I/O functions give you back an
11441 <code>io::Result&lt;&gt;</code>, which you piece apart to see if it succeeded or got an
11442 error.</p>
11443 <p>Internally, and for each platform it supports, the Rust standard
11444 library translates <code>errno</code> from libc into an <code>io::ErrorKind</code> Rust
11445 enum. The standard library also automatically handles Unix-isms like
11446 retrying operations on <code>EINTR</code>.</p>
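<p>The retry idea is easy to reuse outside the standard library. The
following is a sketch modelled on <code>cvt_r()</code>, not a copy of it:
<code>retry_on_interrupt</code> is a hypothetical name, and unlike
<code>cvt_r()</code> it takes a closure that already returns an
<code>io::Result</code> instead of a raw -1-or-value integer.</p>

```rust
use std::io::{self, ErrorKind};

/// Hypothetical helper: call `f` until it returns something other than
/// an `Interrupted` error, then hand that result back unchanged.
fn retry_on_interrupt<T, F>(mut f: F) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    loop {
        match f() {
            // Interrupted (EINTR on Unix): go around the loop again.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            // Success or any other error: return it to the caller.
            other => return other,
        }
    }
}
```

<p>Taking the operation as a closure is what makes the retry possible:
the helper can simply call it again, which it could not do with an
already-computed result.</p>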
11447 <p>I've been enjoying reading the <a href="https://github.com/rust-lang/rust/tree/master/src/libstd">Rust standard library code</a>: it
11448 has taught me many Rust-isms, and it's nice to see how the
11449 hairy/historical libc constructs are translated into clean Rust
11450 idioms. I hope this little trip down the rabbit hole for the
11451 <code>open(2)</code> system call lets you look in other interesting places, too.</p></content><category term="misc"></category><category term="rust"></category></entry><entry><title>Moving to a new blog engine</title><link href="https://people.gnome.org/~federico/blog/new-blog.html" rel="alternate"></link><published>2017-06-09T09:08:15-05:00</published><updated>2017-06-09T09:13:41-05:00</updated><author><name>Federico Mena Quintero</name></author><id>tag:people.gnome.org,2017-06-09:/~federico/blog/new-blog.html</id><summary type="html"><p>In 2003 I wrote
11452 an
11453 <a href="https://people.gnome.org/~federico/misc/activity-log.el">Emacs script to write my blog and produce an RSS feed</a>.
11454 Back then, I seemed to write multiple short blog entries in a day
11455 rather than longer articles (<em>doing Mastodon before it was cool?</em>).
11456 But my blogging patterns have changed. I've been wanting to add …</p></summary><content type="html"><p>In 2003 I wrote
11457 an
11458 <a href="https://people.gnome.org/~federico/misc/activity-log.el">Emacs script to write my blog and produce an RSS feed</a>.
11459 Back then, I seemed to write multiple short blog entries in a day
11460 rather than longer articles (<em>doing Mastodon before it was cool?</em>).
11461 But my blogging patterns have changed. I've been wanting to add some
11462 more features to the script: moving to a page-per-post model, support
11463 for draft articles, tags, and syntax highlighting for code excerpts...</p>
11464 <p>This is a wheel that I do not find worth reinventing these days.
11465 After <a href="https://mastodon.social/@federicomena/8360985">asking on Mastodon</a> about static site
11466 generators (thanks to everyone who replied!), I've decided to give
11467 <a href="https://blog.getpelican.com/">Pelican</a> a try. I've reached the age where "obvious, beautiful
11468 documentation" is high on my list of things to look for when shopping
11469 for tools, and Pelican's docs are nice from the start.</p>
11470 <p>The old blog is still available <a href="https://people.gnome.org/~federico/news.html">in the old location</a>.</p>
11471 <p>If you find broken links, or stuff that doesn't work correctly here,
11472