<h1><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/">tags/web</a></h1>
<p>unfolding disasters (ikiwiki feed, updated 2011-11-16T00:48:33Z)</p>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/Postfix/">Postfix</a></h2>
<p>Updated 2011-11-16T00:48:33Z; published 2011-11-16T00:48:33Z.</p>
<p>I spent some time today configuring <a href="http://www.postfix.org/">Postfix</a> so I could send mail
from home via <span class="createlink">SMTPS</span>. Verizon, our ISP, blocks port 25 to
external domains, forcing all outgoing mail through their
<code>outgoing.verizon.net</code> exchange server. Before accepting mail, they
also require you to authenticate with your Verizon username and password,
so I wanted to use an encrypted connection.</p>
<p>For the purpose of this example, your Verizon username is <code>jdoe</code>, your
Verizon password is <code>YOURPASS</code>, you're running a local Postfix server
on <code>mail.example.com</code> for your site at <code>example.com</code>, and <code>12345</code> is a
free local port.</p>
<pre><code># cat /etc/postfix/main.cf
myhostname = mail.example.com
relayhost = [127.0.0.1]:12345
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/saslpass
sender_canonical_maps = hash:/etc/postfix/sender_canonical
# cat /etc/postfix/saslpass
[127.0.0.1]:12345 jdoe@verizon.net:YOURPASS
# postmap /etc/postfix/saslpass
# cat /etc/postfix/sender_canonical
root@mail.example.com jdoe@example.com
root@example.com jdoe@example.com
root@localhost jdoe@example.com
jdoe@mail.example.com jdoe@example.com
jdoe@localhost jdoe@example.com
# postmap /etc/postfix/sender_canonical
# cat /etc/stunnel/stunnel.conf
[smtp-tls-wrapper]
accept = 12345
client = yes
connect = outgoing.verizon.net:465
# /etc/init.d/stunnel restart
# postfix reload
</code></pre>
<p>Test with:</p>
<pre><code>$ echo 'testing 1 2' | sendmail you@somewhere.com
</code></pre>
<p>Here's what's going on:</p>
<ul>
<li>You hand an outgoing message to your local Postfix, which decides to
send it via port <code>12345</code> on your localhost (<code>127.0.0.1</code>) (<code>relayhost</code>).</li>
<li>Stunnel picks up the connection from Postfix, encrypts everything,
and forwards the connection to port 465 on <code>outgoing.verizon.net</code>
(<code>stunnel.conf</code>).</li>
<li>Postfix identifies itself as <code>mail.example.com</code> (<code>myhostname</code>), and
authenticates using your Verizon credentials (<code>smtp_sasl_…</code>).</li>
<li>Because Verizon is picky about the <code>From</code> addresses it will accept,
we use <code>sender_canonical</code> to map addresses to something simple that
we've tested.</li>
</ul>
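<p>If you have more than a couple of local accounts, writing
<code>sender_canonical</code> by hand gets tedious. Here is a minimal Python
sketch that generates the entries. The function name
<code>canonical_entries</code> is my own invention (it is not part of Postfix),
and it assumes that every local address should collapse to a single
address you have already tested against Verizon.</p>

```python
def canonical_entries(users, hosts, target):
    """Return 'local-address canonical-address' lines for a Postfix lookup table."""
    entries = []
    for user in users:
        for host in hosts:
            address = f"{user}@{host}"
            if address == target:
                continue  # mapping an address onto itself is redundant
            entries.append(f"{address} {target}")
    return entries


if __name__ == "__main__":
    # Placeholder users and hostnames taken from the example above.
    users = ["root", "jdoe"]
    hosts = ["mail.example.com", "example.com", "localhost"]
    for line in canonical_entries(users, hosts, "jdoe@example.com"):
        print(line)
```

<p>Redirect the output to <code>/etc/postfix/sender_canonical</code> and rerun
<code>postmap</code> on it, as in the listing above.</p>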
<p>And that's it :p. If you're curious, there's more detail about all
the Postfix config options in the <a href="http://www.postfix.org/postconf.5.html">postconf</a> man page, and there's
good SASL information in the <a href="http://www.postfix.org/SASL_README.html">SASL_README</a>.</p>
<p>There's also a <a href="http://www.zulius.com/how-to/set-up-postfix-with-a-remote-smtp-relay-host/">blog post by Tim White</a> which I found useful.
Because Verizon lacks <a href="http://en.wikipedia.org/wiki/STARTTLS">STARTTLS</a> support, his approach didn't work
for me out of the box.</p>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/SMTP/">SMTP</a></h2>
<p>Updated 2011-11-16T00:45:35Z; published 2011-11-16T00:45:35Z.</p>
<p>Verizon blocks outgoing connections on port 25 (<a href="http://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol">SMTP</a>) unless you
are connecting to their <code>outgoing.verizon.net</code> message exchange
server. This server requires authentication with your Verizon
username/password before it will accept your mail. For the purpose of
this example, our Verizon username is <code>jdoe</code>, our Verizon password is
<code>YOURPASS</code>, and we're sending email from <code>me@example.com</code> to
<code>you@target.edu</code>.</p>
<pre><code>$ nc outgoing.verizon.net 25
220 vms173003pub.verizon.net -- Server ESMTP (...)
mail from: <jdoe@example.com>
550 5.7.1 Authentication Required
quit
221 2.3.0 Bye received. Goodbye.
</code></pre>
<p>Because authenticating over an unencrypted connection is a Bad Idea™,
I was looking for an encrypted way to send my outgoing email.
Unfortunately, Verizon's exchange server does not support <a href="http://en.wikipedia.org/wiki/STARTTLS">STARTTLS</a>
for encrypting connections to <code>outgoing.verizon.net:25</code>:</p>
<pre><code>$ nc outgoing.verizon.net 25
220 vms173003pub.verizon.net -- Server ESMTP (...)
ehlo example.com
250-vms173003pub.verizon.net
250-8BITMIME
250-PIPELINING
250-CHUNKING
250-DSN
250-ENHANCEDSTATUSCODES
250-HELP
250-XLOOP E9B7EB199A9B52CF7D936A4DD3199D6F
250-AUTH DIGEST-MD5 PLAIN LOGIN CRAM-MD5
250-AUTH=LOGIN PLAIN
250-ETRN
250-NO-SOLICITING
250 SIZE 20971520
starttls
533 5.7.1 STARTTLS command is not enabled.
quit
221 2.3.0 Bye received. Goodbye.
</code></pre>
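<p>Reading the <code>EHLO</code> reply by eye works, but the check is easy to
script. The following Python sketch parses a multi-line <code>250</code>
reply into a dictionary of advertised extensions. The function name
<code>ehlo_extensions</code> is hypothetical, and the parsing is deliberately
simple: it assumes each line starts with a three-digit code followed by
<code>-</code> or a space.</p>

```python
def ehlo_extensions(response_lines):
    """Map each advertised ESMTP keyword (upper-cased) to its parameter string."""
    extensions = {}
    # The first line of an EHLO reply is the server greeting, not an extension.
    for line in response_lines[1:]:
        if not line.startswith("250"):
            continue
        # Drop the reply code and the "-" (continuation) or " " (final) separator.
        keyword, _, params = line[4:].strip().partition(" ")
        extensions[keyword.upper()] = params
    return extensions


if __name__ == "__main__":
    # A shortened copy of the reply from outgoing.verizon.net:25 shown above.
    reply = [
        "250-vms173003pub.verizon.net",
        "250-PIPELINING",
        "250-AUTH DIGEST-MD5 PLAIN LOGIN CRAM-MD5",
        "250 SIZE 20971520",
    ]
    extensions = ehlo_extensions(reply)
    print("STARTTLS offered:", "STARTTLS" in extensions)
```

<p>Python's <code>smtplib</code> does the same bookkeeping for live connections:
after calling <code>ehlo()</code> on an <code>smtplib.SMTP</code> object,
<code>has_extn("starttls")</code> reports whether the server advertised it.</p>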
<p>Verizon <a href="http://www22.verizon.com/residentialhelp/fiosinternet/email/setup+and+use/questionsone/86782.htm">recommends</a> the pre-STARTTLS approach of wrapping the
whole SMTP connection in TLS (<a href="http://en.wikipedia.org/wiki/SMTPS">SMTPS</a>), which it provides via
<code>outgoing.verizon.net:465</code>:</p>
<pre><code>$ python3 -c 'from base64 import b64encode; print(b64encode(b"\0jdoe@verizon.net\0YOURPASS").decode())'
AGpkb2VAdmVyaXpvbi5uZXQAWU9VUlBBU1M=
$ openssl s_client -connect outgoing.verizon.net:465
...
220 vms173013pub.verizon.net -- Server ESMTP (...)
ehlo example.com
250-vms173013pub.verizon.net
250-8BITMIME
250-PIPELINING
250-CHUNKING
250-DSN
250-ENHANCEDSTATUSCODES
250-HELP
250-XLOOP 9380A5843FE933CF9BD037667F4C950D
250-AUTH DIGEST-MD5 PLAIN LOGIN CRAM-MD5
250-AUTH=LOGIN PLAIN
250-ETRN
250-NO-SOLICITING
250 SIZE 20971520
auth plain AGpkb2VAdmVyaXpvbi5uZXQAWU9VUlBBU1M=
235 2.7.0 plain authentication successful.
mail from: <me@example.com>
250 2.5.0 Address Ok.
rcpt to: <you@target.edu>
250 2.1.5 you@target.edu OK.
data
354 Enter mail, end with a single ".".
From: Me <me@example.com>
To: You <you@target.edu>
Subject: testing
hello world
.
250 2.5.0 Ok, envelope id 4BHMFEZ7PHSETMT6@vms173013.mailsrvcs.net
quit
221 2.3.0 Bye received. Goodbye.
closed
</code></pre>
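<p>The same session can be scripted. This is a minimal sketch using
Python's standard <code>smtplib</code>, with the placeholder credentials and
addresses from this post. The helper names <code>auth_plain_token</code>
and <code>send_test_message</code> are mine. The first one just shows how the
<code>AUTH PLAIN</code> argument is built (authorization identity, NUL,
username, NUL, password, all base64-encoded). The second one has not
been run against Verizon's server; treat it as an outline.</p>

```python
import base64
import smtplib
from email.message import EmailMessage


def auth_plain_token(username, password, authzid=""):
    """Build the base64 argument for AUTH PLAIN: authzid NUL username NUL password."""
    raw = "\0".join([authzid, username, password]).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def send_test_message(username, password):
    """Send a short message over SMTPS (TLS from the first byte, port 465)."""
    message = EmailMessage()
    message["From"] = "Me <me@example.com>"
    message["To"] = "You <you@target.edu>"
    message["Subject"] = "testing"
    message.set_content("hello world")
    with smtplib.SMTP_SSL("outgoing.verizon.net", 465) as server:
        # smtplib chooses an AUTH mechanism from those the server advertises.
        server.login(username, password)
        server.send_message(message)


if __name__ == "__main__":
    # Prints the same token as the one-liner in the transcript above.
    print(auth_plain_token("jdoe@verizon.net", "YOURPASS"))
```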
<p>This works, but with the rise of STARTTLS, getting your local
<a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/Postfix/">Postfix</a> mail server to support SMTPS requires a bit of
<a href="http://www.postfix.org/TLS_README.html#client_smtps">fanciness</a> with <span class="createlink">stunnel</span>. The stunnel workaround is not too
complicated, but I also wanted to look into the <a href="http://tools.ietf.org/html/rfc4409">submission</a>
protocol (port 587), which adapts SMTP (designed for message transfer)
into a similar protocol for message submission. Unfortunately,
Verizon does not support STARTTLS here either.</p>
<pre><code>$ nc outgoing.verizon.net 587
220 vms173005.mailsrvcs.net -- Server ESMTP (...)
ehlo example.com
250-vms173005.mailsrvcs.net
250-8BITMIME
250-PIPELINING
250-CHUNKING
250-DSN
250-ENHANCEDSTATUSCODES
250-EXPN
250-HELP
250-XADR
250-XSTA
250-XCIR
250-XGEN
250-XLOOP DA941C5B31BE4B102BB69B809BC66C4A
250-AUTH DIGEST-MD5 PLAIN LOGIN CRAM-MD5
250-AUTH=LOGIN PLAIN
250-NO-SOLICITING
250 SIZE 20971520
starttls
533 5.7.1 STARTTLS command is not enabled.
quit
221 2.3.0 Bye received. Goodbye.
</code></pre>
<p>In conclusion, Verizon supports a number of email submission
standards, but the only secure approach is to use the outdated SMTPS.
See my <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/Postfix/">Postfix</a> post for details on configuring Postfix to use
Verizon's server for outgoing mail.</p>
<p>There are a number of good SMTP authentication tutorials out there. I
used <a href="http://qmail.jms1.net/test-auth.shtml">John Simpson's</a> and <a href="http://www.fehcom.de/qmail/smtpauth.html">Erwin Hoffmann's</a> tutorials. For
cleaner examples of my testing tools (<code>nc</code> and <code>openssl s_client</code>),
see my <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/Simple_servers/">simple servers</a> post.</p>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/Presentations/">Presentations</a></h2>
<p>Updated 2011-05-03T22:13:49Z; published 2011-05-03T11:02:25Z.</p>
<p>I generally write my presentations up in <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/LaTeX/">LaTeX</a> and use <a href="https://bitbucket.org/rivanvx/beamer/wiki/Home">beamer</a>
to convert them to PDFs for display. I don't really like PDFs though;
they're big and unwieldy. I was just looking around at alternative
methods and ran across the older <a href="http://www.shallowsky.com/linux/LinuxPresentations.html">Linux for Presentations
Mini-HOWTO</a> which lists a number of HTML-based presentation
formats. This is definitely the way to go, and with MathML (via
itex2MML, see my <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/mdwn_itex/">mdwn itex</a> post for details) I can generate most
of the equations that I need to fit into a standard presentation.</p>
<p>The two granddaddy frameworks are <a href="http://meyerweb.com/eric/tools/s5/">S5</a> and <a href="http://www.w3.org/Talks/Tools/Slidy/">Slidy</a>. S5 goes back
to 2004, and Slidy has been around since 2005, so they both go back a
ways. The code you'll be writing is fairly similar in each of them,
so just pick whichever you like best. If it matters to you, Slidy is
<a href="http://www.w3.org/Talks/Tools/">developed and hosted by the W3C</a>.</p>
<p>However, neither of these early projects seems to be actively
developed. For example, <a href="http://groups.google.com/group/s5project/browse_thread/thread/2b6c0a5c14e1b997">this thread</a> on the <a href="http://groups.google.com/group/s5project/">S5 Project
Google group</a> discusses fracturing in the S5 community. As
people try to work around issues with the older frameworks, they've
written up new frameworks. The <a href="https://github.com/geraldb/s6">S6</a> wiki has some <a href="https://github.com/geraldb/slideshow/wiki">pointers to other
options</a>.</p>
<p>Personally, I'm going to go with <a href="https://github.com/geraldb/s6">S6</a>, since the project uses Git
for version control, which is a <em>Good Thing</em>. There's also <a href="http://slideshow.rubyforge.org/">S9</a>,
which lets you build S6 presentations with some sort of wiki-syntax
instead of HTML, if that's appealing to you. If you need MathML on
Firefox versions before 4 (the first with HTML 5 support), you'll
need a framework like Slidy that supports XHTML.</p>
<p>Finally, there's a <a href="http://groups.google.com/group/webslideshow">webslideshow</a> Google group that tracks
developments in this area.</p>
<p>Good luck!</p>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/get_css/">get_css.py</a></h2>
<p>Updated 2011-01-09T12:24:31Z; published 2011-01-09T12:24:31Z.</p>
<p>The <a href="http://www.drexel.edu/physics/">Drexel physics department</a> moved most of its content off of
the department servers and onto college servers this quarter. The
college servers manage their content with SiteCore, so there was a
reasonable amount of trouble getting everything over (see
<a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/SiteCorePy/">SiteCorePy</a>). Luckily, I got <em>lots</em> of help, and now I don't have
to worry about the content that has migrated :). However, not all of
the content made the switch.</p>
<p>We have a number of forms and databases that stayed on our department
servers, and it's my job to make sure those pages look similar to the
SiteCore pages that link to them. No problem, you say, just clone the
SiteCore page's CSS, and apply it to the local pages. That's exactly
what I want to do, but the jittery folks upstream keep changing the
CSS, so my cloned CSS gets out of sync fairly quickly. To minimize my
suffering, I've written a little script to automate the task of
cloning another page's CSS.</p>
<p><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/get_css/get_css.py">get_css.py</a> scrapes an (X)HTML page for stylesheets (assuming there
is no embedded styling in the HTML itself). It then downloads all
those CSS files, cleans them up with <a href="http://code.google.com/p/cssutils/">cssutils</a>, and saves a single
clone stylesheet mimicking their behaviour. It also downloads all
media referenced via <code>url(...)</code> entries in the CSS (e.g. background
images), and adjusts the CSS to point to the local copies.</p>
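<p>To give a feel for the two core steps, here is a rough standard-library
sketch. It is not the code in <code>get_css.py</code>: the names
<code>StylesheetFinder</code> and <code>localize_urls</code> and the
<code>media</code> directory are made up for illustration, and a regular
expression stands in for the proper CSS parsing that cssutils provides.
It finds linked stylesheets in a page and rewrites <code>url(...)</code>
entries to point at local copies, recording what would need to be
downloaded.</p>

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class StylesheetFinder(HTMLParser):
    """Collect the absolute URL of every <link rel="stylesheet"> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.stylesheets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel and attrs.get("href"):
            self.stylesheets.append(urljoin(self.base_url, attrs["href"]))


# Matches url(x), url('x'), and url("x"); a simplification of real CSS syntax.
URL_RE = re.compile(r"""url\(\s*['"]?([^'")]+?)['"]?\s*\)""")


def localize_urls(css, css_url, media_dir="media"):
    """Rewrite url(...) entries to local paths; return (new_css, {remote: local})."""
    downloads = {}

    def replace(match):
        reference = match.group(1)
        if reference.startswith("data:"):
            return match.group(0)  # inline data needs no download
        remote = urljoin(css_url, reference)
        name = posixpath.basename(urlparse(remote).path)
        local = posixpath.join(media_dir, name)
        downloads[remote] = local
        return "url(%s)" % local

    return URL_RE.sub(replace, css), downloads
```

<p>Feeding a page to <code>StylesheetFinder</code> fills its
<code>stylesheets</code> list, and the dictionary returned by
<code>localize_urls</code> tells you which remote files to fetch and
where to save them. Two files with the same basename would collide in
this sketch.</p>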
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/FOAF/">Friend of a Friend</a></h2>
<p>Updated 2010-11-09T17:39:59Z; published 2010-11-09T17:39:59Z.</p>
<p><a href="http://www.foaf-project.org/">FOAF</a> is an <a href="http://www.w3.org/RDF/">RDF</a> language describing social networks and related
information. FOAF tries to make social networking available to the
<a href="http://en.wikipedia.org/wiki/Semantic_Web">Semantic Web</a>, making your identity and friendships machine
readable. An interesting FOAF application, if you have enough
tech-savvy friends, is <a href="http://esw.w3.org/Foaf%2Bssl">FOAF+SSL</a>. For an intuitive explanation of
how FOAF+SSL works, take a look at Henry Story's <a href="http://blogs.sun.com/bblfish/entry/the_foaf_ssl_paradigm_shift">paradigm shift</a>
post.</p>
<p>For the curious, I've created <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/FOAF/foaf.rdf">my own FOAF file</a>. You
should also check out the <a href="http://sioc-project.org/">SIOC Project</a>'s <a href="http://sioc-project.org/firefox">Semantic Radar</a>
plugin, which notifies you of the existence of interesting RDF data as
you browse the web (and inspired my <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/rel-vcs/">rel-vcs</a> plugin).</p>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/Parallel_computing/">Parallel computing</a></h2>
<p>Updated 2010-10-14T12:59:13Z; published 2010-10-14T12:59:13Z.</p>
<p><span class="infobox">
Available in a <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../git/">git</a> repository.<br />
Repository: <a href="http://www.physics.drexel.edu/~wking/code/git/gitweb.cgi?p=parallel_computing.git" rel="vcs-git" title="parallel_computing repository">parallel_computing</a><br />
Author: W. Trevor King<br />
</span></p>
<p>In contrast to my <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/Course_website/">course website</a> project, which is mostly about
constructing a framework for automatically compiling and installing
<a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/LaTeX/">LaTeX</a> problem sets, <a href="http://www.physics.drexel.edu/directory/faculty/homepage/?lname=Valli%C3%A8res&fname=Michel">Prof. Vallières'</a> <a href="http://www.physics.drexel.edu/~valliere/PHYS405/">Parallel
Computing</a> course is basically an online textbook with a
large amount of example software. In order to balance between
Prof. Vallières' original and my own aesthetic, I rolled a new
solution from scratch. See <a href="http://www.physics.drexel.edu/~wking/courses/phys405_f10/">my version of his Fall 2010 page</a>
for a live example.</p>
<p>Differences from my course website project:</p>
<ul>
<li>No PHP, since there is no dynamic content that cannot be handled
with <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/SSI/">SSI</a>.</li>
<li>Less installation machinery. Only a few build/cleanup scripts to
avoid versioning really tedious bits. The repository is designed to
be dropped into your <code>~/public_html/</code> whole, while the course
website project is designed to <code>rsync</code> the built components up as
they go live.</li>
<li>Less LaTeX, more XHTML. It's easier to edit XHTML than it is to
edit and compile LaTeX, and PDFs are large and annoying. As a
computing class, there are fewer graphics than there are in an
intro-physics class, so the extra power of LaTeX is not as useful.</li>
</ul>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/Course_website/">Course website</a></h2>
<p>Updated 2010-10-14T12:55:17Z; published 2010-10-14T12:55:17Z.</p>
<p><span class="infobox">
Available in a <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../git/">git</a> repository.<br />
Repository: <a href="http://www.physics.drexel.edu/~wking/code/git/gitweb.cgi?p=intro-physics.git" rel="vcs-git" title="intro-physics repository">intro-physics</a><br />
Author: W. Trevor King<br />
</span></p>
<p>Over a few years as a TA for assorted introductory physics classes,
I've assembled a nice website framework with lots of problems using my
<a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/LaTeX/">LaTeX</a> <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/problempack/">problempack</a> package, along with some handy <code>Makefiles</code>,
a bit of <span class="createlink">php</span>, and <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/SSI/">SSI</a>.</p>
<p>The result is the <code>intro-physics</code> package, which should make it very
easy to whip up a course website, homeworks, etc. for an introductory
mechanics or E&M class (321 problems implemented as of October 2010).
With a bit of work to write up problems, the framework could easily be
extended to other subjects.</p>
<p>The idea is that a course website consists of a small, static HTML
framework, and a bunch of content that is gradually filled in as the
semester/quarter progresses. I've put the HTML framework in the
<code>html/</code> directory, along with some of the write-once-per-course
content (e.g. Prof & TA info). See <code>html/README</code> for more information
on the layout of the HTML.</p>
<p>The rest of the directories contain the code for compiling material
that is deployed as the course progresses. The <code>announcements/</code>
directory contains the atom feed for the course, and possibly a list
of email addresses of people who would like to (or should) be notified
when new announcements are posted. The <code>latex/</code> directory contains
LaTeX source for the course documents for which it is available, and
the <code>pdf/</code> directory contains PDFs for which no other source is
available (e.g. scans, or PDFs sent in by Profs or TAs who neglected
to include their source code).</p>
<p>Note that because this framework assumes the HTML content will be
relatively static, it may not be appropriate for courses with large
amounts of textbook-style content, which will undergo more frequent
revision. It may also be excessive for courses that need less
compiled content. For an example of another framework, see my
<a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/Parallel_computing/">branch</a> of <a href="http://www.physics.drexel.edu/directory/faculty/homepage/?lname=Valli%C3%A8res&fname=Michel">Prof. Vallières'</a> <a href="http://www.physics.drexel.edu/~valliere/PHYS405/">Parallel
Computing</a> website.</p>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/rel-vcs/">rel-vcs</a></h2>
<p>Updated 2010-11-09T17:39:59Z; published 2010-10-08T03:20:04Z.</p>
<p>Since I publish a lot of <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/Git/">Git</a> packages, I was interested to read
about <a href="http://kitenet.net/~joey/">Joey Hess</a>' <a href="http://kitenet.net/~joey/rfc/rel-vcs/">rel=vcs-* microformat</a>. I think
recording the location of the repo sourcing a page is a great idea, but
with the link stashed in the page header, I could easily browse on by
without ever noticing that the link existed.</p>
<p>This looks like the same sort of problem that the <a href="http://sioc-project.org/firefox">Semantic Radar</a>
extension was designed to solve, except the SR extension notifies you
about RDF files (SIOC, <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/FOAF/">FOAF</a>, DOAP, etc.). I've altered the SR
extension to identify the <code>rel=vcs-*</code> tags.</p>
<p>The <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/rel-vcs/rel-vcs.xpi">rel-vcs</a> extension places an icon in your Firefox
statusbar which goes "hot" when a page has <code>rel=vcs-*</code> tags and "cold"
otherwise. When the icon is "hot", you can click on it to pop up a
list of rel-vcs links. Clicking on an item in the list will open that
URI in a new tab. Since Firefox can't speak <code>git://</code> etc., the new
tab will mostly be useful as a source of the URI for copy/pasting into
a <code>git clone ...</code> call or similar. Alternatively, you can consider
the "hot" icon as a suggestion to use <code>webcheckout</code> or other
<code>rel=vcs-*</code> consumer on the source page.</p>
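<p>Outside the browser, the same detection takes only a few lines. The
sketch below is Python rather than the extension's JavaScript, and the
class name <code>RelVcsFinder</code> is made up for this example. It
collects every <code>&lt;a&gt;</code> or <code>&lt;link&gt;</code> element
whose <code>rel</code> attribute contains a <code>vcs-*</code> value.</p>

```python
from html.parser import HTMLParser


class RelVcsFinder(HTMLParser):
    """Collect (vcs, href, title) for each <a> or <link> with a rel=vcs-* value."""

    def __init__(self):
        super().__init__()
        self.repositories = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        # rel is a space-separated list, so check each value separately.
        for rel in (attrs.get("rel") or "").lower().split():
            if rel.startswith("vcs-") and href:
                vcs = rel[len("vcs-"):]
                self.repositories.append((vcs, href, attrs.get("title")))


if __name__ == "__main__":
    finder = RelVcsFinder()
    finder.feed(
        '<a href="http://www.physics.drexel.edu/~wking/code/git/gitweb.cgi'
        '?p=parallel_computing.git" rel="vcs-git" '
        'title="parallel_computing repository">parallel_computing</a>'
    )
    for vcs, href, title in finder.repositories:
        print(vcs, href, title)
```

<p>An empty <code>repositories</code> list corresponds to the "cold" icon;
anything else is "hot".</p>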
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/Literate_programming/">Literate programming</a></h2>
<p>Updated 2010-10-05T19:24:12Z; published 2010-10-05T19:24:12Z.</p>
<p><a href="http://en.wikipedia.org/wiki/Literate_programming">Literate programming</a> is a philosophy of computer programming
based on the premise that a computer program should be written much
like literature, with human readability as a primary goal.</p>
<p>Traditional programs have human language comments interspersed in
computer language code. Literate programs reverse this style, with
computer language "comments" interspersed in a human language essay.</p>
<p>An excellent (and simple) literate programming tool is <a href="http://www.eecs.harvard.edu/nr/noweb/">noweb</a>.
There is a hello-world example and some intro information on <a href="http://en.wikipedia.org/wiki/Noweb">Wikipedia</a>.
There are also official <a href="http://www.eecs.harvard.edu/nr/noweb/onepage.ps">quick</a> and <a href="http://www.linuxjournal.com/article/2188">6 page</a> introductions.</p>
<h2><a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/posts/XSLT/">eXtensible Style Language Transforms</a></h2>
<p>Updated 2011-07-01T18:55:13Z; published 2010-10-05T19:18:38Z.</p>
<p>Often data stored in
<abbr title="eXtensible Markup Language">XML</abbr> files must be
massaged into other formats (e.g. <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/DocBook_5/">DocBook to roff</a>).
There are well developed procedures for defining such transformations
(<a href="http://www.w3.org/TR/xslt">XSLT</a>) and a number of tools to apply them (e.g. <a href="http://www.xmlsoft.org/XSLT/">xsltproc</a>).</p>
<p>Besides the <a href="http://www.w3schools.com/xsl/">W3Schools tutorial</a>, there is also a nice
<a href="http://nwalsh.com/docs/tutorials/xsl/xsl/slides.html">introduction</a> by Paul Grosso and Norman Walsh. I've copied a
simple <a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/XSLT/chapter/">example</a> from this intro and also included a
<a href="http://www.physics.drexel.edu/~wking/unfolding-disasters/tags/web/../../posts/XSLT/code/">slightly more complicated setup</a> for generating online help
for a list of macros.</p>
<p>XSLT is also useful for standardizing XML content. For example, I was
recently trying to compare two <span class="createlink">Gramps</span> XML files, to see what had
changed between two database backups. Unfortunately, the backup XML
was not sorted by <code>id</code>, so there were many diff chunks due to node
shuffling that didn't represent any useful information. The
following XSLT sorts each node's children by <code>id</code>:</p>
<pre><code><?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- sort node children by their `id` attributes -->
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:for-each select="node()">
        <xsl:sort select="@id" order="ascending"/>
        <xsl:apply-templates select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
</code></pre>
<p>With the above saved as <code>sort-by-id.xsl</code>, you can sort <code>some.xml</code> using</p>
<pre><code>$ xsltproc --nonet --novalid sort-by-id.xsl some.xml
</code></pre>
<p>You can compare two <span class="createlink">Gramps</span> XML files with</p>
<pre><code>$ diff -u <(zcat a.gramps | xsltproc --nonet --novalid sort-by-id.xsl -) \
    <(zcat b.gramps | xsltproc --nonet --novalid sort-by-id.xsl -) | less
</code></pre>
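<p>If you don't have <code>xsltproc</code> handy, roughly the same
normalization can be sketched with Python's standard
<code>xml.etree.ElementTree</code>. This is an approximation of the
stylesheet, not a drop-in replacement: the function name
<code>sort_children_by_id</code> is mine, only element children are
reordered (text between elements travels with the preceding element),
and ids are compared as plain Python strings.</p>

```python
import xml.etree.ElementTree as ET


def sort_children_by_id(element):
    """Recursively reorder child elements by their `id` attribute."""
    for child in element:
        sort_children_by_id(child)
    # sorted() is stable, so children without an `id` (empty key) come first
    # and keep their original relative order.
    element[:] = sorted(element, key=lambda child: child.get("id", ""))


if __name__ == "__main__":
    # A tiny made-up document; real Gramps XML is much larger.
    root = ET.fromstring(
        '<db><person id="I2"/>'
        '<person id="I1"><event id="b"/><event id="a"/></person></db>'
    )
    sort_children_by_id(root)
    print(ET.tostring(root, encoding="unicode"))
```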
<p>Jesper Tverskov has a nice page about <a href="http://www.xmlplease.com/xsltidentity">the identity template and
related tricks</a> if you want more examples of quasi-copy
transforms.</p>