Your daily cup of relativism


Thursday 9 April 2009

OCaml Journal

Alp Mestan has written a short review of the OCaml Journal. I can only echo his recommendation. I have just started learning OCaml, and can use as much exposure as possible to a variety of interesting problems and solutions. Last week I subscribed to the OCaml Journal, and that has been a very good decision.

Subscribers have access to the archive of previous articles, which are very diverse, both in subject and level. As a newcomer, I have enjoyed the articles about pattern matching, the OCaml standard library, and object-oriented programming in OCaml. I have started looking into the articles about combinators, parser combinators, and optimization. All articles are written in a clear manner and complemented by fun examples and plenty of code to tinker with. If you enjoy OCaml, and want to learn more by theory and practice, I can wholeheartedly recommend the OCaml Journal.

Saturday 4 April 2009

First OCaml steps: a memoizing longest common subsequence function

As I mentioned in my previous blog post, I just started studying OCaml. Since the OCaml for Scientists book is still in the delivery pipeline, I decided to get a head start with the Introduction to Objective Caml book. It is a lot of fun trying the examples, and doing some exercises from various OCaml courses on the web. Programming functionally requires a paradigm switch very much like when you start writing Prolog for the first time, and is a good brain exercise in itself. And what is a better way to get into shape than writing some basic algorithms that most of us have probably implemented often in an imperative language? One such simple algorithm is determining the longest common subsequence (LCS) of two sequences. If we model a sequence as a list, the algorithm is very simple:

  • The longest common subsequence of a list and an empty list is the empty list.
  • The longest common subsequence of two lists with equal heads is the head prepended to the LCS of the tails of the lists.
  • The longest common subsequence of two lists with different heads is the longer of the LCSes of:
    • The first list and the tail of the second list.
    • The second list and the tail of the first list.

It should be noted that this approach only returns one LCS if there are multiple possible LCSes for two sequences.

We can almost directly implement this in OCaml code:

let rec lcs l1 l2 = match l1, l2 with
    [], _
  | _, [] -> []
  | h1::t1, h2::t2 when h1 = h2 -> h1 :: (lcs t1 t2)
  | h1::t1, h2::t2 -> let s1 = lcs t1 (h2::t2) in
    let s2 = lcs (h1::t1) t2 in
      if (List.length s1 > List.length s2) then s1
      else s2;;

Here you can see that lcs is a recursive function with two arguments. We can use a technique called pattern matching to match each of the cases listed in the description above. The first two patterns ([], _ and _, []) match if one of the lists is empty, and if l1, l2 matches one of these patterns, the expression after the arrow will be evaluated and returned. In this case, it's just the empty list. The next pattern matches two non-empty lists, but uses an additional pattern guard after the when keyword to restrict this case further. The list matching itself uses the so-called cons operator (::) to split the list into a head and a tail (very much like [Head|Tail] in Prolog). The guard adds the requirement that the heads should be equal. If this is the case, we will return a list with the shared head as the list head, and a tail which is the LCS of the tails of the lists. Finally, the fourth pattern also matches two non-empty lists, but has no further guards. As a consequence of the preceding patterns, it will only match non-empty lists with unequal heads. In this case, we will calculate two LCSes as described above, and will return the longer one.
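To see the definition in action, including the earlier remark that only one LCS is returned when several exist, here is a small self-contained sketch. It repeats the lcs function from above so that it runs on its own; the two input pairs are made up for illustration.

```ocaml
(* Naive LCS, repeated from the post so that this snippet runs on its own. *)
let rec lcs l1 l2 = match l1, l2 with
    [], _
  | _, [] -> []
  | h1::t1, h2::t2 when h1 = h2 -> h1 :: (lcs t1 t2)
  | h1::t1, h2::t2 ->
      let s1 = lcs t1 (h2::t2) in
      let s2 = lcs (h1::t1) t2 in
      if List.length s1 > List.length s2 then s1 else s2

(* These two lists have a single LCS: [2; 3]. *)
let unique = lcs [1; 2; 3] [2; 3; 4]

(* Both [1] and [2] are LCSes of these lists. The strict comparison
   (List.length s1 > List.length s2) picks s2 on a tie, so we get [1]. *)
let tie = lcs [1; 2] [2; 1]

let () =
  List.iter
    (fun l -> print_endline (String.concat ";" (List.map string_of_int l)))
    [unique; tie]
```

This prints 2;3 followed by 1. Changing the comparison to >= would return [2] for the second pair instead, which is just as valid.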

Although this implementation has the elegance of almost directly representing the algorithm, it is also very slow. For example, computing the LCS of the sequences [2;7;5;6;7;4;3;2;9;6;4;2;7;3;2;7;8;1] and [3;5;8;2;3;4;5;7;2;8;3;2;3;8;7;3;5;5] took 50 seconds on my machine. Fortunately, we can optimize this quite easily: the LCS algorithm divides the problem into subproblems, and there are many overlapping subproblems. So, if we memoize the LCSes of subproblems, we don't have to calculate them over and over again. We could introduce some storage for memoizing results directly in the lcs function, but why not make a separate memoization function that the lcs function can use? Given two arguments, such a memoization function should do the following: look the arguments up to see if we already have a result for them, and if not, call the given function to calculate the result and store the arguments with the result. Of course, the argument/result storage needs to persist across calls to the memoizing function. This is my first attempt (I am not sure if it can be made more elegant, since I am explicitly naming two arguments here):

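let memo2 f =
  (* The table is created once and shared by all calls to fmem. *)
  let h = Hashtbl.create 13 in
  let rec fmem a1 a2 =
    try Hashtbl.find h (a1, a2) with
      | Not_found -> try Hashtbl.find h (a2, a1) with
          | Not_found ->
              (* Store the result under (a1, a2) and return it. *)
              let ha r = Hashtbl.add h (a1, a2) r; r in
                ha (f fmem a1 a2) in
    fmem;;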

This function returns a function that takes two arguments (like the original lcs function), and evaluates in the following manner:

  • If the tuple of arguments (a1, a2) is known as a key in the hash table, return its value.
  • Otherwise, if the tuple of arguments (a2, a1) is known as a key in the hash table, return its value. We want to do this here, since the lcs function is symmetric in its two arguments: an LCS of (a2, a1) is also an LCS of (a1, a2).
  • Otherwise, call the function f, and store the result in the hash table.

As you can see here, the memoizing function fmem is passed to f, since f itself will have to apply the memoization function for a recursive call:

let lcs f l1 l2 = match l1, l2 with
    [], _
  | _, [] -> []
  | h1::t1, h2::t2 when h1 = h2 -> h1 :: (f t1 t2)
  | h1::t1, h2::t2 -> let s1 = f t1 (h2::t2) in
    let s2 = f (h1::t1) t2 in
      if (List.length s1 > List.length s2) then s1
      else s2;;

With these functions implemented, we can make a memoizing lcs in one line:

let lcs_memo = memo2 lcs;;

And, as expected, the memoizing function is much faster, and completes on my machine in a fraction of a second.
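For those who want to reproduce the measurement, the following is a rough, self-contained sketch. The memo2 and lcs functions are repeated from above (lightly reformatted); the time helper is a hypothetical addition of mine that measures processor time with Sys.time, and the two lists are the ones used earlier in this post. Timings will of course differ per machine.

```ocaml
(* memo2 and the open-recursive lcs, repeated from the post. *)
let memo2 f =
  let h = Hashtbl.create 13 in
  let rec fmem a1 a2 =
    try Hashtbl.find h (a1, a2) with
    | Not_found ->
      (try Hashtbl.find h (a2, a1) with
       | Not_found ->
         let r = f fmem a1 a2 in
         Hashtbl.add h (a1, a2) r;
         r)
  in
  fmem

let lcs f l1 l2 = match l1, l2 with
  | [], _ | _, [] -> []
  | h1 :: t1, h2 :: t2 when h1 = h2 -> h1 :: f t1 t2
  | _ :: t1, _ :: t2 ->
    let s1 = f t1 l2 in
    let s2 = f l1 t2 in
    if List.length s1 > List.length s2 then s1 else s2

(* Hypothetical helper: run [f ()] and return its result together with
   the processor time it took, in seconds. *)
let time f =
  let start = Sys.time () in
  let result = f () in
  (result, Sys.time () -. start)

(* The two sequences from the timing example in this post. *)
let a = [2;7;5;6;7;4;3;2;9;6;4;2;7;3;2;7;8;1]
let b = [3;5;8;2;3;4;5;7;2;8;3;2;3;8;7;3;5;5]

let () =
  let lcs_memo = memo2 lcs in
  let result, seconds = time (fun () -> lcs_memo a b) in
  Printf.printf "LCS of length %d found in %.3f s\n"
    (List.length result) seconds
```

Note that each call to memo2 creates its own hash table, so a memoized function built this way keeps its cache for as long as the function itself is alive.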

Wednesday 1 April 2009

Learning OCaml

For some time now learning a functional language has been somewhere near the top of my priority list. I purchased the Programming in Scala book to learn Scala, which is a multi-paradigm language combining object-oriented programming and functional programming. Of course, a potential trap is using Scala as an extended Java (just as the inherent danger of learning C++ coming from a C background is seeing it as C with classes). Additionally, Scala currently only runs on top of the Java or .NET virtual machine, and I would like to bind directly to other languages, primarily Python and C/C++. For these reasons I decided to move even further towards functional programming and settle for either Haskell or a member of the ML family.

Since I did not have any prior experience with either (except for some short Haskell labs), these considerations made me choose OCaml:

  • OCaml has an eager evaluation regime by default, which is easier to reason about, especially coming from an imperative background.
  • While mutable state should be avoided in most cases, OCaml makes it easy to make data structures with mutable state. For some data structures mutable state is simply a better fit, or far more efficient.
  • OCaml supports object oriented programming, which can be handy at times.

I know that there are counter arguments against each point, and extensions (such as O'Haskell) to give Haskell the same characteristics. But I do not have any convictions with respect to these languages (yet ;)), so I am just picking the one I think I will be most comfortable with.

To learn the language, I ordered OCaml for Scientists by Jon Harrop, and downloaded the Introduction to Objective Caml book by Jason Hickey (which will hopefully be available in dead-tree format soon). I have made a separate blog category (OCaml) to report on my experiences with what seems to be an exciting language :).

Tuesday 11 November 2008

Crossing over

It's fun to see how far our favorite platform has come:

$ AR=i586-mingw32msvc-ar CXX=i586-mingw32msvc-g++ make -j3
[...]
$ ./train ~/brown/brown-all-simplified-train lexicon ngrams
$ ./evaluate lexicon ngrams ~/brown/brown-all-simplified-test 
Accuracy (known): 0.965297
Accuracy (unknown): 0.81575
Accuracy (overall): 0.959275

For the uninitiated: first, a MinGW cross-compiler is used to compile Win32 binaries of this particular program on GNU/Linux. Then, we execute two of the compiled programs with Wine, which is an implementation of the Win32 API for UNIX-like systems.

PS. Debian includes a MinGW cross-compiler package in its repositories. PS2. Wine is not invoked here directly: Debian installs a binary format handler for DOS/Windows binaries that invokes Wine.

Sunday 9 November 2008

Life after my thesis

It has been a while since I have updated my blog. In the meantime my thesis has been accepted, and I have graduated with my Master's degree. I am now working as a PhD student at the University of Groningen, where my main area of research is text generation from abstract dependency trees.

In my spare time I have been looking at the Scala language, which is a very powerful statically-typed language that leverages the Java VM and class library. For anybody interested in this language, I can wholeheartedly recommend the Programming in Scala book by Martin Odersky, Lex Spoon, and Bill Venners. It's a very extensive guide to the language, written with great clarity and a large number of examples. I can't wait until the dead-tree version is available.

Tuesday 19 August 2008

Holiday

My holiday has finally started (actually, it did last week, but I caught a fever): I have finished my Master's thesis, which was the last thing I had to complete for my Master's degree (pfew). Unfortunately, Liselotte has to finish a paper within a short time, so we had to postpone travel plans until the Christmas holiday. Things I am planning to do:

  • Late spring cleaning.
  • Catch up with the growing stack of albums I recently bought or got (Lumpy Gravy/Roxy & Elsewhere/One size fits all by Zappa, Trout mask replica by Beefheart, The Jewels/Grundstück by Einstürzende Neubauten, Modern Guilt by Beck, Double nickels on a dime by The Minutemen).
  • Maybe visit some concerts at the Noorderzon festival.
  • Continue my quest for regular climbing (mostly back to two times a week now).
  • Read the C++ GUI Programming with Qt4 book (I liked the Qt3 book).

Today I bought a new pair of climbing shoes. My previous pair was almost three years old, and it had some not-so-nice holes. I'll snap a picture of both pairs ;). While I was at it I also bought 16 packets of chalk, oughta be enough for the next year :).

Wednesday 23 July 2008

Buggy campaign?

Microsoft is at it again with a $300 million ad campaign to counter the bad reputation of Windows Vista. While I am not a Windows user, I guess they have received an amount of scorn that is slightly out of proportion. Anyway, I think their ad campaign is a bit entertaining. It shows a renaissance-era ship (though I am not an expert in history), with the text "At one point, everyone thought the world was flat." While this may have been true in ancient times, that belief was quickly revised when the classical Greeks started to study the Earth's shape. The idea that people believed that the earth was flat in medieval times is in fact a myth. So, no, not everyone believed the world was flat when they started exploring the world centuries ago. Though, I have to admit that the imagery and slogan are nice, and probably effective because many people are not aware of the fact that it is a myth.

Monday 21 July 2008

Jitar: a port of Sitar

Between other work I have made a port of my Sitar (C++) tagger to Java, named Jitar. I took the opportunity to redesign some aspects:

  • Simplify the training data: for lexicon entries frequencies are stored now, rather than probabilities. This will allow us to use the same lexicon for the known word handler and the unknown word handler (which relies on suffix analysis). Since the CPU calculations often beat disk I/O, this does not lead to a longer startup time.
  • Store the suffixes for the unknown word handler in a tree. This makes the handler use less memory, and is faster.
  • Apply some more tweaks for unknown word handling. With these tweaks, the unknown word accuracy for our test set seems to be at the same level as TnT.
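The suffix tree from the second point can be pictured roughly as follows. This is only a sketch of the idea in OCaml (Jitar itself is written in Java): the names trie, insert and count, as well as the max_len cutoff of 10 characters, are made up for illustration and are not taken from the Jitar source. Words are walked from the last character backwards, so words that share a suffix share nodes. The sketch counts words per suffix, whereas a tagger would presumably keep counts per tag.

```ocaml
module CharMap = Map.Make (Char)

(* A node holds the number of inserted words that pass through it, plus
   its children, keyed by the next character when reading right to left. *)
type trie = Node of int * trie CharMap.t

let empty = Node (0, CharMap.empty)

(* Insert the last [max_len] characters of [word], walking from the end.
   The cutoff of 10 is an arbitrary, hypothetical default. *)
let insert ?(max_len = 10) word trie =
  let rec go i depth (Node (count, children)) =
    let count = count + 1 in
    if i < 0 || depth = max_len then Node (count, children)
    else
      let c = word.[i] in
      let child =
        match CharMap.find_opt c children with
        | Some t -> t
        | None -> empty
      in
      Node (count, CharMap.add c (go (i - 1) (depth + 1) child) children)
  in
  go (String.length word - 1) 0 trie

(* Number of inserted words that end in [suffix]. *)
let count suffix trie =
  let rec go i (Node (count, children)) =
    if i < 0 then count
    else
      match CharMap.find_opt suffix.[i] children with
      | Some child -> go (i - 1) child
      | None -> 0
  in
  go (String.length suffix - 1) trie

let () =
  let words = ["walking"; "talking"; "walked"] in
  let t = List.fold_left (fun t w -> insert w t) empty words in
  Printf.printf "%d %d\n" (count "ing" t) (count "ed" t)
```

This prints 2 1. The memory saving comes from the sharing: "walking" and "talking" store the nodes for "alking" only once.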

The Java port provides some other nice advantages as well, such as easy integration with programs written in other languages that run on top of a JVM (Groovy, Scala, JRuby, etc.). Jitar is also licensed under the Apache License 2.0, which allows use in FLOSS and proprietary software.

Jasper Spaans has agreed to help with the maintenance of Jitar (thanks!). I expect that we can tag a 0.0.1 version soon, and provide precompiled and source archives. I'd like to move the whole tagger to another (more general) namespace, make the training parameters less specific, add more assertions, and preferably unit tests. In the meantime, the code can be checked out from the Jitar development project.

Saturday 21 June 2008

Asking a Windows refund, and getting it

This is an update to my previous post about getting a Windows license refund. In the meantime, some nice things have happened. First of all, my girlfriend was not the last to receive a refund for Windows Vista. Another Dutch Dell customer was refunded 220 Euro for Windows Vista Premium and Microsoft Works. In summary: he used a slightly modified version of the e-mail that we sent to Dell. I take my hat off to Dell Netherlands; they seem to grok their customers. Additionally, the Dutch "General Conditions" state the possibility to ask for a refund of the software (I am not sure if this paragraph existed as-is before). A quick and sloppy translation of the relevant part (please refer to the original Dutch text for an accurate formulation):

If you reject the conditions for the use of the software, and if you are a consumer, Dell accepts software returns within 7 working days after the delivery of the software, and Dell will refund the price you paid for the software.

As a final remark: it may be even better to ask your vendor over the phone if you can order a machine without Windows; this may be less work for both sides.

Thursday 29 May 2008

First Eee PC experiences

After some consideration and waiting until it was stocked at a local vendor, I bought an Eee PC. Although my MacBook is pretty compact, I wanted a machine that I can use for calendaring, checking mail, and browsing. Since my phone does not offer these options, and phones that do are not that cheap without a new subscription, the Eee seemed to be a good choice. I opted for the non-surf variant with 4GB of flash storage and 512 MB RAM.

The standard Xandros-based distribution seems to be user-friendly and snappy. I was surprised to see that even OpenOffice.org booted up fairly quickly. However, I do not agree with some of the policies of Xandros (yes, I know, by buying the Eee, I paid the Xandros tax), and I prefer a system that is easy to customize for my own needs. Since the Debian community has worked hard on making Debian work well on the Eee PC, and provides an excellent Wiki page, I installed Debian. The installation was a straightforward Debian-testing netinstall (the Eee doesn't boot from my 8GB USB memory stick, but 512MB and 1GB work fine). One of the upsides is that a driver for the wireless NIC is included in the default install, as well as some useful ACPI scripts to get the special keys of the Eee PC working. Virtually the only thing that I needed to change was the X configuration for touchpad scrolling.

I was a bit worried that the small screen would not be comfortable for a normal desktop environment. So, I initially used the IceWM window manager with the GNOME network manager applet to get flexible and easy wireless connectivity. I ended up installing GNOME as well, and it turns out that it mostly works fine with the screen size/resolution, although some dialogs are too large, and take some guesswork to tab through properly. Another problem is that the battery/power applet does not work well, because the machine reports the battery state in percentage rather than mAh.

All in all, it seems to be a good purchase, but I haven't tried the battery time and some other things yet. But I did accidentally drop it on the floor, and it still works :).

Tuesday 20 May 2008

Sitar: a simple part of speech tagger

Recently, I wrote a part of speech (POS) tagger in C++. A POS tagger assigns morphosyntactic labels to words, which can be used in subsequent processing, such as chunking or parsing. The tagger uses trigram Hidden Markov Models (HMM), combined with suffix analysis for unknown words. On my Brown corpus-based training set, it achieves an overall accuracy of 95.5% (74.8% for unknown words). With two parameters hand-tuned, I achieved an accuracy of above 76% for unknown words.
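For readers unfamiliar with this kind of tagger: in the textbook formulation of a trigram HMM tagger, the tagger picks, for the words w_1 ... w_n, the tag sequence t_1 ... t_n that maximizes the product below. The notation is the usual one from the literature and is an assumption on my part, not something copied from the Sitar source; smoothing and sentence boundary handling are left out.

```latex
\hat{t}_{1 \ldots n} = \operatorname*{arg\,max}_{t_1, \ldots, t_n}
  \prod_{i=1}^{n} P(t_i \mid t_{i-2}, t_{i-1}) \, P(w_i \mid t_i)
```

For a known word, P(w_i | t_i) is estimated from the lexicon; for an unknown word, it is estimated from the word's suffix instead.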

The TnT tagger, which Sitar is partly modeled after, is more accurate in assigning tags to unknown words. So, this is an area which can use improvement (though, Sitar scores better than many other taggers that do not follow this methodology).

If you are interested in tinkering with Sitar, you may want to know that the source code is available under the LGPL. This license also allows for use with proprietary software, although I hope improvements are contributed. I hope this is useful to some people :).

Thursday 15 May 2008

Impact of the Debian OpenSSL vulnerability

We have posted a warning about the impact of the Debian OpenSSL vulnerability on the CentOS-announce list, but I think it is useful to repeat it here (for readers of CentOS Planet) as well:

A severe vulnerability was found in the random number generator (RNG)
of the Debian OpenSSL package, starting with version 0.9.8c-1 (and
similar packages in derived distributions such as Ubuntu). While this
bug is not present in the OpenSSL packages provided by CentOS, it may
still affect CentOS users.

The bug barred the OpenSSL random number generator from gaining enough
entropy required for generating unpredictable keys. In fact it
appears that the only source for entropy was the process ID of the
process generating a key, which is chosen from a very small range and
is predictable. As such, all keys generated using the Debian OpenSSL
library should be considered compromised. Programs that use OpenSSL
include OpenSSH and OpenVPN. Note that GnuPG and GNU TLS do not use
OpenSSL, so they are not affected.

This vulnerability can affect CentOS machines through the use of keys
that were generated with the OpenSSL package from Debian. For
instance, if a user uses OpenSSH public key authentication to log on
to a CentOS server, and this user generated the key pair with a
vulnerable OpenSSL library, the server is at heavy risk because the
key can be reproduced easily.

Additionally, all (good) DSA keys that were ever used on a vulnerable
Debian machine for signing or authentication should also be considered
compromised due to a known attack on DSA keys.

As a result of this bug, everyone should audit *every* key or
certificate that was generated with OpenSSL, to trace its origin and
make sure that it was not generated with a vulnerable Debian OpenSSL
package. Or in the case of DSA keys care should be taken that they
were not generated or used on a system with a vulnerable OpenSSL
package. Keys that are potentially compromised should be replaced with
strong keys.

The Debian Wiki[2] has a preliminary list of affected applications. A
tool to detect potentially weak keys is also provided, but it contains
an incomplete list of affected keys and can give false positives.

The Metasploit project provides a full list of weak keys in various
configurations[3].

Questions on how this may affect CentOS users should be directed to
the CentOS users list. List subscription information is available
from:

http://lists.centos.org/mailman/listinfo/centos

With kind regards,
The CentOS Team

[1] http://www.debian.org/security/2008/dsa-1571
[2] http://wiki.debian.org/SSLkeys
[3] http://metasploit.com/users/hdm/tools/debian-openssl/
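To get a feeling for why a process ID is such a poor sole source of entropy, consider the toy model below. It is emphatically not OpenSSL's RNG: toy_key simply seeds OCaml's Random module with a PID, and both toy_key and recover_pid are hypothetical names. The upper bound of 32767 is an assumption based on the usual Linux default for the PID range.

```ocaml
(* Toy model only: this is NOT OpenSSL's RNG. It merely mimics a
   generator whose sole seed is the process ID. *)

(* Assumption: PIDs range over 1..32767 (the usual Linux default). *)
let max_pid = 32767

(* Derive a "key" (four pseudo-random ints) from nothing but the PID. *)
let toy_key pid =
  let st = Random.State.make [| pid |] in
  List.init 4 (fun _ -> Random.State.bits st)

(* Brute force: try every possible PID until the key is reproduced. *)
let recover_pid key =
  let rec go pid =
    if pid > max_pid then None
    else if toy_key pid = key then Some pid
    else go (pid + 1)
  in
  go 1

let () =
  let victim_key = toy_key 12345 in
  match recover_pid victim_key with
  | Some pid -> Printf.printf "key reproduced with PID %d\n" pid
  | None -> print_endline "no PID in range reproduces the key"
```

However long the key is, an attacker in this model never has to try more than 32767 candidates, which is why lists of all weak keys could be generated ahead of time.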

Wednesday 9 April 2008

CentOS vendor support

Official vendor support for an operating system contributes highly to the visibility of a system. Therefore it is very encouraging to see that VMWare is planning to support CentOS as a guest and host(?) system in its upcoming VMWare Workstation 6.5 product. Kudos go out to VMWare for planning to support CentOS, as well as releasing guest OS tools under a free software license.

Of course, we would love to see more vendors supporting CentOS. And given the fact that we try to be fully binary compatible with our upstream vendor, it should not require retraining of support personnel or much additional effort. It's surprising to see that some vendors do not support CentOS even when their infrastructure or developers rely on CentOS. Of course, many vendors will create their offerings based on customer demand. So, don't hesitate to speak up, and ask your software vendor to support CentOS. Maybe even drop a few lines on why you prefer CentOS over the operating systems that they do support (such as stability, long term support, etc.). Finally, let the community know if a major product starts supporting CentOS: other people may have been waiting for support as well (and it serves as a kind of "thank you" to that particular company).

Saturday 15 March 2008

C++ book recommendations

After having completed an excellent C++ course, I have been on the lookout for good books to venture deeper into the language. The following books turned out to be must-haves that I always try to keep within reach:

  • The C++ Standard Library - A Tutorial and Reference, Nicolai M. Josuttis
  • Beyond the C++ Standard Library: An Introduction to Boost, Björn Karlsson
  • C++ Templates - The Complete Guide, David Vandevoorde and Nicolai M. Josuttis
  • C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond, David Abrahams and Aleksey Gurtovoy

To people not yet familiar with C++, I have been recommending Accelerated C++: Practical Programming by Example by Andrew Koenig and Barbara E. Moo. I only had the opportunity to skim through this book, but it seems to be aimed at leveraging C++ features and the standard library right away, rather than building up to C++ from C.

Saturday 26 January 2008

Dell refunds Vista and Works license fee

Update #1: since some flaming already ensued I'd like to state first and foremost that the Dell representatives were very helpful and polite in handling this. They were open to this customer's wishes, and she was very satisfied with her purchase and the subsequent refund.
Update #2: After some requests, I have put the e-mails that were sent online.

Recently my girlfriend bought a new computer. She was looking for a model that supported GNU/Linux, and opted for a Dell Inspiron 530, one of the models that can be purchased with Ubuntu in the United States. Unfortunately, in The Netherlands no consumer models are available with Ubuntu or any other GNU/Linux distribution yet. So, with no other options available, she ordered the machine, which was very affordable and had good specs.

Since she had planned installing GNU/Linux all along, and she is not particularly fond of the thought of paying the Microsoft tax for software she will wipe out right away, we took care to read the EULA that is shown the first time the machine is started. The license said that if the EULA is declined, the customer should contact the manufacturer (or installer) about their refund policy. By the way, the EULA box seems to have been engineered to let people accept the EULA as quickly as possible: the box in which the EULA is shown is very small, making it an uncomfortable read. Additionally, there is only a button to accept the EULA, so we appropriately used the power button as a reject button ;).

After forcefully rejecting the EULA, we cleaned the partition table and installed GNU/Linux (which, as expected, works great on the Inspiron 530). Once everything was configured, she wrote an e-mail to Dell's customer support. Since this is an English blog, I translated her e-mail:

Dear sir/madam,

A few days ago, I ordered a Dell computer. It was delivered yesterday, to my full satisfaction. The computer was pre-installed with Microsoft Windows Vista and Microsoft Works 8.0. Since I have installed GNU/Linux and declined the Windows license, I would like to make use of the refund option as described in the Windows and Works licenses.

I would like to inquire how the refund procedure works, and would like to start it if possible.

Thanks in advance, With kind regards,

After a few days she received a reaction from Dell that stated that a refund would not be possible without returning the complete machine, because the license is inseparable from the hardware. In her answer she referred to previous cases where Dell Germany and Dell UK provided a refund to customers.

In the next reply a Dell representative answered that she was indeed eligible for a refund for both Windows Vista and Works. The combined refund is Euro 70 excluding tax. My conclusions:

  • This provides no guarantee that Dell will give refunds to other customers. But at the very least they seem to be open to consumer choice for GNU/Linux (they have been providing GNU/Linux on servers and workstations for a longer time). They are slowly introducing some models with GNU/Linux in the EU, and in this case they also provided a refund.
  • In the meanwhile I have heard from others that if you want a machine without Windows, it is often best to place an order by telephone to see if it is possible to order a machine without Windows, rather than using the website.
  • From this refund and other stories, it seems that the per-machine "Microsoft-tax" is about Euro 70 (excluding tax). That's quite a lot, so try to get rid of it when you plan to erase any pre-installed system anyway. Aside from the fact that it's better for your wallet, purchasing or asking for machines without Windows shows that there is customer demand for choice.

Thursday 24 January 2008

CentOS Projects

Those who are not actively monitoring the Wiki or project lists may be interested to hear that CentOS now more formally hosts several subprojects with their own Subversion trees and ticket tracking. A list of projects is available on the Wiki. Currently there are four projects, which all potentially add a lot of value to CentOS:

  • The CentOS Live CD project will be creating live CDs of the CentOS system, starting with CentOS 5.1. The project is driven by Patrice Guay, who also created the CentOS 5.0 Live CD, and who has renewed the live CD infrastructure to use the Fedora livecd-tools.
  • Project Cranberry is working on a sysadmin toolkit, which will contain a specific set of packages aimed at system maintenance and recovery.
  • Dasha is a project that aims to bring more drivers to CentOS, which can be drivers that were disabled in the upstream kernel, drivers backported from newer kernels, or third party drivers. Since CentOS aims at stability rather than being cutting edge, this project is a welcome addition for newer hardware.
  • Pandora is a project that works on a comfortable package browser for the CentOS repositories, that also aims to provide RSS feeds and future integration with the CentOS bugtracker.

Of course, we are always on the look-out for new contributors to the CentOS project and community, and working on CentOS projects is one of the possible ways to contribute. You can help projects by:

  • Testing code and packages produced by the projects, and submitting bug reports for problems that you encounter at the project's Trac site.
  • Contributing code to particular projects that you are interested in.
  • Proposing a new project and driving it, if it is accepted as a CentOS-hosted project.

Monday 21 January 2008

A slight forum recommendation

One of the nice things about the Libranet GNU/Linux distribution (I was employed by Libra Computer Systems Ltd. until its demise) was its friendly community. This could be witnessed on both the forums and mailing list, where people offered warm-hearted assistance to their fellow Libranet users. Unfortunately, this community mostly fell apart when Libra Computer Systems closed shop. However, at that time Jeff Greer started the Linux Agora forum, which was formerly named DebianQuestions. Some ex-Libranetarians still frequent these forums, as well as newer community members. If you are looking for a kind, uncrowded GNU/Linux community, I can certainly recommend checking out Linux Agora.

Tuesday 11 December 2007

The last day with OS X

For the reasons outlined in my previous blog post, I have completely removed OS X and replaced it with Ubuntu 7.10. For my day to day use, modern GNU/Linux distributions are far more suitable. Besides that, the hardware vendor lock-in and the loss of the possibility to fix bugs make OS X even less attractive.

Sunday 9 December 2007

Five days with OS X: some frustrations, first reflections

I have to get my usual opensource *nix-ish tools running to do my daily work. There are basically three options: MacPorts, Fink, and pkgsrc. All three projects provide a ports-like system. Yesterday, I gave all three of them a shot. Fink was quickly dismissed, because some of the ports that I'd require are at fairly old versions. Many packages didn't compile well with pkgsrc. I am fairly familiar with pkgsrc, and I really love it, and it has always worked great for me on NetBSD and also pretty well on Linux. Unfortunately, I currently do not have the time to fix all packages that do not compile. MacPorts worked fairly well. One package failed to build, because the original site for the package was down temporarily (manually downloading the tarball from another site did the trick). Some other packages failed, because there were overlapping files between ports/packages. Installing with force did the trick there.

So far, MacPorts seems to be an excellent choice for running UNIXish applications on OS X. Unfortunately, it has the downside inherent in a ports collection: compile time. E.g. compiling Inkscape and all its dependencies required a few hours. An additional problem is that the X11 applications don't integrate well with OS X: the GTK+ applications have their native themes (though an Aqua/Leopard styled theme engine would probably solve that). Besides that, the performance of X11 applications seems to be subpar. E.g. rotating images in Inkscape gives very noticeable flickering. As said in my previous post: this all seems to be a huge step back from APT/yum, where applications can be installed very easily, in a snap.

To look further than OS X I decided to install a Linux distribution as well (my favorite system since ~1994). There is a small problem though: after booting a GNU/Linux CD with rEFIt, the keyboard can not be used at the ISOLinux prompt. Most distributions require the user to (at the very least) press enter to continue the booting process. Ubuntu is one of the exceptions: the live CD boots automatically after 30 seconds (IIRC, some other distributions like SUSE also do this, but none of the distributions that I normally use). Ubuntu seemed very snappy, even from the live CD. Post-install this Mac Mini seems to run Ubuntu faster than my other Core 2 Duo machine, maybe partly due to the excellent Intel-sponsored video drivers. An additional surprise was power management: the Mac Mini seems to use about 23 Watt of power when it is mostly idle (which is about the same as on idle OS X).

I am slowly starting to believe that Ubuntu is more user-friendly than OS X. I am not the typical desktop user. But OS X seems to be great if you use the i* applications or Adobe software, and the integration between various components of the desktop is very good. But if you want automatic (security) updates for all your software, look beyond the small set of Apple and third-party applications, let alone run non-Apple hardware, Ubuntu seems to be much closer to the holy grail of desktops. Especially if you would like to keep vendor choice (both of your hardware and OS).

Am I disappointed? No! OS X is a nice system, and I would like to explore it further. But apart from that: it's hard to get better hardware at that price, with only a fraction of the power use of a normal desktop machine. So, even if I end up running Linux on it only, the hardware is a good deal!

Maybe I should try Vista to complete my comparison ;). (No thanks!)

Friday 7 December 2007

Three days with OS X: the good and the bad

A short update on my first experiences with OS X. I had some pretty urgent work this week, and the good news is that I had no real problems getting stuff done. First off was a presentation that I had to finish. I prefer the LaTeX beamer document class for presentations over anything else. It lets me work on the actual content of slides, rather than formatting, and the class defaults are very sane in that they create very nice-looking slides. The MacTeX distribution was easy to set up, and provides TeX-live, Ghostscript, and some related stuff you may need.

The first less pleasant surprise came up when running Mercurial, my favorite distributed SCM:

$ hg
[...]
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: UTF-8

This can be worked around by setting the LANG variable to 'C'. Of course, this is a bad solution; I still have to look into this. Though this is a minor problem compared to disk images (dmgs). Let's state it right away: disk images suck! For the non-OS X user: these are images that get mounted when you click on them. Most third party software vendors provide their software as these disk images. Installation is usually done by opening the disk image, copying the application to the Applications directory, and unmounting the disk image. Besides the fact that you have to download disk images manually, application upgrades usually seem to be manual as well. E.g. a security update was released for the Camino browser. I had to download the new disk image, open it, copy the new Camino folder to the Applications folder, and close the disk image. This is many steps back from APT and yum, where you can not only install your applications from repositories, but upgrade them with a single command as well. With Synaptic wrapped around it, APT is even very usable for non-expert users.

Yes, I know of the existence of Fink. Once they offer binary Leopard packages, I'll try it, because I'd be very happy to have a decent package manager. At least for the open source applications that are usually provided with Linux distributions.
