Salvatore Iovene

Why most programmers are lousy

Mar 08, 2007

I’ve been in the IT field long enough to get to know many programmers, both experienced ones and mere wannabes. During this time, I’ve realized that most of them are, simply put, bad programmers. I find myself agreeing with a brilliant post by Jeff Atwood, which argues that programmers can’t program. What are the reasons for this? Many. IMHO, the main fault probably lies with the lousy education that people receive. But then again, the ability to give an education is directly proportional to the ability to receive it, and where I see people complaining about the low quality of university education, I also see students with no interest in learning. Let’s see some of the reasons why programmers can’t really program.

Young people study Computer Science just because it’s a trend. It sounds almost unbelievable to me, but I must admit it’s mostly true. The vast majority of my old University mates just applied to the Computer Science department because… well: everybody was doing so. They followed the rest of the sheep.
Young people study Computer Science because they wouldn’t know what else to do. That’s really another strong source of applications to Computer Science. A lot of young people in their teenage years just don’t know what they want to do as grownups. Computer Science still seems to be a good career opportunity, so they just go for it.
Young people study Computer Science because they think it’s a sure way of getting a job. Ten-something years ago there was a big boom, and if you just knew some HTML, you were thought to be a computer guru. Beliefs like this leave a deep mark on popular wisdom, hence the wave of people applying to Computer Science just because they expect to find work is still there.
Many of today’s programmers were doing nothing other than surfing the net or using Word until last year. Especially in small and vertical markets, improvisation just rules. People learn something, and literally throw themselves into the field. Drawbacks for the quality of their work are simply inevitable. These are not only illiterate people who just jumped in to catch the big wave (what big wave, nowadays?), but people with no passion whatsoever. In other words, I don’t think it’s possible, nowadays, to become a great programmer if you didn’t start taking an interest in the field when you were very young, say about 10 years old (with the due exceptions, of course).
Many of today’s Computer Science students have no interest whatsoever in what they’re forcefully studying. Just put together the previous items in this list and what do you get? A bunch of people who just don’t care, who want to get their piece of paper (the degree) as soon as possible, and have absolutely no passion for what they learn. That’s the worst. I strongly believe that programming is not just a job like many others: you need passion to become really good at it.
A lot of programmers just don’t like to program. This goes for 100% of my ex University mates! Think of that: 100%. Of course it’s not the whole world, but it makes for a small statistic.
A lot of programmers just don’t get it. Not even the easy things. A few weeks ago I was asked, by a friend of mine who’s been studying Computer Science for 4 years now, what the difference is between a private and a protected method in Java. Apparently reading the books isn’t enough anymore, nowadays. Another guy asked me: “I’ve studied pointers in C, and I think I understood them. Still I can’t find any use for them… are they really used at all?”.
Basically all of the programmers, or wannabe programmers, mentioned above are miles away from the technical community. These people are totally unaware of the existence of:
- Slashdot and similar
- RSS
- Usenet
- IRC (“Is that like MSN?”)
- SVN and similar

As you can see, a really strong point, in my opinion, is the lack of care and passion for the subject of programming itself. Lousy programmers are bound to program to take a wage home; good ones are bound to program for the sake of programming itself. Of course you can do that and still fail to be a good programmer, but it all comes down to numbers.

The ultimate guide for UTF-8 in irssi and GNU/Screen

Mar 06, 2007

I’ve been having quite a lot of trouble, lately, configuring irssi to work well with UTF-8. Irssi’s documentation on the matter was quite incomplete, or discouraging, and there wasn’t much on the Internet, so, having figured out how to do it, I’ll share it here.

First of all, you’ve got to make sure that your system is configured for UTF-8 locales:

bash-3.1$ locale
LANG=en_GB.utf8
LANGUAGE=en_GB.utf8
LC_CTYPE="en_GB.utf8"
LC_NUMERIC="en_GB.utf8"
LC_TIME="en_GB.utf8"
LC_COLLATE="en_GB.utf8"
LC_MONETARY="en_GB.utf8"
LC_MESSAGES="en_GB.utf8"
LC_PAPER="en_GB.utf8"
LC_NAME="en_GB.utf8"
LC_ADDRESS="en_GB.utf8"
LC_TELEPHONE="en_GB.utf8"
LC_MEASUREMENT="en_GB.utf8"
LC_IDENTIFICATION="en_GB.utf8"
LC_ALL=en_GB.utf8

If the output of locale doesn’t look like that, you want to reconfigure your locales. On Debian, what you have to do is:

sudo dpkg-reconfigure locales

Here’s an example of the output to expect:

Generating locales (this might take a while)...
  en_GB.ISO-8859-1... done
  en_GB.ISO-8859-15... done
  en_GB.UTF-8... done
  en_US.ISO-8859-1... done
  en_US.ISO-8859-15... done
  en_US.UTF-8... done
Generation complete.
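If you’d rather not compare the locale output by eye, the check can be scripted. This is only a minimal sketch: the helper name is_utf8 is my own invention, and it assumes that your system’s locale command supports the charmap argument, which prints the character encoding of the current locale.

```shell
# Succeed if the given charmap or locale name denotes UTF-8.
# (is_utf8 is a hypothetical helper name, not part of any tool.)
is_utf8() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *utf-8*|*utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

# `locale charmap` prints the encoding of the current locale.
if is_utf8 "$(locale charmap 2>/dev/null)"; then
    echo "UTF-8 locale active"
else
    echo "locale is not UTF-8; reconfigure your locales"
fi
```

The same function accepts full locale names such as en_GB.utf8, so it can also be pointed at the value of LANG or LC_ALL.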

Perfect, now that our system is configured for UTF-8, we want to configure our terminal emulator. If you’re using xterm, you can invoke it with the -u8 switch, or just run uxterm, and that’s all that’s needed. If you’re using gnome-terminal, go to the Terminal menu, then choose Set Character Encoding and then UTF-8. If UTF-8 doesn’t appear in the list, you may want to try logging out and logging in again. While you’re at it, in the GDM login manager, go to the Language option and choose UTF-8 there too, so that it will be the default.

Now let’s take care of GNU/Screen. In order to enable UTF-8, all you have to do is launch it with the -U switch:

screen -U -S irc

irc is just the name I want to assign to that screen session. Note that if you want to switch a running screen session to UTF-8, you can do it for each window, using the command CTRL-a : utf8 on.

Once your GNU/Screen is configured for UTF-8, you have to finally set up your irssi client. This was, for me, the tricky part, since the documentation is a bit unclear, and I didn’t realize that my irssi wasn’t built with recode support. To make sure that your irssi is, fire it up and give the command

/recode

If you get something like

Target                         Character set

then everything is alright, otherwise, if you get a No such command error, you will have to reinstall irssi with recode support.

Irssi UTF-8 support is made so that you are able to recode to different charsets, depending on the server or channel you’re chatting in. First let’s set up some general options:

/set term_charset UTF-8
/set recode_autodetect_utf8 ON
/set recode_fallback UTF-8
/set recode ON
/set recode_out_default_charset UTF-8
/set recode_transliterate ON

These options will be the default, unless overridden for specific servers or channels. What do they mean?

term_charset: this is the character set of your terminal emulator
recode_autodetect_utf8: irssi will recognize UTF-8 input automatically and treat it accordingly
recode_fallback: when we get some non-UTF-8 text from a chat peer, the text should be converted to this character set
recode: this enables the whole recode thing
recode_out_default_charset: this is very important: this is the default charset that you send out, unless differently specified by a server/channel rule (we will see that shortly)
recode_transliterate: this enables transliteration to the closest match: i.e. if someone sends you a character that’s not in your charset, it will be transliterated to the closest possible one, or replaced with a question mark if none is found

Now, you probably need different recodes on different channels, because you may speak different languages on different channels. For example, I send out UTF-8 when typing on English speaking channels, and ISO-8859-1 or ISO-8859-15 when typing on Finnish or Italian speaking channels, so people on the other end will always get my characters right.

You need to add rules with the /recode command:

/recode add ircnet/foo ISO-8859-15
/recode add ircnet/bar ISO-8859-1
/recode add freenode/gee ISO-8859-1

Those commands follow the rules as written above: you will “speak” ISO-8859-15 on #foo and ISO-8859-1 on #bar on IRCNet, and ISO-8859-1 on #gee on freenode. Everywhere else you will “speak” UTF-8.

And this is what we get: here I’m typing (er… I’m copy-pasting from Wikipedia) some text:

If you connect via SSH to a remote machine, where you run irssi inside screen, all you have to do is to set both systems to use UTF-8, as explained in the beginning of this article, and then set the terminal of the machine from which you SSH, to use UTF-8, as explained earlier.

Architecture of patching semantic versus logical content

Feb 19, 2007

Inspired by a certain patch that hit a darcs repository to which I contribute, I would like to talk about one thing that developers often don’t seem to get when using revision control systems: the structure of the files in your repository should have nothing to do with the logical units that make up your patches, or with the comments on the patches themselves.

Yesterday, I saw this patch hit the repository: “Adding Cloth.h to the repo”. The patch added an empty file, named Cloth.h. What’s wrong with this? A couple of things:

The patch adds no unit of logical value to the repository, merely a technical one, i.e. a piece of information about the content of the repository itself. That information is absolutely redundant, as you could retrieve it in a separate (and more proper) way, which of course depends on the revision control system you are using. Furthermore, the fact that the file was added would have been recorded, and obvious, without dedicating a whole patch to it.
The comment (“Adding Cloth.h to the repo”), once again, makes no logical sense on its own, as it adds information that was already available through the revision control system’s tools.

What is a better way to do that? A patch named “Preliminary support for clothes”, which would add the file Cloth.h with its content, even if not yet functional, makes perfect sense. It means that you’re adding some logical value to the repository, and the value that you’re adding has nothing to do with the way that value is represented (the file Cloth.h), or with the fact that it’s being added to a repository.

In other words, the form and content of patches should not only represent single units of logical value, as discussed earlier, but should have no awareness whatsoever of being part of a revision control system, of being uploaded to repositories, of containing files, or even of being patches at all!

Please drop SVN

Feb 08, 2007

SVN might be stable, it might be mature, it might be successful, and it might be the winning source control system of the moment. There’s always a big risk of becoming unpopular when criticizing something that has found its way to success, but I have to say that SVN feels terribly antiquated sometimes.

I have already given a brief introduction to the Darcs source control system, and I would like here to talk about a very strong point it’s got against SVN.

Just yesterday, at work, I needed to commit certain modifications to SVN. As I examined the diff of my local copy with:

svn diff

I realized that one of the files also contained some other modifications that I didn’t want to commit. After using Darcs for several months, I was suddenly hit by the shocking truth: SVN doesn’t let you interactively commit only some parts of your changes, which Darcs calls hunks.

What do you do in that case? Granted, there are people who abuse the Save as… function of their editor, saving multiple copies of the same file according to the logical patch each contains (which I find absolutely horrible). The quickest way I could find was to:

Make a diff: svn diff > logical_patch_1.diff
Edit the diff manually until you have two files, logical_patch_1.diff and logical_patch_2.diff, one for each logical change
Revert to the pristine copy: svn -R revert .
Apply the first diff: patch -p0 < logical_patch_1.diff
Commit: svn commit
Apply the second diff: patch -p0 < logical_patch_2.diff
Commit: svn commit

With Darcs, all you have to do is issue the darcs record command (which records your changes):

Record: darcs record -m "First logical patch (fixes bug 1234)"
Answer “yes” to the first hunk, and “no” to the second.
Record again: darcs record -m "Second logical patch (fixes bug 5555)"
Answer “yes” to the only hunk

Can you see the difference? It’s not just about the number of operations needed, but the quality of them, and the fact that Darcs is perfectly oriented to this kind of flexibility. Please consider switching to Darcs for your projects and work, as it’s a mature and better system.

ACE: The Adaptive Communication Environment

Feb 03, 2007

For some time, over the last few months, I have been using the ACE library, both for work and for leisure. ACE is a very powerful, useful and portable framework, oriented to networking, but it can be used for abstracting nearly any system dependent task. In my case, I’m using it for two client-server projects, one of which is the MMORPG I’m working on.

ACE has had, for me, a pretty nasty learning curve: at first there are some good tutorials, and plenty of examples and tests in the installation directory, but after a while you are probably going to need to purchase the books to master it. I really can’t blame the author(s) for that, as ACE is an impressive (I mean it) piece of work, and deserves some revenue.

Using ACE for my projects has turned out to be incredibly useful: I do all my development on GNU/Linux machines (Debian Sarge at work, and Debian Unstable at home), but the code I write needs to be ported to the Win32 platform as well. I’m no Windows programmer, and no Windows user, and there are other people in my company who take care of integration with Win32 systems. After about two months of work on my client/server project, it started to become usable, and we decided to port it to Win32. To some extent, we were expecting trouble in porting an application that had been in development for two months and was made of roughly 10 thousand lines of code. The project is a server process plus a dynamic library that communicates with it and that a UI client can use. Well, the port to Win32 took no more than half an hour, and just a few changes needed to be made. Of course, during development I had taken care not to use any system dependent code, but the fact that it took so little to port was simply amazing.

The ACE library has provided for me lots of platform independent things: a way to manage sockets and TCP connections, a way to manage loading of external programs, a way to manage threads (and related mutexes or locks), a way to manage logging, a way to manage tracing, and many more.

You can check this address for some good ACE tutorials, especially regarding client/server communication. In my case, I’ve gone for a design that handles each client connection in a dedicated thread. Thanks to ACE, this has been very easy and controllable. By controllable, I mean that I’m quite sure that the code I’ve produced, in that regard, is practically bug free. ACE helps you very well in catching all the errors and reacting accordingly. Due to the vast number of platforms ACE supports, it lacks exception handling, which can be considered a bad point, although a necessary one. To some extent, though, ACE can support exception handling, even if, for portability and integrity reasons, it’s advisable to do without it and use the classic approach of checking return values. In any case, nothing prevents you from creating your own exception handling layer on top of the classes that manage ACE.

ACE is really strongly Object Oriented, which makes it perfectly suitable for large (but well engineered) projects. Needless to say, ACE is not advised for very simple projects, unless you just want to take advantage of the system abstraction it provides. For larger projects, instead, you’d better be very careful and plan in advance. If you don’t know the system very well, you might end up making some wrong choices and wasting time. On this point, I advise reading the books.

ACE is also very useful when it comes to logging, as it provides some really simple but powerful macros that can be used in debug mode, and that will produce no code at all if disabled. You can check this website for an introduction to the ACE logging facilities.

I will end this short article with a list of pros and cons about ACE, as I’ve found out during my experiences.

Pros

Very portable.
Very powerful.
Good initial learning curve.
Huge list of features.
Many examples.
Great mailing list support (even though they refer you to the books too often).

Cons

The API is not very well documented.
You need to purchase the books to master it.
No free binary releases.

What I recommend it for

Any large project that needs to manage networking and multithreading.

(Web) Standards actually do matter a lot

Jan 17, 2007

A week ago I stumbled upon this article from Stuart Brown, which actually even got a fair amount of Diggs. As soon as I read it, I felt obliged to reply extensively to it, as I think it mostly represents everything that is wrong with current web design trends (or should I say all time web design trends?).

Something that really annoyed me about the article was the lack of serious points and the abundance of useless phrases like “some standards evangelist”, “Obsessing over semantically correct markup”, “a few standards zealots”, “zealotry accomplishes nothing but a collection of smug faces and a collection of ‘XHTML 1.1 Compliant’ bylines”. Those were only fruitless diversions and not real arguments.

The author gets one thing right, though, and it’s that

your users don’t care about XHTML, they care about how your site appears.

Alright, we can all agree that most average Internet users know nothing about things like HTML, CSS, JavaScript, PHP and so on. We can’t, at this point, blame the users: I don’t know anything about the inner workings of most of the things I use on a daily basis myself, so no big deal. In spite of this, serving the user is not quite equivalent to fooling him. Giving the user something that actually works, but whose inner workings are totally crippled, is absolutely wrong. The author of the mentioned article furthermore says:

If you can satisfy the usability needs of 100% of your users, yet your code doesn’t validate, then arguably you need do no more.

I strongly disagree here. Validation serves a purpose: to minimize differences when rendering the website in different browsers. A page that validates doesn’t need any guess-based correction by the browser. And Stuart talks about usability, whereas a website that doesn’t validate will have a harder time meeting usability requirements for people with impairments. In February 2006 there was a story about Target (a retailer) being sued because its pages were inaccessible to visually impaired users. The retailer was accused of violating the California Unruh Civil Rights Act, the California Disabled Persons Act and the Americans with Disabilities Act. Needless to say, the website still doesn’t validate (it doesn’t even have a DOCTYPE).

So, why else are standards (and especially web standards) needed?

Standards make it easier for your browsers to render the pages.

HTML and CSS are, in a way, standards, i.e. sets of rules, in the form of language specifications, to be followed in order to design a web page. Browser writers have built their work on top of these specifications. Unfortunately, though, not all web developers are able to, or care to, be strict about standards. That means that browsers must have a way to interpret code correctly even when it’s fundamentally incorrect. What’s the result of this? Browser developers have to waste time correcting the mistakes of web developers. This, in turn, gives web developers the chance to relax and write worse and worse code. And this makes browser developers’ lives worse, and their work slower. In the end the users get nothing but worse

Standards make life easier for users.

We can step away from web standards and move to some “real life” cases. Ironically enough, the other day I was staring at my toilet seat for a while, and noticed that the holes in the toilet into which you screw the seat are at a fixed distance from each other, and slightly wider than they are long, to give a little margin: i.e. if you want to buy a new toilet seat, the distance between the screws with which you attach it to the toilet must be between X and X+1 cm. So, when I went to the shop to get a new toilet seat, I didn’t have to worry about the size. Vendors of toilets and vendors of seats had simply agreed on a standard size. The result? No hassle for the user (me).

Search engines like standards.

Search engines don’t like finding errors. If they have trouble parsing your pages, they may not get to your carefully chosen keywords, and won’t be able to find all those tightly focused meta tags. Your site will be crippled on any search engine results page. This is not good. A well designed website creates a clear channel, a road map, for search engines, so they know to go exactly where you want them to go and see exactly what you want them to see.

End of the list? Yes. This is all that matters, in the end, to the final user. He doesn’t care how, but in the end he gets better products. Of course there are lots of side effects, e.g.:

Standards minimize the differences in the way your pages appear in different browsers.
Standards improve the quality of your code, hence the quality of your product.
Code that adheres to standards is easier to maintain.
Validating your code during the development process eases the discovery of flaws and mistakes.
Standards increase your value.

Before anyone could say that nobody needs a new article about web standards, let me remind you that the following websites won’t validate:

This means that standards-aware developers are just a tiny minority. The rest, obviously, don’t think that quality is a good asset for their websites. Writing non-validating code is a big step backwards on the path to perfection.


Fixing NVIDIA driver after a xserver-xorg-core upgrade in Debian and Ubuntu

Jan 16, 2007

When using Debian Testing or Unstable, or a frequently upgraded version of Ubuntu, an apt-get update && apt-get upgrade will often install a slightly newer version of xserver-xorg-core, and this will break the NVIDIA proprietary drivers if you, like me, prefer to install them using the official NVIDIA installer. When this happens, X will crash at your next reboot, or the next time you start it.

Follow these instructions and you won’t need to reinstall the NVIDIA driver from scratch each time. First of all, stop your login manager (gdm assumed here):

sudo /etc/init.d/gdm stop

Then move to:

cd /usr/lib/xorg/modules/extensions

Normally it should look like this:

total 956K
1 root root  19K 2007-01-09 21:13 libdbe.so
1 root root  34K 2007-01-09 21:13 libdri.so
1 root root 145K 2007-01-09 21:13 libextmod.so
1 root root   18 2007-01-15 20:42 libglx.so->libglx.so.1.0.9742
1 root root 676K 2007-01-15 20:42 libglx.so.1.0.9742
1 root root  28K 2007-01-09 21:13 librecord.so
1 root root  38K 2007-01-09 21:13 libxtrap.so

Notice the symbolic link from libglx.so to libglx.so.1.0.9742. In your case, instead, the installation of a newer xserver-xorg-core overwrote the libglx.so with the normal one provided by the X Server. What you have to do is simply restore the previous situation. Remove the libglx.so file:

sudo rm libglx.so

And make the symbolic link again:

sudo ln -s libglx.so.1.0.9742 libglx.so

Of course the version number, in my case 1.0.9742, may be different in your case. Now you can simply start the gdm login manager again:

sudo /etc/init.d/gdm start

Everything should be working again.
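Since this breaks again at every X server upgrade, the two commands above can be wrapped in a small function that finds the version number by itself. Take it as a sketch under assumptions: restore_glx_link is a name I made up, and it assumes that the NVIDIA installer left exactly one libglx.so.<version> file in the directory.

```shell
# Re-point libglx.so at the NVIDIA-provided libglx.so.<version>.
# (restore_glx_link is a hypothetical helper; it takes the extensions directory.)
restore_glx_link() {
    dir=$1
    target=""
    # Look for the versioned library, e.g. libglx.so.1.0.9742.
    for f in "$dir"/libglx.so.[0-9]*; do
        [ -e "$f" ] || continue
        target=$(basename "$f")
        break
    done
    if [ -z "$target" ]; then
        echo "no versioned libglx.so.* found in $dir" >&2
        return 1
    fi
    # Same two steps as above: remove the overwritten file, then link again.
    rm -f "$dir/libglx.so" && ln -s "$target" "$dir/libglx.so"
}

# Usage (as root, with the login manager stopped):
#   restore_glx_link /usr/lib/xorg/modules/extensions
```

If the function finds no versioned library it touches nothing and returns an error, in which case the driver probably does need to be reinstalled.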

Thanks to http://osrevolution.wordpress.com/ for this.

How to write robust code

Jan 13, 2007

As software is one of the most important issues in our era, writing good robust programs is essential. This article is an in-depth essay focused on Object Oriented software and large projects. Everything said here, though, scales well to good directives for small projects as well.

Our time is dominated by software. There is software basically everywhere around us; most of the objects you can see around you right now have something to do with software, if only because they were made using some sort of machine. Given the importance of software nowadays, I just have to find bugs unacceptable. Of course you might argue that a small and rare bug in a minor piece of software won’t harm anyone, and is not nearly as important as a bug that could affect the software of an airplane, and I’m going to agree with that. But as time goes by, everything should move towards perfection, and current trends in software seem to be going nowhere: there were bugs in software 30 years ago, and there are bugs today. There was a time, in the beginning, when scientists thought that it would be relatively easy to write bug free programs right away, but they realized pretty soon that it wasn’t quite so. After all, software is written by us human beings, and we are doomed to make mistakes or omissions. The point of this article is not that software should always be bug free, but that we coders should always keep bugs to a minimum, and here I’m going to present some ways to approach programming in general.

One huge problem, which I’ve faced quite often, is that as a program grows in size and dependencies, its developers start losing track of its components and drift further away from the big picture, which makes it easier to introduce bugs. Note, I’m not talking here about bugs caused by a single human error that anyone looking at the code would label a cheap mistake. I’m talking about the sort of nasty bugs that nobody can spot with a glance at the code. I’m talking about system wide bugs, usually emerging from the interaction of barely related subsystems of the program, typically at the connections between dependencies and libraries.

Anyway, the path to bug free code is the one you walk when you write robust code. What do I mean by that? Robust code has some features:

Well designed
Neat and tidy
Well named
Well commented
Well tested
It never segfaults

As a result of some of these, robust code is also:

Extensible
Reusable
Lasting in time

Well designed

Having already talked about this elsewhere, I’ll be brief in this section. Writing a complex program, one made of hundreds of thousands of lines of code, is a damn complicated thing: it takes many people and a lot of time. Usually, the more people you involve in the project, the less robust the code you’ll get in the end. People will use different conventions and different styles. For this reason, not only is it crucial to hire the right developers, but it’s essential to have a very strict and detailed specification of the project. Programming is creative work, no doubt, and coders need freedom so they can breathe. A constrained coder is a chained coder, hence a dead coder and a threat to the quality of the end product. But, however much we care for the developers’ freedom and initiative, we have to be aware that losing control means lowering the quality. A large project must be designed thoroughly and carefully, in every single detail. Even though programmers love freedom, most of them also love exhaustive documentation. If you want to make a good coder happy, and get the best out of him, flood him with docs and specs. Nothing pisses off a good coder like a lack of documentation: it tears his motivation apart. “Why should I start reading their minds and working by guesswork,” he thinks, “when they didn’t even take the time to write good specs?” Furthermore, a project without good specs looks superficial, destined to fail and without a future. A very good coder is hardly going to stay in a company that doesn’t design its projects well. He will think that it’s a loser company, and start looking around.

But what does good design mean? A good design is:

Exhaustive
Non redundant
Non contradictory
Easy to understand
Related 1:1 to the implementation

We want to cover every possible outcome in our specifications: let them be exhaustive, so that nothing is left to chance. We don’t want to be redundant and repeat the same information more than once, for several reasons: e.g. information should be retrievable in exactly one place, and repetition makes contradictions more likely. Documentation should be for the developers, i.e. written in the most straightforward way for the right audience: simple language and straightforward tables and diagrams will spare the developers some cursing. Furthermore, as a specification is just a way to put a program in words before it’s written, developers should be able to translate easily what they see on paper into code. Think about a shopping list: when I get one, I just go to the shop and translate each item on the list into a physical item in my shopping cart. Direct and easy.

Neat and tidy

A good definition of neat is: in a pleasingly orderly and clean condition. How does that apply to software? What is neat software? One word that I like in that definition is “pleasingly”. Neat software pleases the eye and the mind. I don’t want to be cocky here, but neat software is something written by a good programmer, and it will be appreciated by another good programmer. If somebody known as a good programmer points at some software and says “That’s neat” and you find yourself looking at it and replying “Huh? That’s just code”, I’m sorry, but chances are that you are not a good programmer. A good programmer appreciates the beauty of code, both on a small scale and on a large scale. Neatness on a small scale means that you’re able to look at one function and appreciate its simplicity. Neat pieces of code are easily readable and use good naming conventions. Please read this article if you want to know more about good code on a small scale. Neat code on a larger scale, on the other hand, means neat integration between the components and subsystems of a project. Bad integration would mean, e.g., having a project-wide global variable that points to a certain subsystem, and using it everywhere in the project. Or having two subsystems that, in a messy and intertwined way, call each other’s methods, violating several layers of abstraction. Proving that code is neat turns out to be very difficult. It’s a bit like the opposite of what happens with common logic: if I want to prove to you that, say, lions exist, I can just go to Africa, pick one and show it to you, then say “That’s a lion, ergo lions exist”. But how can I prove that unicorns or dragons don’t exist? You’ll probably agree that it’s much more difficult. It’s just the opposite with neat code. I can show you bad code, and you will easily agree that it’s bad. But looking at neat code doesn’t prove it neat right away. It probably takes years and years of experience, writing a lot of code and reading a lot.

Well named

This topic has already been discussed here, but repetita iuvant. As code is managed by possibly dozens of people or more, being understood is key to increasing the robustness of the code. Writing robust code also means writing code that will easily stay robust when other people modify or expand it, unless they have no clue, of course. The better your code is understood by others, the more likely they are to respect your ideas and keep the code robust. There are several ways of making your own code easily understood, and having a good, consistent and solid naming convention is one of them. Of course, as discussed later, code needs to be well documented too.

Well commented

I know, I know. Everybody says that you should comment your code. That’s what I say and that’s what I’ve been told. Still, I don’t comment my own code as much as I should. Before you ask “Who are you, then, to tell me to comment my code, if you don’t do it enough with yours?”, let me remind you that we learn from mistakes. What they don’t tell you about the importance of commenting code is a subtle, psychological little thing. If you are a bad programmer, you’ll never produce good code. But if you are a good programmer, being in a hurry will sometimes make you produce really bad code. There are two reasons why this can happen: 1) you are in a hurry because you’re late with your deadlines. There’s nothing to be done about that. 2) you are in a hurry because you’re just coding fast, riding the rush of an idea that just struck you. In this case, commenting your code a lot will drastically improve its quality. Always write your comments before writing the actual code. This will make you notice if your function is not really going to do what it’s supposed to do. Writing the comment will also help you think more about what you’re doing, and be more conscious of it. It will keep your state of mind clear and precise. I strongly recommend using Doxygen to generate a browsable HTML version of your comments, especially if you’re writing a library. Even if you’re not, it will keep you working in a professional way, which is always a good thing.
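
As a small sketch of the comment-first habit, the Doxygen block below was written before the body, and the body then only has to live up to it. The function mean is a hypothetical example chosen for illustration; @brief, @param, @return, @throws and @p are standard Doxygen commands.

```cpp
#include <stdexcept>
#include <vector>

/**
 * @brief Computes the arithmetic mean of a list of samples.
 *
 * @param samples Values to average; must not be empty.
 * @return The sum of all samples divided by their count.
 * @throws std::invalid_argument if @p samples is empty.
 */
double mean(const std::vector<double>& samples) {
    // The comment promised an exception for empty input, so check that first.
    if (samples.empty()) {
        throw std::invalid_argument("mean: samples must not be empty");
    }
    double sum = 0.0;
    for (double value : samples) {
        sum += value;
    }
    return sum / static_cast<double>(samples.size());
}
```

Writing the @throws line first is what forces the question “what happens with an empty list?” before any code exists.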

Well tested

Write and use unit tests. If your code is well designed, chances are good that each function or class in it performs one specific task in a certain way, and nothing more. Given a certain input, it will reliably return the same output. Right? You have to make sure of that by writing test cases. Testing the smallest units of your program doesn’t guarantee that the whole works perfectly, but it helps. If possible, add a hook to your source code versioning system (SVN? Darcs?) so that the test suite runs automatically on the server that hosts your repository before it accepts your patch. This is quite easy with Darcs.
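
A minimal sketch of what such a test case can look like, using nothing but the standard assert macro rather than any particular testing framework. The function collapse_spaces and its test are hypothetical, made up for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical unit under test: collapses every run of spaces into one space.
std::string collapse_spaces(const std::string& text) {
    std::string result;
    bool previous_was_space = false;
    for (char c : text) {
        const bool is_space = (c == ' ');
        // Skip a space only when the character before it was also a space.
        if (!(is_space && previous_was_space)) {
            result += c;
        }
        previous_was_space = is_space;
    }
    return result;
}

// One assertion per behaviour: same input, same output, every time.
void test_collapse_spaces() {
    assert(collapse_spaces("") == "");
    assert(collapse_spaces("a b") == "a b");
    assert(collapse_spaces("a   b") == "a b");
    assert(collapse_spaces("  a") == " a");
}
```

Keep in mind that assert is compiled out when NDEBUG is defined, so tests written this way have to be built without it. A real test framework avoids that limitation.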

It never segfaults

Of course this point applies to languages that allow segmentation faults, or their equivalents such as NullPointerException in Java. It’s easy to get: if your code segfaults, there are no excuses. No matter how stupid the provided input was, your program should not segfault. A good practice is for each and every function or method to check its arguments before doing anything. A solid exception handling structure is required. Again, you can object that I’m not really saying anything useful here: “Of course programs shouldn’t segfault, I knew it!”, but think about it: it’s a matter of attitude. You want to write a perfect program, and there are some things you have to keep in mind. Being paranoid about segfaults will implicitly and quietly improve the general quality of your code, without you even noticing.
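
Here is a minimal sketch of that attitude, assuming a C++11 compiler (for nullptr). The helper max_element_checked is hypothetical: it validates its arguments up front and turns what would have been a crash into an exception that the caller can handle.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: returns the largest value in a C-style array.
// Both arguments are checked before any element is read.
int max_element_checked(const int* values, std::size_t count) {
    if (values == nullptr) {
        throw std::invalid_argument("max_element_checked: values is null");
    }
    if (count == 0) {
        throw std::invalid_argument("max_element_checked: count is zero");
    }
    int largest = values[0];
    for (std::size_t i = 1; i < count; ++i) {
        if (values[i] > largest) {
            largest = values[i];
        }
    }
    return largest;
}
```

The checks cannot catch everything (a wrong count with a valid pointer still reads out of bounds), which is one more reason to prefer containers that know their own size.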

Conclusion

Writing perfect code is impossible, especially as the code grows in size and in number of programmers. Achieving the impossible, then, is beyond any well-intentioned coder. What we can do, though, is try to have the right attitude, which is about precision, care and, sometimes, paranoia. Writing complex programs is not an easy thing and, as such, should be handled with extreme care.

Darcs - The source code management system of the future?

Dec 29, 2006

Having already mentioned some good practices for source code versioning, and how important versioning is in any case, I would now like to review and comment on what I find to be the best source code management system out there: darcs.

Darcs is a source control system written in Haskell (a functional language), and it rests on a very solid mathematical basis, being entirely engineered on top of a “patch theory”. Not only is darcs straightforward and very easy to use, and not only is it very interactive in a way that minimizes the chances of mistakes, but it also offers features that the popular SVN doesn’t have. Here I’m going to walk through some use cases, and show how things are easier with darcs.

A quick intro

Before analyzing the key features, let’s go through a quick tutorial. The easiest way to get darcs is to download a binary package. These packages contain a precompiled release of darcs, with everything needed statically linked inside. You only need to copy it to a directory in your $PATH, such as /usr/bin or /usr/local/bin. Of course you can download the source code and build it yourself if you want.

Let’s now create a simple Hello World project, and use darcs to version it.

$ mkdir $HOME/projects/HelloWorld
$ cd !$
$ darcs init

darcs init will create all the files necessary to source-control the code. You will find a new directory named _darcs.

Now we can write our HelloWorld.cc main file:

#include <iostream>

using namespace std;

int main(void) {
    cout << "Hello World!" << endl;
    return 0;
}

Time to add the file to version control.

$ darcs add HelloWorld.cc

Alright, now we really get into darcs. First of all, in case you didn’t notice, darcs doesn’t really need any server at the other end, like SVN would need an SVN server, or CVS would need a CVS server. This means no hassle in installing and configuring a server. Later we will see how darcs manages collaboration with remote users.

Now it’s time to save our changes to the repository.

$ darcs record

Darcs needs to know what name (conventionally an email address) to use as
the patch author, e.g. 'Fred Bloggs <fred@bloggs.invalid>'.  If you provide
one now it will be stored in the file '_darcs/prefs/author' and used as a
default in the future.  To change your preferred author address, simply
delete or edit this file.

What is your email address?  Salvatore Iovene <salvatore@invalid.com>

addfile ./HelloWorld.cc
Shall I record this change?(1/?)[ynWsfqadjkc], or ? for help: y

hunk ./HelloWorld.cc 1
+#include <iostream>
+
+using namespace std;
+
+int main(void) {
+   cout << "Hello World!" << endl;
+   return 0;
+}
+
Shall I record this change?(2/?)[ynWsfqadjkc], or ? for help: y
What is the patch name? First record.
Do you want to add a long comment? [yn] n
Finished recording patch 'First record.'

Some points worth inspection here:

Why did darcs want to know my email address? That’s because everything you commit (commits are called patches) will be known as coming from you. If you’re working with several people, darcs has to know who is committing what. Furthermore, people downloading your repository can, e.g., make some changes and improvements, and then issue a darcs send, which will send you the patch via email so that you can evaluate it and decide whether to apply it.
What is a hunk? A hunk is a piece of a patch, i.e. a certain modification in some source file. If you have a large file, foo.c, and modify a certain function bar() at the beginning of the file, and then a certain other function tar() at the end of the file, this will result in two hunks. What’s the advantage of all this? Since darcs is so interactive, you may decide either to record both hunks in the same patch, answering ‘y’ to both, or realize that they logically belong to two different patches, and answer ‘y’ to one of them and ‘n’ to the other. Then, after you finish recording the first patch, you issue a darcs record again and record the other hunk in a separate patch, with a separate name, that forms a logical unit per se.

Now let’s make a small change.

#include <iostream>

int main(void) {
    std::cout << "Hello World!" << std::endl;
    return 0;
}

As you can see, we have removed the using namespace std; declaration, and added the std:: namespace prefix to cout and endl. A very important darcs command is whatsnew, which shows us how the working copy differs from the repository.

Running darcs whatsnew at this point reports two hunks, as expected. Let’s record the changes.

$ darcs record
hunk ./HelloWorld.cc 3
-using namespace std;
-
Shall I record this change?(1/?)[ynWsfqadjkc], or ? for help: y

hunk ./HelloWorld.cc 4
-   cout << "Hello World!" << endl;
+   std::cout << "Hello World!" << std::endl;
Shall I record this change?(2/?)[ynWsfqadjkc], or ? for help: y
What is the patch name? Removing the std namespace declaration.
Do you want to add a long comment? [yn]n
Finished recording patch 'Removing the std namespace declaration.'

Obviously those two hunks must form one single patch, because we don’t want any patch to leave the repository in a broken state. Now we get to the cool stuff. Darcs lets you unrecord your changes, i.e. interactively roll back patches until you are satisfied. We might change our mind about the last patch, and think that using namespace std; is not that bad after all. No problem.

$ darcs unrecord

Fri Dec 29 12:53:32 EET 2006 Salvatore Iovene <salvatore@invalid.com>
* Removing the std namespace declaration.
Shall I unrecord this patch?(1/2)[ynWvpxqadjk], or ? for help: y

Fri Dec 29 12:37:33 EET 2006 Salvatore Iovene <salvatore@invalid.com>
* First record.
Shall I unrecord this patch?(2/2)[ynWvpxqadjk], or ? for help: n

Finished unrecording

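Now the repository is back as if that patch had never been recorded. Note that unrecord only removes the patch from the repository: the edit itself stays in the working copy as an unrecorded change, and darcs revert can discard it if you really want the old file back.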

Imagine you want to have a copy of your repository, maybe on a different partition of your disk, or maybe on a USB storage drive:

$ cd ..
$ mkdir RepoCopy
$ cd RepoCopy/
$ darcs init
$ darcs pull ../HelloWorld/

Fri Dec 29 12:37:33 EET 2006 Salvatore Iovene <salvatore@invalid.com>
* First record.
Shall I pull this patch?(1/1)[ynWvpxqadjk], or ? for help: y
Finished pulling and applying.

Another directory is not the only place you can move your repository to: you can use SSH to copy it to another machine, and HTTP to fetch it. This is actually the way you handle collaboration. Imagine you have a server somewhere, named www.server.com, where you want to keep a central repository through which you can collaborate with your development peers.

$ darcs push username@www.server.com:/var/www/htdocs/HelloWorld/repo

This will ask you which patches you want to push to that server, one by one, in the usual darcs interactive mode. I’m assuming that the directory /var/www/htdocs/HelloWorld/ on the server hosts the http://www.server.com/HelloWorld/ website. Everybody can now get a copy of your project just by doing this:

$ darcs get http://www.server.com/HelloWorld/repo

And anybody with an account on that server will be able to push patches, provided of course that they have write permission on the directory where the repository is.

Where to go from here

Here follow some must-read links if you’re interested in darcs. Probably in the future I will write more about it. Thanks for reading.

Getting a project done using clever design

Dec 27, 2006

Sometimes, when coding a one-person project at work or for fun, each one of us has found himself stuck at some point, and the project eventually died out. Let’s have a look at some design and discipline recommendations that can help us achieve our goal.

Who doesn’t have a dream hidden in a drawer? Some of us coders dream about programs we’ve been thinking about for a long time and perhaps never found the time to write. Now that we have finally decided to settle down and write the thing, we had better not hurry. Several complications might arise while creating the project, and most of them can be handled by dealing with two issues: design and discipline. Let’s examine some of the possible issues.

Technical difficulties.

Quite likely, at some point in the development, we will run into an obscure obstacle in the form of a very tough technical issue. That is, we might think that it just can’t be done. Well, sometimes it really can’t, but usually it’s a matter of technologies and the proper use of them. There are basically two ways to address this kind of problem.

Read up.

Once you’ve determined what technologies and libraries you’re going to use, read up on them. “I’m going to learn them as I build up my project” is not a good strategy. You will probably realize that those are not the right technologies after all, or that you’re not using the right libraries. Or, most commonly, that you’re using them in the wrong way. And by then it could be too late. This is a very common cause of project failure: you hit an obstacle that seems insurmountable, waste a lot of time, and eventually lose motivation. Start by reading the websites of the technologies and libraries you want to involve, then read their documentation. You don’t need to read all the APIs, of course, but you ought to read at least the technical overviews, the white papers and the general design perspectives. If possible, buy a book about them and read it. Of course this will delay the start of your project, but that’s better than investing 2 months in it and then having to trash everything. Typical cases involve starting with a new language or a new framework. If you’re new to Hibernate, don’t expect to start using it just by reading the Quick Start Tutorial. Most things, especially nowadays, aren’t just about using them. It’s mostly about understanding the big picture and knowing the right angle from which to approach them. The details will always come afterwards. Never underestimate the importance and the power of a good book on a certain matter. After having read up enough, you’re ready to start thinking about your application.
Design.

Never fail to design your application (thoroughly) beforehand. It might seem silly for small applications, but it never is. Draw a diagram of the main components of the system, and remember: always divide and conquer. Start by sketching a very big picture of the system. Ask yourself what components are involved, what technologies you’re going to use, what libraries you’re going to integrate. Once you have a clear idea of what the big picture is, you can start adding details. You can plan your database, for instance. Start by identifying what the main components are and how they interact with each other. Then you’re ready to design the database in more detail, i.e. specifying each field and relation. That will also help you add more detail to the big picture of your application design. Now start considering all the components and their interactions. Define the objects you need and the way they communicate. Draw models and objects, possibly using UML. Be very careful and see to it that your model makes sense. Glitches and problems might (and should) already come up at this point. If designing your application takes less than 6-12 hours, then it’s either very simple, or you’re being superficial. Of course if we’re talking about Hello World, then you don’t need that much design, but this should be clear already!

If you’ve taken care of all this, the odds are with you and there is very little chance of failing because of technical issues. Now it should just be about writing the actual code, and you already know that it can be done, because you read up and designed the thing properly.

Scattered code

Although Spaghetti Code was The Way a long time ago, Object Oriented Programming has now been around long enough that we should all be able to write decent Object Oriented code. Seriously, if you’re still writing procedural Spaghetti Code, you should really read some Object Oriented Programming books and start living in the present. If you don’t even know what Spaghetti Code is, chances are that it’s what you write, so Google it. Having well structured, layered and maintainable code will help you write less code, write better code, and finish your project sooner. This should follow automatically from a good design, but it’s worth spending a few words on it. The more you are able to do the following, the better your project will go: write a class, double-check it, test it, never touch it again. This is the divide and conquer paradigm. Once some parts of your project are actually finished, you will find a lot more motivation to continue. Having to continuously modify parts of your code, going back and forth on those changes again and again, is not only frustrating, but also damages the very structure of your project. I’m sure you have been through more than one session of “refactoring” your code: moving pieces around, redesigning classes, putting things in order, etc. This means that you’re wasting time that you could have used to add new features or to get closer to something that’s releasable.

Lack of deadlines

Even if you’re working alone on a hobby project, set yourself deadlines and a roadmap. Reserve some time each day in which you can work on your project. If you already know what version 0.3 will have, and what version 0.4 will look like, it’s more likely that you will stick to that. Give yourself small objectives and follow them, one after the other. Don’t push yourself too much, though.

Lack of professionalism

This is a very important point, even though you may think it’s not. Even if you’re working all by yourself, whether it is a work project or a hobby one, always be professional. Remember the following rules.

Use a source versioning system.

I don’t care if it’s CVS, SVN, Darcs, Git or one of the many others out there. Use one. Commit your work and keep track of the changes. This will not only help you avoid losing important code and keep your work properly tracked: it will also give you discipline and motivation. Read some best practices for SVN and other versioning systems.
Use a bug tracker.

There’s a vast choice: Bugzilla, Trac and the Bug Genie are only a few of them. Bug tracking is terribly important. Maybe not in the very initial phase of your project, but as it becomes usable, bug tracking can’t be neglected. Even if you’re working alone. Track the bugs scrupulously, as if you were working on the most important project in the world. Bugs tend to be forgotten after a few hours, or a few days in the best case. This is also something that will keep you motivated: acting professionally.
Keep a website and build up a community.

I assume that your project would be interesting to someone. The most motivating thing ever is receiving encouragement and feedback from strangers. Keep a website for your project and let the community know what’s going on. Release it as soon as it’s ready and you’ll get feedback, and possibly help. This is really highly motivating.

Lack of discipline

Another big killer of unfinished projects. Stay away from your project for more than one month, and most of the time it’s over. After one month, you will have forgotten a lot of things about it, and you just can’t get your focus back or find your way through it again. The more time goes by, the more detached from it you’ll get, and it will eventually end up in the well of forgotten and unfinished projects. To find the right discipline, force yourself to work on it at least twice a week, for 2 hours each time. Of course, this depends highly on how much you care about finishing your project, but we’re discussing this on the assumption that you care a lot.

Keep all of these hints in mind, and with a good dose of determination you can accomplish anything.
