Archive for April 2007

5 Good Reasons you should be business blogging

Tim Bray makes some good points about business and blogging and in particular how blogging has helped Sun.

See the original article here

  1. It’s helped improve Sun’s image. Three years ago we were seen as a big faceless lawyer-bound monolith; now the world sees that this is in fact an unruly tribe of people, many of them really bright, maniacally focused on the tech and biz of IT.

  2. Executives love being able to get their message out without having to route it through a journalist’s or analyst’s filtering function.

  3. We keep hearing anecdotal stories from salespeople about being able to get in front of some prospect, or route around some obstacle, because of something someone read on one of the blogs.

  4. We listen better. Like Bill Joy said, “Wherever you work, most of the smart people are somewhere else.” If I’m a smart person in Cleveland or Shanghai or Warsaw or Lima and I get a bright idea about something Sun should be doing, or notice with horror that Sun is doing something stupid, there’s no obvious way for an individual to talk to a big California computer company. On the other hand, if I’m reading some Sun blogger who writes about what I care about and I know the first.last@sun.com rule, it’s the work of minutes to fire off an email. I get these all the time and I bet there are a hundred or two a month, in aggregate across the bloggers.

  5. The morale-boost has been tremendous. Right at the moment, less than 10% of the workforce are actually committed bloggers to the extent of posting once a week or more; but the uplift from knowing that if you have something to say, it’s OK with the company for you to just go and say it to the world, that’s huge. Ask anyone who works here.

So get blogging! I hope I’m not quoting too much of his entry. I will say I’m not a big fan of JRoller: we looked at it in the early days of BlogMatrix, and the code base was rather disorganized. On the hardware we had, Roller’s performance was not up to snuff either. That was 2+ years ago, so I’m sure things have changed since then.

The Venn Diagrams of SQL Joins

This is a great post that illustrates SQL joins using Venn diagrams. Very nice.

See it here!

Reasons to switch from CVS to SVN

A number of colleagues have asked me about SVN and whether it is worth switching from CVS. The short answer is YES! So, in no particular order, here’s my list of reasons to switch, plus a few things that aren’t so great about SVN.

  1. One of the big questions I get is “can you import your CVS history?” The answer is that it is possible, though I’ve never tried it. The reality is that about 6 months after migrating you probably won’t need your history anyway (if you’re lucky), and remember that you always have the CVS repository to query. See this link for more info: http://cvs2svn.tigris.org/
  2. SVN has ssh support built in, so no more opening an ssh tunnel in a separate process, a big plus if your dev server is somewhere else
  3. SVN has nice pre- and post-commit hooks, which are much cleaner to use than their CVS equivalents
  4. SVN is language agnostic, but has great support for python and many other languages from a tools perspective
  5. SVN has a very nice versioning scheme which is a little more sensible and it works, CVS’s scheme always seemed a little broken, especially when you started branching.
  6. You can remove directories and files very easily in SVN. Try removing a directory in CVS without being an administrator of the cvs root.
  7. No more #cvs add -kb for binary files (or forgetting the -kb). SVN has great handling for binary files
  8. There is ongoing active development of SVN, with an emerging ecosystem of svn tools and integration with most if not all leading IDEs (CVS development has stagnated IMO)
  9. The SVN server supports existing authentication and authorization infrastructure via the Apache based server, protocols supported: LDAP, Active Directory, NTLM, X.509, etc.
  10. SVN has good support for UTF-8
  11. SVN has very powerful per-file properties for things like line endings, keywords, and mime-type. You can also add your own arbitrary metadata.
  12. SVN has smart branches that don’t make explicit copies of files until you start changing things, great if you have large projects.
  13. All the basic CVS commands you are used to work in SVN, plus SVN has the svn status command, which is the equivalent of, and much easier to type than, cvs -q -n update
  14. There’s a Tortoise client for SVN, for all you graphical types

A few things that aren’t easy enough to do…

  1. SVN uses a property to handle file ignores. That is easy when applying a .cvsignore file from a converted project, but generally awkward on a one-by-one basis. To use an existing .cvsignore file: # svn propset svn:ignore -F .cvsignore . To ignore a single file: # svn propset svn:ignore "filename" . (note that propset replaces any ignore list already set on that directory). Or you can edit the ignore properties with # svn propedit svn:ignore "dir_name"
  2. Setting up keywords like $Id:$ is harder than it should be and the feature should be on by default. You need to set the global setting per developer or do it file by file (yuch.) To set Id and Revision on an individual file: svn propset svn:keywords "Id Revision" your_file.ex

The #1 rule for your business home page

I just saw the “10 rules for your small business home page” via digg, and in this Web 2.0 world I would say the #1 rule for your business is don’t do it yourself, start a blog! Keep your news and information up to date and then read the “10 rules for your …”, and by the way visit Onaspot.com real soon now.

At Onaspot.com you will be able to set up your small business blog along with your team and contacts to keep updated with all your latest news.

You can see Onamine.com now. It is a prime example of what Onaspot.com will be like: Onamine.com is a vertical web site for junior mining and resource exploration companies.

Amazon S3 to Apache Common Log Format Converter

I put this together for a client a couple of months ago, and I’m just getting around to blogging about it now. At the time I couldn’t find any tools that might make this task easier. I assume folks who build web stats analyzers will deal with the S3 format natively, eventually.

Until that time, here’s a little converter to make Amazon S3 logs understandable to your favourite web log analyzer.

The work is done by one gnarly regular expression, which was easy to put together with the help of Pyreb.

You can download the python source here, and as always your mileage may vary.

Democamp 13

Date Tuesday, April 24, 2007
Time 6:30pm to 8:30pm
Location No Regrets 42 Mowat Ave
Expected Attendance All are welcome!

The space is on the small side so earlier is better.

You can see all the details here. 

The Joys of DirectIO

It’s been a long time since I’ve had to do any serious C programming, but sometimes you need to get close to the metal.

We’ve been having an issue with log files poisoning our file cache by sheer volume of useless pages being written and saved in cache therefore pushing good pages out of cache. Yes we have fairly verbose log files and yes we can change the log level, but we have so many different log files being written it’s kind of pointless.

So after a little research I came across DirectIO, which means writing directly to the disk, bypassing the file cache. Now, before I go any further: for the Linus fanboys/purists, I tried doing all this with posix_fadvise and unfortunately it doesn’t seem to be honoured on Fedora Core 5, more’s the pity.
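For reference, here is a minimal sketch of what the posix_fadvise route can look like. It is not the code I actually ran: the helper name log_write_uncached is made up for illustration, and the approach (flush with fdatasync, then advise POSIX_FADV_DONTNEED over the whole file) is an assumption about how you would ask the kernel to drop freshly written log pages. The call is only advice, so the kernel may still keep the pages.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes posix_fadvise and fdatasync */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to fd, then ask the kernel to drop
 * the file's pages from the cache.
 * Returns 0 on success, or an errno-style error number on failure.
 */
int log_write_uncached(int fd, const char *data, size_t len)
{
    size_t done = 0;

    assert(data != NULL);

    /* write() may be short or interrupted, so loop until everything is out */
    while (done < len) {
        ssize_t n = write(fd, data + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return errno;
        }
        done += (size_t)n;
    }

    /* dirty pages cannot be discarded, so flush them to disk first */
    if (fdatasync(fd) != 0)
        return errno;

    /* offset 0 with length 0 means "the whole file";
     * posix_fadvise returns the error number directly, not via errno */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

You would call this wherever the logger currently calls write(). Flushing on every write is expensive, so a real logger would probably batch several lines per call.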

So off to scraping the metal with O_DIRECT. After looking at a few web pages it seemed like this was going to be a cake walk; unfortunately that wasn’t my experience. There is also a dearth of complete examples on how to do direct IO. So, you can have the source if it will help you. There are a number of issues I ran across, like having to align buffers in memory for DirectIO. You can get a tar of the source here.

I apologize for how ugly this code snippet looks, but TinyMCE insists on reformatting any variation of raw HTML I try to use. A fix is in the works for this one.

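#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*****
 * Write one page to disk. Direct IO requires the in-memory buffer to be
 * aligned, so the caller's data is copied into a PSIZE-aligned scratch buffer.
 * PSIZE is defined elsewhere in the source and must be a power of two.
 */
void bufwrite( char *buf, int fd )
{
    /* over-allocate so an aligned PSIZE region fits somewhere inside */
    char *buf_unaligned = malloc( PSIZE + PSIZE - 1 );
    if( buf_unaligned == NULL ) {
        perror( " bufwrite:" );
        return;
    }
    /* round the address up to the next PSIZE boundary */
    char *buf_aligned = (char *) (((uintptr_t)buf_unaligned + PSIZE - 1) & ~((uintptr_t)PSIZE - 1));

    memcpy( buf_aligned, buf, PSIZE );
    ssize_t result = write( fd, buf_aligned, PSIZE );
    if( result < (ssize_t)PSIZE ) {
        fprintf( stderr, "\nbufwrite:Error bufwriting result = %zd\n", result );
        perror( " bufwrite:" );
    }
    free( buf_unaligned );
}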
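If you would rather not align by hand, posix_memalign will hand back an aligned buffer directly. The sketch below is not from the tarball: the helper names page_alloc, direct_open and direct_write_page are invented for illustration, and the 4096-byte PSIZE is an assumption about the block size. It also shows the open() side, which the snippet above leaves out.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* assumption: fall back to buffered IO where O_DIRECT does not exist */
#endif

#define PSIZE 4096             /* assumed block size; must be a power of two */

/* Hypothetical helper: return one zeroed, PSIZE-aligned page, or NULL.
 * Release it with free(). */
char *page_alloc(void)
{
    void *p = NULL;

    assert((PSIZE & (PSIZE - 1)) == 0);
    if (posix_memalign(&p, PSIZE, PSIZE) != 0)
        return NULL;
    memset(p, 0, PSIZE);
    return p;
}

/* Hypothetical helper: open path for direct writes.
 * Returns -1 on failure; some file systems reject O_DIRECT with EINVAL. */
int direct_open(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
}

/* Write exactly one aligned page. With O_DIRECT the buffer address, the
 * write length and the file offset generally all need to be block-aligned.
 * Returns 0 on success, -1 otherwise. */
int direct_write_page(int fd, const char *page)
{
    ssize_t n = write(fd, page, PSIZE);
    return n == (ssize_t)PSIZE ? 0 : -1;
}
```

Typical use is to call page_alloc() once, fill the page with log data, and reuse it for every direct_write_page() call, which avoids the malloc and memcpy per write that bufwrite does.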

I think the better solution is to have a dedicated partition marked as DirectIO, but it would appear in Fedora that you will need to install a newer file system like ReiserFS or ZFS (as far as I can tell).

Other possibly useful DirectIO links: