Archive for the ‘Perl’ Category

Why it is good to know a scripting language

March 24, 2009

Recently I was porting some code from Perl to Java. There was a huge snippet of Perl code in which a hash was defined and data was added to it. I had to use the same data in my Java code as well. This was around 80 lines of code.

To give an example:
From
$abrv{av} = 'avenue';
to
abbrevationsMap.put("av", "avenue");

I messed around with Eclipse search/replace for some time to do the transformation but could not get it done. So I copied the 80 lines to a file, opened up gvim and wrote the script below to do the transformation.

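use strict;
use warnings;

open(RH, "<", "foo.txt") or die "Cannot read foo.txt: $!";
open(WH, ">", "bar.txt") or die "Cannot write bar.txt: $!";

# Convert the input line by line, skipping lines that sanitise() blanks out.
while (<RH>) {
	chomp;
	$_ = sanitise($_);
	next if $_ eq "";
	print WH "$_\n";
}

close(RH);
close(WH);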

sub sanitise {
	my $string = shift;
	if ($string =~ m/^#/) {
		return "";
	}

	$string =~ s/\$abrv/abbrevationsMap.put/;
	$string =~ s/'/"/g;
	$string =~ s/;/);/g;
	$string =~ s/\{/("/;
	$string =~ s/}/",/;
	$string =~ s/\s+=//;
	return $string;
}
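
To see what each substitution contributes, here is a small standalone sketch that runs the same steps on the sample line and on a comment line. The name to_java_put is made up for this illustration; it is not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for sanitise(): same substitutions, same order.
sub to_java_put {
    my $string = shift;
    return "" if $string =~ m/^#/;             # Perl comment lines are dropped

    $string =~ s/\$abrv/abbrevationsMap.put/;  # hash name becomes the Java call
    $string =~ s/'/"/g;                        # single quotes become double quotes
    $string =~ s/;/);/g;                       # close the call before the semicolon
    $string =~ s/\{/("/;                       # opening brace opens the call and the key
    $string =~ s/}/",/;                        # closing brace ends the key
    $string =~ s/\s+=//;                       # drop the assignment operator
    return $string;
}

my $converted = to_java_put(q{$abrv{av} = 'avenue';});
my $skipped   = to_java_put(q{# street types});

print "$converted\n";   # prints: abbrevationsMap.put("av", "avenue");
```

The order matters: the quotes and the semicolon are rewritten before the braces, so the quotes that the brace substitutions introduce are never touched again.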

I can hear people scream “I can do the same thing in Java”, etc. etc. I totally agree with you folks, but you can do the same in Perl in far fewer lines of code and without a heavyweight IDE.

Script for bulk CVS commit

March 12, 2009

CVS sucks.

1. There is no way to add directories in bulk (cvs add does not recurse into subdirectories).
2. There is no way to delete directories.

At work I was developing a POC for search using Lucene. I kept working on it and it worked out pretty well. My manager asked me to check it into revision control as it looked really promising. OK, I agree, I should have checked it into revision control from day one, but I did not (cursing myself). Now CVS does not allow me to check in files in bulk: I have to go and add each folder individually and then add and commit the files one by one. I had neither the patience nor the inclination to do this, so I scripted it up as below.


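# List everything below the current directory; find prints parents before children.
my @files = map {
  chomp;
  $_;
} `find .`;

# Drop the current directory itself and CVS's own administrative directories.
@files = grep {
  $_ ne "." and $_ ne ".." and $_ !~ m{(^|/)CVS(/|$)};
} @files;

# Directories have to be added before the files inside them.
my @dirs = grep {
  -d $_;
} @files;

system(qq{cvs add $_}) for (@dirs);

# Keep only the plain files.
@files = grep {
  not -d $_;
} @files;

for (@files) {
  system(qq{cvs add $_});
  system(qq{cvs commit -m"First version of lucene based search framework" $_});
}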

PS: This will also commit the script file itself. If you do not want that to happen, add a condition that filters the script file out of the list.
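
One possible way to write that condition is sketched below. The helper name without_script and the file names are made up for illustration. Both paths are made absolute before comparing, because find prints "./commit.pl" while the script may have been started as "commit.pl".

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return @paths without the entry that refers to $script.
sub without_script {
    my ($script, @paths) = @_;
    my $self = File::Spec->rel2abs($script);
    return grep { File::Spec->rel2abs($_) ne $self } @paths;
}

# Example paths in the form `find .` prints them; "commit.pl" is an assumed script name.
my @found  = ("./commit.pl", "./src", "./src/Search.java");
my @to_add = without_script("commit.pl", @found);

print "$_\n" for @to_add;
```

In the script itself this would be called as without_script($0, @files), since $0 holds the name the script was started with.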

Perl quirks

March 10, 2009

I have been coding in Perl for a few months now, and below I list a few things which I find quirky. This is not an attempt at flaming or trolling or an “X language is better than Y language” sort of argument. I will keep adding to this list as I come across something new. So, think of this as a growing document.

Perl has a rich set of libraries in the form of CPAN, which is a life saver, but as there is no single coding standard associated with Perl, the library APIs all have their own naming conventions. For example ThisIsMethod(), this_is_method(), thisIsMethod(), thisismethod(), etc. When you include many libraries in your code, following your own coding standard becomes difficult. Contrast this with Java where, by convention, all class names begin with a capital letter, method names are in camel case, etc.

Perl has the concept of default variables to reduce the burden of typing variable names.

my @array = (1, 2 ,3);
print $_ for (@array);

Here $_ is the default variable for for loops, which “POINT”s to the current array element. (The emphasis on POINT will become clear as you read on.) The loop sets this default variable only if you have not named a loop variable of your own.

my @array = (1, 2 ,3);
for my $num (@array) {
print $_;
print $num;
}

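Here the loop does not set $_. It keeps whatever value it had before the loop, which is undef if nothing has assigned to it.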
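
The difference between the two loop forms can be made visible by giving $_ a value before the loops. This is only a small sketch; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
$_ = "outer";

# Named loop variable: the loop leaves $_ alone.
my @named;
for my $num (@array) {
    push @named, "$_:$num";
}

# No loop variable: $_ stands for the current element.
my @implicit;
for (@array) {
    push @implicit, $_;
}

# After the loop, $_ has its old value again.
my $after = $_;

print "@named | @implicit | $after\n";
# prints: outer:1 outer:2 outer:3 | 1 2 3 | outer
```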

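You can also declare your own variable with the same name as the default variable. (This worked in the Perl 5.10 of the time; a lexical my $_ was later marked experimental and removed in Perl 5.24.)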


my $_ = 0;
my @array = (1, 2 ,3);
for (@array) {
print $_;
}

What do you think $_ inside the loop prints? If you think it is 0, think again. The default variable $_ inside the loop shadows your declared variable $_, so the loop prints 1, 2 and 3. Once you come out of the loop block, $_ is back to 0.

In Perl, changing one variable does not change any other variable (leaving references out). But this does not hold for the loop variable of a for loop: it is an alias for the current array element, so changing it changes the array. Check out the code snippet below:

my @array = (1, 2 ,3);
for (@array) {
$_++;
}

Now your array has (2, 3, 4).

I hope the emphasis on POINTs earlier in this post is clear now.
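
If you do not want the array to change, work on a copy or build a new list with map. The sketch below shows both next to the in-place version; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

# Modify a copy: the loop variable aliases elements of @copy, not of @array.
my @copy = @array;
$_++ for @copy;
my $before = "@array";                  # still "1 2 3"

# map only reads $_ here and returns new values.
my @plus_one = map { $_ + 1 } @array;   # (2, 3, 4)

# Modifying $_ directly writes through the alias into @array.
$_ *= 10 for @array;
my $after = "@array";                   # now "10 20 30"

print "$before | @copy | @plus_one | $after\n";
```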

Perl has a function called defined to check whether a variable is defined or not.

my $var = 0;

Now defined($var) is true.

But

my @array = ();
my %hash = ();

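Testing either of the above with defined gives false. (Calling defined on a whole array or hash was later deprecated and is a fatal error from Perl 5.22 onwards; to test for emptiness, use the array or hash itself in a boolean context.)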
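
For completeness, here is a short sketch of how emptiness is usually tested without calling defined on the whole array or hash. The variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $var   = 0;
my @array = ();
my %hash  = ();

my $var_defined = defined($var) ? 1 : 0;   # 1: zero is still a defined value
my $array_empty = @array ? 0 : 1;          # 1: an empty array is false in boolean context
my $hash_empty  = %hash  ? 0 : 1;          # 1: the same goes for an empty hash

# An array holding a single undef is not empty: it has one element.
push @array, undef;
my $count = scalar(@array);                # 1

print "$var_defined $array_empty $hash_empty $count\n";   # prints: 1 1 1 1
```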

Eliminating duplicates from a list

January 29, 2009

Say I have a list as follows

(1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5)

I want to remove duplicates from my list, i.e. I want the above list transformed to

(1, 2, 3, 4, 5)

The following Perl code is designed to do exactly that:

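sub removeDuplicates {
  my (@points) = @_;
  my $noOfPoints = scalar(@points);

  for (my $j = 0; $j < $noOfPoints; ++$j) {
    for (my $i = 0; $i < $noOfPoints; ++$i) {
      next if $i == $j;
      # Skip slots that are already deleted; otherwise undef == 0 would match.
      next unless defined $points[$i] and defined $points[$j];
      delete $points[$j] if $points[$i] == $points[$j];
    }
  }

  # Deleted slots hold undef; drop them.
  @points = grep {defined $_} @points;
  return @points;
}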


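The main thing to note here is that when you delete an element from a list, undef takes its place in the list (unless it is the last element). As far as I know, this is the best way to delete elements from a list while iterating over it, as it does not change the size of the list. Note that perldoc discourages calling delete on array elements; assigning undef to the slot has the same effect here.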

Say your list contains something other than numbers (a hash or an object) and the comparison depends on some other criterion. Then you could encapsulate the comparison in a function, pass a reference to that function to the method, and call it while evaluating the criterion for elimination.
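
A minimal sketch of that idea follows. The name removeDuplicatesBy and the x/y points are made up for illustration, and unlike the code above it builds a new list instead of deleting in place, keeping the first occurrence of each element.

```perl
use strict;
use warnings;

# Hypothetical variant: $isSame is a code reference that takes two elements
# and returns true when they count as duplicates.
sub removeDuplicatesBy {
    my ($isSame, @items) = @_;
    my @unique;

    ITEM: for my $item (@items) {
        for my $kept (@unique) {
            next ITEM if $isSame->($item, $kept);   # already have an equal one
        }
        push @unique, $item;
    }
    return @unique;
}

# Points are hash references; two points are equal when both coordinates match.
my @points = ({ x => 1, y => 2 }, { x => 1, y => 2 }, { x => 3, y => 4 });
my @unique = removeDuplicatesBy(
    sub { $_[0]{x} == $_[1]{x} and $_[0]{y} == $_[1]{y} },
    @points,
);

print scalar(@unique), "\n";   # prints: 2
```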

Edit:

A faster version of the for loop:

  for (my $j = 0; $j < $noOfPoints - 1; ++$j) {
    for (my $i = $j + 1; $i < $noOfPoints; ++$i) {
      if ($points[$i] == $points[$j]) {
        delete $points[$j];
        last;  # $points[$j] is gone, no need to compare it any further
      }
    }
  }
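
Both loops above still compare elements pairwise. A different technique, not the one used in this post, is the common idiom of remembering every value already seen in a hash, which needs only one pass over the list. A sketch, with uniq_seen as a made-up name:

```perl
use strict;
use warnings;

# Keep an element only the first time its value shows up.
sub uniq_seen {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @unique = uniq_seen(1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 5);

print "@unique\n";   # prints: 1 2 3 4 5
```

Hash keys are strings, so this compares the string form of each element. That is fine for plain numbers and strings but not for the hash or object case described above. Recent versions of List::Util also provide a uniq function that does the same job.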

Perl file syntax check

December 30, 2008

If you want to check a Perl file for syntax errors without running the script, use the -c command line switch (note that BEGIN blocks and use statements are still executed):

perl -c foo.pl
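
The same check can be driven from a script, for example to go over several files. The sketch below assumes a Unix-like shell (for the 2>&1 redirection); the helper name syntax_ok and the temporary files are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: true when `perl -c` accepts the file.
# $^X is the path of the perl binary running this script.
sub syntax_ok {
    my ($file) = @_;
    my $output = `"$^X" -c "$file" 2>&1`;   # -c reports on STDERR
    return $? == 0;
}

# One file that compiles and one with a deliberate syntax error.
my ($good_fh, $good) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$good_fh} "print qq{hello\\n};\n";
close $good_fh;

my ($bad_fh, $bad) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$bad_fh} "my \$x = ;\n";
close $bad_fh;

for my $file ($good, $bad) {
    print "$file: ", (syntax_ok($file) ? "syntax OK" : "syntax error"), "\n";
}
```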