Myth: Perl is only for parsing text files

Question

I keep seeing people trip over common misconceptions about what Perl is and what it does.

There are generally two types of Perl myth.

Type 1: Things people believe about the language itself that are not true.
Type 2: Behaviors people exhibit when using the language, which generally derive from a lack of common sense.

The first type is fairly straightforward; the second might need a little more explanation. An example is the kind of behavior that inevitably results in people reinventing wheels because they don't understand what CPAN is or how it works, such as the common "I'm not allowed to install modules" argument.

This question is in the spirit of these related Perl questions:

perl common gotchas ^[1]
hidden features of perl ^[2]

What common myths do you keep seeing in Perl?

Since debunking a myth requires evidence, it's probably best to use an assertion/explanation/reference format of some sort.

[1] http://stackoverflow.com/questions/166653/perl-common-gotchas
[2] http://stackoverflow.com/questions/161872/hidden-features-of-perl

Answer 1

Contrary to popular belief, when a cat walks over one's keyboard, the resulting characters seldom produce executable Perl code.

Answer 2

MYTH: I'm not allowed to install modules

The Basics

Generally this is not the case: modules are usually just text files.

You just put these text files on your system, and then tell Perl how to find them.

If you have a system where you are not allowed to write text-files, period, then you are not programming anything.

If management decrees that you are not allowed third-party code of any form, then you may have an argument, and possibly poor management. Where a module requires compiled binary components, the objection is quite understandable, but there are generally good alternatives in pure Perl.

The Alternatives

This is a wonderful guide on how you can set up locally visible modules in Perl without requiring any form of root account.

http://sial.org/howto/perl/life-with-cpan/non-root/

For the most trivial of modules, you can install only the very basic parts you need by hand.

cd /project
mkdir lib
cd lib
# Say the 'package' in the file is "Foo::Bar::Baz"
mkdir -p Foo/Bar
cd Foo/Bar
wget "$url" -O Baz.pm   # $url is wherever the module's .pm file lives
cd /project

And there you have it: you've installed a module into the lib directory of whatever project needs it. It's admittedly a bit messy to do it that way, but it works.

All that's needed in your code is:

#!/usr/bin/perl
#   /project/foo.pl
#
use strict;
use warnings;
use FindBin qw($Bin);
use lib "$Bin/lib";   # resolve lib/ relative to this script, not the current directory
use Foo::Bar::Baz;    # Works!

And it should be smooth from there.
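
To see that a module really is just a text file that Perl finds through its search path, here is a minimal, self-contained sketch. It assumes nothing beyond core Perl. The package name Foo::Bar::Baz and its hello() function are hypothetical, and a temporary directory stands in for /project/lib.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# A throwaway directory standing in for /project/lib.
my $lib = tempdir(CLEANUP => 1);

# "Install" the hypothetical module by writing a plain text file.
my $dir = File::Spec->catdir($lib, 'Foo', 'Bar');
make_path($dir);
my $file = File::Spec->catfile($dir, 'Baz.pm');
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'END_MODULE';
package Foo::Bar::Baz;
use strict;
use warnings;
sub hello { return "hello from Foo::Bar::Baz" }
1;
END_MODULE
close $fh or die "Cannot close $file: $!";

# Tell Perl where to look; "use lib" does roughly this at compile time.
unshift @INC, $lib;
require Foo::Bar::Baz;

my $greeting = Foo::Bar::Baz::hello();
print "$greeting\n";
print "Loaded from: $INC{'Foo/Bar/Baz.pm'}\n";
```

The %INC entry printed at the end shows which file Perl actually loaded, which is handy when you are unsure whether your local copy or a system-wide one is being picked up.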

Not Using CPAN is a Bad Idea Indeed

Perl modules uploaded to CPAN generally have an entire fleet of testers on different operating systems building them and running their tests, as well as submitting bug reports and contributions for weird edge cases.

The only time you should NOT use an existing module is when you KNOW as a matter of FACT you can do a better job.

In such cases, it's recommended to write your own module and upload it to CPAN, so that you can benefit from the team of testers who will run it, find bugs, and report them.

If you're really lucky, they'll report bugs with working patches.

Answer 3

Myth - By definition, anything written in Perl is spaghetti code

I think that this is a type 1 Perl myth. The problem is that it's also self-fulfilling, in at least two or three ways.

If you already believe this, then the minute you see Perl code, with its sigils and special variables and dereferences, then - by god - it's spaghetti (not really, but you just can't be bothered to learn what you're looking at).
If you already believe this, then why bother writing clean code? You only use Perl when you want something quick and dirty. You just throw something together under #!/usr/bin/perl and - by god - it's spaghetti (for real now, because you didn't take the time to make it better).
The initial learning curve in Perl is a very mild one, but there's a steep climb up ahead, and many people stop early. It's easy to quickly learn enough to do a staggering amount: everything in, say, Learning Perl (i.e., no OO Perl, no references, no callbacks, no closures, nothing functional, minimal module use). However, this is really just a base. If you never learn more than this, then quite a lot of what you write may be spaghetti, since you will really have to twist the language to do larger things without references, modules, etc.
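
As a rough illustration of what lies beyond that base, here is a small sketch using references, a closure, and a callback. The data and the names (make_counter, %langs) are made up for the example.

```perl
use strict;
use warnings;

# A closure: each counter keeps its own private $count.
sub make_counter {
    my $count = shift // 0;
    return sub { return ++$count };
}

# A nested data structure built from references (a hash of array refs).
my %langs = (
    perl   => [qw(sigils regexes CPAN)],
    python => [qw(whitespace batteries)],
);

# A callback: sort takes a block that decides the ordering,
# here by number of listed features, largest first.
my @by_feature_count =
    sort { @{ $langs{$b} } <=> @{ $langs{$a} } } keys %langs;

my $next = make_counter(10);
$next->();                 # first call bumps the private count to 11
my $second = $next->();    # second call returns 12

for my $name (@by_feature_count) {
    printf "%s: %s\n", $name, join(', ', @{ $langs{$name} });
}
```

None of this is exotic, but without references and closures the same program tends to grow parallel arrays and global variables, which is where the spaghetti comes from.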

Answer 4

Myth - Perl Is Dead

If there are a thousand elves in the forest, and you're not looking, are they dead?

Here's a random sampling of the number of idlers in a given set of channels on irc.freenode.org

##javascript: 325
##php : 736
#perl : 517
#ruby : 289

A meaningless sample of course, but still, not something I'd expect from a "dead" community.

But wait, Perl has its own entire dedicated IRC server!

Yes, with several hundred channels.

And right at this moment there are at least 250 users there alone, though yet again, it is a rather meaningless statistic.

(#stackoverflow on freenode only has 41 users; obviously, Stack Overflow is dead.)

Answer 5

Myth - Perl is a shell scripting language

In terms of type 2 myths, one I bump into a lot is the (often unspoken) myth that Perl is really just Bash (Korn, Zsh, whatever) on steroids.

On Linux forums, I constantly see people using Perl to string together calls to awk, grep (not Perl's), sed, etc.

There are lots of problems here, but it ties back into some of the problems from the spaghetti myth.

If all you write are Perl scripts à la shell, then no wonder some people have such a low opinion of the language.

Again, it's fairly easy to write such a script, but it's an unholy, unportable mess and a waste of a very powerful, often elegant language.
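
For example, a script that shells out to grep and awk to list the services with errors in a log can usually be written with Perl's own built-ins. This is a minimal sketch: the log lines and the ERROR pattern are invented for illustration, and the pipeline in the comment is only the kind of thing being replaced.

```perl
use strict;
use warnings;

# Shell-style (spawns several processes and depends on external tools):
#   my $out = `grep ERROR app.log | awk '{print \$2}' | sort -u`;

# Sample input standing in for a log file (hypothetical data).
my @lines = (
    "2009-01-01 db ERROR connection lost",
    "2009-01-01 web INFO request served",
    "2009-01-02 db ERROR timeout",
    "2009-01-02 mail ERROR bounce",
);

# Native Perl: match, split, and de-duplicate in-process.
my %seen;
for my $line (@lines) {
    next unless $line =~ /\bERROR\b/;       # the grep step
    my $service = (split ' ', $line)[1];    # the awk '{print $2}' step
    $seen{$service}++;                      # collect unique values
}

my @services = sort keys %seen;             # the sort -u step
print scalar(@services), " services with errors: @services\n";
```

The native version has no quoting hazards, does not depend on which grep or awk happens to be installed, and leaves the data in a hash you can keep working with.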

Answer 6

The favorite one I saw (in an email sig, and only once so it doesn't really meet the "myths you keep seeing" criteria) was that Larry wrote Perl as a thin but incomprehensible layer on top of Python for people who wanted to guarantee their job security.

I know, I'm pretty certain Perl pre-dates Python as well, but it was pretty funny when I first saw it.

Evidence for Perl pre-dating Python:

Initial Check-in Message for Perl:

Author: Larry Wall 
Date:   Fri Dec 18 00:00:00 1987 +0000

    a "replacement" for awk and sed

    [  Perl is kind of designed to make awk and sed semi-obsolete.
       This posting will include the first 10 patches after the main
       source.  The following description is lifted from Larry's
       manpage. --r$  ]

       Perl is a interpreted language optimized for scanning
       arbitrary text files, extracting information from those text
       files, and printing reports based on that information.  It's
       also a good language for many system management tasks.  The
       language is intended to be practical (easy to use, efficient,
       complete) rather than beautiful (tiny, elegant, minimal).  It
       combines (in the author's opinion, anyway) some of the best
       features of C, sed, awk, and sh, so people familiar with those
       languages should have little difficulty with it.  (Language
       historians will also note some vestiges of csh, Pascal, and
       even BASIC-PLUS.) Expression syntax corresponds quite closely
       to C expression syntax.  If you have a problem that would
       ordinarily use sed or awk or sh, but it exceeds their
       capabilities or must run a little faster, and you don't want
       to write the silly thing in C, then perl may be for you.
       There are also translators to turn your sed and awk scripts
       into perl scripts.

Initial appearance of Python: 1991 [src] ^[1]

[1] http://en.wikipedia.org/wiki/Python%5F%28programming%5Flanguage%29

Answer 7

Myth: Perl is slow

You might think that as an "interpreted scripting language" Perl must be slow. In addition: perl provides easy access to hashes which are more flexible and convenient but much less efficient than structures in C or objects in C++.

The truth is: Perl is a very rich system. It has tools to process data very flexibly and conveniently. But it also has tools to process huge data sets as efficiently as C/C++; usually with much less programming effort (disclosure: right now, I do just this for a living).

First of all, the tools to learn about are Inline::C (which allows you to easily include C code in your Perl programs), PDL (conceived by some genius: process large data matrices with the efficiency of C but without writing C code), and PDL::IO::FastRaw to keep your data in memory-mapped files.

The next ingredient is to be realistic about trade-offs: usually, you trade one form of efficiency (CPU cycles/memory) for another (flexibility/convenience). Writing a faster program honestly means paying more attention to the details of how the computer will do the work. You won't find a low-level programming language that is 10 times faster than Perl at running the same program. Instead, it forces you to always make the "fast" choices: pay programming hours to gain a few processing seconds.

Perl offers you a much more intelligent option: Write most of the program in perl (with all the benefits of Perl/CPAN) and optimize just the bottlenecks --- if needed by converting them to C.
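
Before converting anything to C, it helps to measure where the time goes. The sketch below uses the core Benchmark module to compare two ways of summing a list. The functions sum_loop and sum_builtin are hypothetical stand-ins for a suspected bottleneck and its replacement, and List::Util's sum (normally implemented in C) is used here in place of a hand-written Inline::C routine so that the example runs without extra modules.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(sum);

my @data = (1 .. 100_000);

# Candidate bottleneck: an explicit Perl-level loop.
sub sum_loop {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# Replacement: the same work done inside a C-backed function.
sub sum_builtin {
    return sum(@_);
}

# Check that the two agree before comparing speed.
my $expected = sum_loop(@data);
die "results differ" unless sum_builtin(@data) == $expected;

# Run each version for at least one CPU second and print a comparison table.
cmpthese(-1, {
    loop    => sub { sum_loop(@data) },
    builtin => sub { sum_builtin(@data) },
});
```

The exact numbers depend on the machine and Perl version, so none are shown here. The point is the workflow: verify the fast version gives the same answer, then measure, and only then decide whether the rewrite is worth it.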

PDL is an even better choice (at least if you are a mathematician and understand how to make use of it).

In my work, I usually find that after optimizing a small proportion of the program, I honestly can't justify optimizations in the remaining program because it doesn't take noticeable time to run. But exactly this part of the program (configuration; input/output) would be a real pain in C.

Summary: Compared to C, Perl gives you a choice of which part of the program to optimize and which part to leave as-is. Compared to Java etc., Perl gives you access to both C performance (and C libraries) and CPAN libraries, and the interfaces you work with are much more convenient.

Disclosure: This approach involves some learning and attention to make the right decisions about what to optimize in your program, as opposed to blindly comparing benchmarks in two different languages. But as I said: I do this for a living, not for entertainment.

Answer 8

Myth: Perl Regular Expressions are PCRE

This is not directly Perl related, but lots and lots of requests for help are sent Perl's way because of the "Perl" in "Perl Compatible Regular Expressions".

This is a myth, because

PCRE is not Perl Compatible.
PCRE is hardly regular.

If you go and ask a Perl programmer, you will get Perl-specific answers, most of which will involve Perl-specific libraries. Perl programmers learnt long ago that regular expressions are sub-optimal in many, many cases, and they rely heavily on people who have already solved these problems and uploaded the solutions to the friendly CPAN library.

For example, here is a small sample of the reactions you will get when asking a Perl group for a regular expression.

Parse HTML with a regular expression

Don't parse HTML with regular expressions! See HTML::Parser, and its subclasses: HTML::TokeParser, HTML::TokeParser::Simple, HTML::TreeBuilder(::XPath)?, HTML::TableExtract, etc. See also http://htmlparsing.icenine.ca/ . If your response begins "that's overkill. i only want to..." you are wrong.

Parse CSV/xSV with a regular expression

use Text::CSV_XS, Text::xSV or DBD::CSV

Parse Email with a regular expression

Regexp::Common::Email::Address ^[1]
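
Taking the CSV case as an example of why these reactions are justified, consider a quoted field that contains a comma. The sketch below uses the core Text::ParseWords module only so that it runs without installing anything. It is a stand-in rather than a full CSV parser, and a dedicated module such as Text::CSV_XS remains the better choice for real data. The sample record is made up.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

my $record = 'Wall,"Programming Perl, 3rd ed.",2000';

# Naive approach: split on every comma, including the quoted one.
my @naive = split /,/, $record;

# Quote-aware approach: the comma inside the quotes stays in its field.
# The second argument (0) tells parse_line to strip the quotes.
my @fields = parse_line(',', 0, $record);

print scalar(@naive),  " naive fields\n";
print scalar(@fields), " parsed fields: ", join(' | ', @fields), "\n";
```

The naive split reports four fields for a three-column record. Embedded newlines, escaped quotes, and encodings add further cases, which is why the standard advice is to reach for a tested module.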

Asking Perl people for help with PCRE regular expressions makes about as much sense as asking Java people about JavaScript. (In case you didn't know, those aren't related except in name either.) ^[2]

Have a look at all the hairy options Perl supports

http://perldoc.perl.org/perlre.html

If you want these features, ( which we'll ultimately try helping you to use on your PCRE machine, only to discover it doesn't work because of PCRE ), you'll need ACTUAL Perl.

[1] http://p3rl.org/Regexp::Common::Email::Address
[2] http://stackoverflow.com/questions/245062/whats-the-difference-between-javascript-and-java

Answer 9

Myth: Perl is only for parsing text files

A lot of people think of Perl as a language only for parsing text files, or maybe for writing web backends. But via its modules, Perl is a lot more than that.

Perl is nowadays used for a variety of tasks, including complex desktop applications such as the Padre Perl IDE ^[1] (screenshots ^[2]), advanced web application frameworks (e.g. Catalyst ^[3]), bioinformatics (extensively; here's a list of Amazon books on Perl in bioinformatics ^[4]), content management systems (e.g. WebGUI ^[5]), and hierarchical wikis (MojoMojo ^[6]). It also powers very large websites (IMDB ^[7], Magazines.com ^[8], BBC, Amazon.com, LiveJournal, Ticketmaster, Craigslist, and many ^[9] others ^[10]).

[1] http://padre.perlide.org/
[2] http://padre.perlide.org/trac/wiki/Screenshots
[3] http://catalystframework.org
[4] http://www.amazon.com/s?url=search-alias%3Dstripbooks&field-keywords=perl+bioinformatics
[5] http://www.webgui.org/
[6] http://mojomojo.org
[7] http://www.imdb.com/help/show_leaf?jobatimdb
[8] http://www.appliedstacks.com/NewestFirst/Magazines
[9] http://use.perl.org/articles/08/12/22/0830205.shtml
[10] http://www.appliedstacks.com/NewestFirst/Perl

Answer 10

Myth: perl is fast

Some people seem to think that perl is actually good for writing efficient programs. I've found this to be largely untrue.

While it is possible to get decent performance out of some aspects of perl, and it's possible to make simple programs which are fairly efficient (e.g. read each line of a file and do some regexp matching), perl performance degrades quite badly when it's used for programming in the large.

The main problem I find is that perl copies a string whenever you do anything. It doesn't have copy-on-write strings, nor are strings immutable (à la C#, Python, Java); instead it copies a string whenever you could possibly want the new string to be different (e.g. $b = $a usually copies the string), even if it is never subsequently modified.

This creates a lot of CPU load and fragments the heap.

Moreover, the way that perl is usually used makes this worse. Perl objects use a lot of resources, and perl hashes store one copy of the hash key string for each hash it appears in, so if you have a million hashes with a key of "hello", you'll store a million "hello"s. This means that data structures use a lot of memory, typically too much for high performance applications.

I'm not claiming that Python is faster - it's very difficult to rewrite a nontrivial program in a different language to compare, and I don't have enough detailed Python knowledge to be able to compare the guts objectively.

In things such as "The great language shootout", Perl fares relatively well, this is because the programs were written optimally, and the