File Manipulation

  • Perl provides a large number of functions to perform various operations on files
  • These are very similar to the corresponding UNIX system call or command
  • See the UNIX man pages for details

File Test Operators

  • Operates on a filename or filehandle argument (except for -t which only operates on a filehandle argument)
  • Tests associated file to determine if something is true or not about the file
  • If the argument is omitted, $_ is tested (except for -t which tests STDIN)
  • Most of these operators return 1 for True and the empty string for False, or undef if the file does not exist (except for -s which returns the file size and -M, -A and -C which return the
    file age)
  • Precedence is higher than logical and relational operators, but lower than arithmetic operators
  • For superuser, -r, -R, -w and -W always return True and -x and -X return True if ANY execute bit is set

Example:

CODE:
  if (-e "/etc/passwd")                # Does it exist?
  {
    print ("Let's start hacking!\n");
  }
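The return values described above can be sketched as follows. This is a minimal illustration only: the scratch directory and the file name sample.txt are hypothetical, and File::Temp is a module from modern Perl 5 rather than anything in the original notes.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory that is removed automatically at exit (illustrative only).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.txt";          # hypothetical file name

open(my $fh, '>', $file) or die "Cannot create $file: $!";
print $fh "hello\n";
close($fh) or die "Cannot close $file: $!";

my $exists  = -e $file;                # true: the file exists
my $size    = -s $file;                # size in bytes, which is true when nonzero
my $is_dir  = -d $file;                # empty string: the file exists but is not a directory
my $missing = -e "$dir/no-such-file";  # undef: the file does not exist

print "exists\n"                   if $exists;
print "size is $size bytes\n"      if $size;
print "not a directory\n"          if defined($is_dir) && !$is_dir;
print "missing file gives undef\n" unless defined $missing;
```

Distinguishing the empty string from undef with defined() is what separates "exists but fails the test" from "does not exist".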

File Test Operator List

  • -r File is readable by effective uid
  • -w File is writable by effective uid
  • -x File is executable by effective uid
  • -o File is owned by effective uid
  • -R File is readable by real uid
  • -W File is writable by real uid
  • -X File is executable by real uid
  • -O File is owned by real uid
  • -e File exists
  • -z File exists and has zero size
  • -s File exists and has nonzero size (returns size in bytes)
  • -f File is a plain file
  • -d File is a directory
  • -l File is a symbolic link
  • -p File is a named pipe (FIFO)
  • -S File is a socket
  • -b File is a block special file
  • -c File is a character special file
  • -u File has its setuid bit set
  • -g File has its setgid bit set
  • -k File has its sticky bit set
  • -t Filehandle is a tty
  • -T File is a text file
  • -B File is a binary file
  • -M Modification age in days
  • -A Access age in days
  • -C Inode-modification age in days
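As a rough sketch of how several of these operators combine, the hypothetical helper classify() below (not a Perl builtin) reports what kind of file a path names. The scratch directory and file names are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# classify() is a hypothetical helper written for this sketch.
sub classify {
    my ($path) = @_;
    return 'symbolic link' if -l $path;      # test -l first; the other tests follow links
    return 'missing'       unless -e $path;  # this stat fills the buffer reused below
    return 'directory'     if -d _;
    return 'empty file'    if -f _ && -z _;
    return 'plain file'    if -f _;
    return 'other';                          # pipe, socket, device and so on
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/empty.txt";                 # hypothetical file name
open(my $fh, '>', $file) or die "Cannot create $file: $!";
close($fh) or die "Cannot close $file: $!";

my @results = map { classify($_) } ($dir, $file, "$dir/no-such-file");
print "$_\n" for @results;
```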

Stat Function

  • Returns a 13-element array of info on a file
  • stat (FILEHANDLE)
  • stat FILEHANDLE
  • stat (FILENAME)
  • Useful for file info which the file test operators do not provide (such as number of links) or for finding the true mode when superuser
  • Typical use:

CODE:
  ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size, $atime, $mtime, $ctime, $blksize, $blocks) = stat ($filename);

  $file = "toy1.c";
  ($uid, $gid) = (stat ($file)) [4,5];    # Parenthesize the call to slice its return list
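A minimal sketch of picking fields out of the stat list and of recovering the permission bits from the mode. The file is created in a scratch directory purely for illustration, and File::Temp is an assumption from modern Perl 5.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/toy1.c";
open(my $fh, '>', $file) or die "Cannot create $file: $!";
print $fh "int main(void) { return 0; }\n";
close($fh) or die "Cannot close $file: $!";

# stat returns an empty list on failure, so check before using the fields.
my @info = stat($file) or die "Cannot stat $file: $!";

# A slice picks out individual fields: 2 = mode, 3 = nlink, 7 = size.
my ($mode, $nlink, $size) = @info[2, 3, 7];

# The mode holds the file type as well as the permissions;
# masking with 07777 leaves only the permission bits.
printf "permissions: %04o\n", $mode & 07777;
print  "hard links:  $nlink\n";
print  "size:        $size bytes\n";
```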

Lstat Function

  • Same as the stat() function, but gives info on a symbolic link itself rather than on the file it points to
  • lstat (FILEHANDLE)
  • lstat FILEHANDLE
  • lstat (FILENAME)
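The difference between stat and lstat can be sketched by comparing inode numbers. This assumes a UNIX-like system where symlink() is available; the file names and the scratch directory are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/toy1.c";
my $link   = "$dir/toy2.c";

open(my $fh, '>', $target) or die "Cannot create $target: $!";
close($fh) or die "Cannot close $target: $!";
symlink($target, $link) or die "Cannot create symbolic link: $!";

my $target_ino = (stat($target))[1];   # inode of the real file
my $via_stat   = (stat($link))[1];     # stat follows the link
my $via_lstat  = (lstat($link))[1];    # lstat describes the link itself

print "stat follows the link to the target\n" if $via_stat  == $target_ino;
print "lstat reports the link's own inode\n"  if $via_lstat != $target_ino;
```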

The _ Filehandle

  • Whenever a file test operator, stat function or lstat function is used, Perl invokes the proper system call (stat(2) on UNIX) to get the required info
  • Doing a file test, stat or lstat on the special _ filehandle causes Perl to reuse the cached file info (stat buffer) from the previous file test, stat or lstat
  • Example:

CODE:
  if (-r $file && -w _)
  {
    print ("$file is both readable and writable\n");
  }

The above does only one invocation of stat(2), which is more efficient than the following, which causes two invocations of stat(2):

CODE:
  if (-r $file && -w $file)
  {
    print ("$file is both readable and writable\n");
  }
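The _ filehandle also works after an explicit stat call, not only after a file test. A small sketch, with an illustrative scratch file:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/toy1.c";
open(my $fh, '>', $file) or die "Cannot create $file: $!";
print $fh "/* toy */\n";
close($fh) or die "Cannot close $file: $!";

# One stat call fills the stat buffer ...
stat($file) or die "Cannot stat $file: $!";

# ... and each test on _ reads that buffer instead of calling stat(2) again.
my $is_plain = -f _;
my $is_dir   = -d _;
my $size     = -s _;

print "$file is a plain file of $size bytes\n" if $is_plain && !$is_dir;
```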

File Name Expansion (Globbing)

  • If the string inside angle brackets is NOT a filehandle, it is interpreted as a C-Shell filename expansion (globbing) pattern rather than as the input operator
  • All C-Shell globbing metacharacters are valid: *, ?, [], -, {}, ~
  • In an array context, the glob returns a list of all filenames that match (or an empty list if none match). In a scalar context, the next filename that matches is returned (or undef if there are no more matches). This is all similar to how the input operator with a filehandle works.
  • One level of scalar variable interpolation is done
  • But since <$x> indicates an indirect filehandle, use <${x}> for globbing
  • Example:

CODE:
  <*.c>                     # All files that end in .c
  <ch[1-3]>                 # Files ch1, ch2 and ch3

  $x = "*.c";
  <${x}>                    # All files that end in .c
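A runnable sketch of the patterns above, evaluated in list context. The files are created in a scratch directory for illustration, and the sketch assumes the directory path contains no whitespace or glob metacharacters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

my $dir = tempdir(CLEANUP => 1);
for my $name (qw(ch1 ch2 ch3 ch4 toy1.c toy2.c)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $dir/$name: $!";
    close($fh) or die "Cannot close $dir/$name: $!";
}

# $dir is interpolated into the pattern before the expansion is done.
my @c_files  = <$dir/*.c>;        # toy1.c and toy2.c
my @chapters = <$dir/ch[1-3]>;    # ch1, ch2 and ch3, but not ch4

# A pattern held entirely in a variable needs the ${...} form,
# because <$pattern> would be read as an indirect filehandle.
my $pattern = "$dir/*.c";
my @same    = <${pattern}>;

print join(' ', map { basename($_) } @c_files),  "\n";
print join(' ', map { basename($_) } @chapters), "\n";
```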

Unlink Function

  • Removes one or more files (actually deletes links)
  • unlink (LIST)
  • unlink LIST
  • Returns the number of files successfully deleted
  • On failure $! is set to the value of errno
  • Uses unlink(2)
  • Typical use:

CODE:
  $count = unlink ("toy1.c", "toy2.c", "toy3.c");

  $count = unlink (<*.c>);

  • Example:

CODE:
  #!/usr/bin/perl
  # Simple rm program

  foreach $file (@ARGV)
  {
    unlink ($file) || print ("Could not unlink $file: $!\n");
  }
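Because unlink returns a count, a call on a list reveals how many files were removed but not which ones failed. A small sketch, in which only two of the three named files actually exist (the names and the scratch directory are illustrative):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);
my @files = map { "$dir/toy$_.c" } 1 .. 3;

# Create only toy1.c and toy2.c; toy3.c is deliberately missing.
for my $path (@files[0, 1]) {
    open(my $fh, '>', $path) or die "Cannot create $path: $!";
    close($fh) or die "Cannot close $path: $!";
}

my $count = unlink(@files);

if ($count != @files) {
    print "Removed only $count of ", scalar(@files), " files\n";
}
```

To learn which file failed and why, unlink the files one at a time and check $! after each call, as the rm program above does.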

Rename Function

  • Renames a file
  • rename (OLDNAME, NEWNAME)
  • Returns 1 for success, 0 for failure
  • On failure $! is set to the value of errno
  • Similar to mv(1), but does NOT rename across filesystems and does not work if OLDNAME is a regular file and NEWNAME is an existing directory
  • Typical use:

CODE:
  $status = rename ("toy1.c", "toy2.c");
  $status = rename ("toy1.c", "toys/toy1.c");
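A sketch of moving a file into a directory with rename and of checking the result. Unlike mv(1), the new name must spell out the file name inside the directory. The paths are illustrative and live in a scratch directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $old    = "$dir/toy1.c";
my $subdir = "$dir/toys";

open(my $fh, '>', $old) or die "Cannot create $old: $!";
close($fh) or die "Cannot close $old: $!";
mkdir($subdir) or die "Cannot create $subdir: $!";

# Name the file inside the directory; passing just $subdir would not work.
my $new   = "$subdir/toy1.c";
my $moved = rename($old, $new);
print "Could not rename $old: $!\n" unless $moved;

# The old name no longer exists, so a second attempt fails and sets $!.
my $again = rename($old, $new);
print "Second rename failed: $!\n" unless $again;
```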

Link Function

  • Creates a new hard link for a file
  • link (OLDNAME, NEWNAME)
  • Returns 1 for success, 0 for failure
  • On failure $! is set to the value of errno
  • Uses link(2)
  • Typical use:

CODE:
  $status = link ("toy1.c", "toy2.c");
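A hard link is a second name for the same inode, which the following sketch checks with stat. It assumes a filesystem that supports hard links; the names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $old = "$dir/toy1.c";
my $new = "$dir/toy2.c";

open(my $fh, '>', $old) or die "Cannot create $old: $!";
close($fh) or die "Cannot close $old: $!";

link($old, $new) or die "Cannot link $old to $new: $!";

# Both names now refer to one inode, whose link count has gone up to 2.
my ($old_ino, $nlink) = (stat($old))[1, 3];
my ($new_ino)         = (stat($new))[1];

print "both names share one inode\n" if $old_ino == $new_ino;
print "link count is $nlink\n";
```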

Symlink Function

  • Creates a new symbolic (soft) link for a file
  • symlink (OLDNAME, NEWNAME)
  • Returns 1 for success, 0 for failure
  • On failure $! is set to the value of errno
  • Uses symlink(2)
  • Typical use:

CODE:
  $status = symlink ("toy1.c", "toy2.c");

Readlink Function

  • Reads the contents of a symbolic link file
  • readlink (FILENAME)
  • readlink FILENAME
  • Returns link contents on success, undef on failure
  • On failure $! is set to the value of errno
  • Uses readlink(2)
  • Uses $_ if FILENAME is omitted
  • Typical use:

CODE:
  $link = readlink ("toy2.c");
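A sketch tying symlink and readlink together: readlink returns exactly the text that symlink stored, and returns undef for a path that is not a symbolic link. This assumes a UNIX-like system; the names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/toy1.c";
my $link   = "$dir/toy2.c";

open(my $fh, '>', $target) or die "Cannot create $target: $!";
close($fh) or die "Cannot close $target: $!";

# The link stores the text "toy1.c"; a relative target like this one
# is resolved starting from the directory that holds the link.
symlink("toy1.c", $link) or die "Cannot create symbolic link: $!";

my $contents   = readlink($link);     # the stored text, "toy1.c"
my $not_a_link = readlink($target);   # undef: a plain file is not a link

print "$link points to $contents\n"           if defined $contents;
print "$target is not a symbolic link: $!\n"  unless defined $not_a_link;
```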

Chmod Function

  • Changes the mode (permissions) of a list of files
  • chmod (LIST)
  • chmod LIST
  • Returns the number of files successfully changed
  • On failure $! is set to the value of errno
  • Uses chmod(2)
  • The first element of the list must be the numerical mode
  • Typical use:

CODE:
  $count = chmod (0755, "toy1.c");
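The mode must be a number, and a literal with a leading zero such as 0755 is read as octal. A mode held in a string has to be converted with oct() first. A small sketch, where $mode_text is a hypothetical value such as one read from user input:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/toy1.c";
open(my $fh, '>', $file) or die "Cannot create $file: $!";
close($fh) or die "Cannot close $file: $!";

# Octal literal: the leading 0 matters (644 without it is a different number).
my $count = chmod(0644, $file);

# A string such as "0600" is not treated as octal automatically; convert it.
my $mode_text = "0600";               # hypothetical value, e.g. from user input
chmod(oct($mode_text), $file) or die "Cannot chmod $file: $!";

my $perms = (stat($file))[2] & 07777; # keep only the permission bits
printf "%s now has mode %04o\n", $file, $perms;
```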

Chown Function

  • Changes the owner and group of a list of files
  • chown (LIST)
  • chown LIST
  • Returns the number of files successfully changed
  • On failure $! is set to the value of errno
  • Uses chown(2)
  • The first two elements of the list must be the numerical uid and gid
  • Typical use:

CODE:
  $count = chown ($uid, $gid, <toy*.c>);
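Giving a file to another user normally requires superuser privileges, so the sketch below only re-applies the file's existing owner and group to show the calling convention. It assumes a UNIX-like system, and the scratch file is illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/toy1.c";
open(my $fh, '>', $file) or die "Cannot create $file: $!";
close($fh) or die "Cannot close $file: $!";

# Fields 4 and 5 of stat are the numerical uid and gid of the file.
my ($uid, $gid) = (stat($file))[4, 5];

# The uid and gid come first, followed by the list of files to change.
my $count = chown($uid, $gid, $file);
print "Changed $count file(s)\n";
print "chown failed: $!\n" unless $count;
```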

Utime Function

  • Changes the access (atime) and modification (mtime) times of a list of files
  • utime (LIST)
  • utime LIST
  • Returns the number of files successfully changed
  • On failure $! is set to the value of errno
  • Similar to touch(1)
  • The first two elements of the list must be the numerical access and modification times
  • The inode modification time (ctime) is set to the current time
  • Typical use:

CODE:
  $count = utime ($atime, $mtime, "toy1.c");
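The times are given in seconds since the epoch, the same units that time() and stat use. A sketch that backdates a scratch file by one day and reads the result back (the file name is illustrative):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/toy1.c";
open(my $fh, '>', $file) or die "Cannot create $file: $!";
close($fh) or die "Cannot close $file: $!";

# Seconds since the epoch, one day before now.
my $one_day_ago = time - 24 * 60 * 60;

# Access time first, then modification time, then the files.
my $count = utime($one_day_ago, $one_day_ago, $file);

# Fields 8 and 9 of stat are atime and mtime.
my ($atime, $mtime) = (stat($file))[8, 9];
print "mtime is now ", scalar(localtime($mtime)), "\n";
```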

Borrowed and reformatted from http://umbc7.umbc.edu/~tarr/perl/perl4/ch12-filemanip.html so I wouldn't lose it.