<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Innovations Technology Solutions &#124; Blog</title>
	<atom:link href="http://www.innovationsts.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://www.innovationsts.com/blog</link>
	<description></description>
	<lastBuildDate>Mon, 02 Jan 2012 16:21:42 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Make Self-Extracting Archives with makeself.sh</title>
		<link>http://www.innovationsts.com/blog/?p=3438</link>
		<comments>http://www.innovationsts.com/blog/?p=3438#comments</comments>
		<pubDate>Mon, 02 Jan 2012 16:21:42 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[How-Tos]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=3438</guid>
		<description><![CDATA[[...]]]></description>
			<content:encoded><![CDATA[<div style="font-family:arial,sans-serif;font-size:small;">
<br/>
<div style="position:relative;left:-8px;">
<video controls><br />
    <source src="/downloads/blog/makeself_post/makeself_Blog_Post.webm" type="video/webm"  />
    <source src="/downloads/blog/makeself_post/makeself_Blog_Post.ogv" type="video/ogg"  />    
<br />
<object id="flowplayer" width="600" height="450" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/makeself_post/makeself_Blog_Post.flv", "autoPlay":false}}' />             
</object>
</video>
</div>
<br />

<h3>Intro</h3>
<p class="narrow">
When making your custom scripts or software available to someone else, it&#8217;s a good idea to make that content as easy to extract and install as possible. You could just create a compressed archive, but then the end user has to manually extract the archive and decide where to place the files. Another option is creating packages (<code>.deb</code>, <code>.rpm</code>, etc) for the user to install, but then you&#8217;re more locked into a specific distribution. A solution that I like to use is to create a self-extracting archive file with the <code>makeself.sh</code> script. This type of archive can be treated as a shell script and will extract itself, running a scripted set of installation tasks when it&#8217;s executed. The reason this works is that the archive is essentially a binary payload with a script stub at the beginning.  This stub handles the archive verification and extraction process and then runs any predefined commands via a script specified at the time the archive is created. This model offers you a lot of flexibility, and can be used not only for installing scripts and software but also for things like documentation.
</p>
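<p class="narrow">
To make the "script stub plus binary payload" idea concrete, here is a minimal hand-rolled sketch of the same layout. It is not the real <code>makeself-header.sh</code>, and it has none of its checksum or option handling. The function name <code>build_sfx</code> and the marker <code>__PAYLOAD_BELOW__</code> are names I made up for illustration, and the sketch assumes a <code>tar</code> that understands the <code>-z</code> and <code>-C</code> options.
</p>

```shell
#!/bin/sh
# Sketch only: a hand-rolled self-extracting archive, NOT makeself's real header.
# build_sfx and __PAYLOAD_BELOW__ are hypothetical names used for illustration.

build_sfx() {
    src_dir=$1      # directory to package
    out=$2          # self-extracting file to create

    # 1. Write the script stub. The quoted 'EOF' keeps $0 and $1 unexpanded,
    #    so they are evaluated when the archive runs, not while it is built.
    cat > "$out" <<'EOF'
#!/bin/sh
# Find the first line after the marker; everything from there on is payload.
start=$(awk '/^__PAYLOAD_BELOW__$/ { print NR + 1; exit }' "$0")
target=${1:-.}
mkdir -p "$target" || exit 1
tail -n +"$start" "$0" | tar -xzf - -C "$target" || exit 1
exit 0
__PAYLOAD_BELOW__
EOF

    # 2. Append the compressed tar payload directly after the marker line.
    tar -czf - -C "$src_dir" . >> "$out" || return 1
    chmod u+x "$out"
}
```

<p class="narrow">
With that function loaded, <code>build_sfx myprogram demo.run</code> would create the archive and <code>sh demo.run /tmp/demo</code> would unpack it into <code>/tmp/demo</code>. The <code>exit 0</code> just before the marker is what stops the shell from trying to interpret the binary payload as commands.
</p>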

<h3>Installation</h3>
<p class="narrow">
The <code>makeself.sh</code> script is itself packaged as a self-extracting archive when you <a href="http://megastep.org/makeself/makeself.run">download it</a>. You can extract the script and its support files by running the <code>makeself.run</code> installer with a Bourne-compatible shell (<strong>Listing 1</strong>).
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
$ sh makeself.run
Creating directory makeself-2.1.5
Verifying archive integrity... All good.
Uncompressing Makeself 2.1.5........
Makeself has extracted itself.
$ ls makeself*
makeself.run

makeself-2.1.5:
COPYING  makeself.1  makeself-header.sh  makeself.lsm  makeself.sh  README  TODO
</code>
</pre>
</p>

<p class="narrow">
You can see from the output that I&#8217;m working with version 2.1.5 of <code>makeself.sh</code> for this post. To make things easier, you can install <code>makeself.sh</code> in your <code>~/bin</code> directory, and then make sure <code>$HOME/bin</code> is in your <code>PATH</code> environment variable. You need to ensure that <code>makeself.sh</code> and <code>makeself-header.sh</code> are in the directory together unless you&#8217;re going to specify the location of <code>makeself-header.sh</code> with the <code class="optionsonly">--header</code> option (<strong>Listing 3</strong>).
</p>
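<p class="narrow">
The install step described above can be sketched as a pair of small shell functions. The names <code>install_makeself</code> and <code>dir_in_path</code> are my own and are not part of makeself. The only thing the sketch takes from the text is that <code>makeself.sh</code> and <code>makeself-header.sh</code> should end up side by side in a directory that is in your <code>PATH</code>.
</p>

```shell
#!/bin/sh
# Hypothetical helpers for installing makeself by hand; names are illustrative.

# Copy makeself.sh and its header script into the same destination directory.
install_makeself() {
    src_dir=$1      # e.g. the extracted makeself-2.1.5 directory
    dest_dir=$2     # e.g. $HOME/bin

    mkdir -p "$dest_dir" || return 1
    # Both files must live together unless --header is used later.
    cp "$src_dir/makeself.sh" "$src_dir/makeself-header.sh" "$dest_dir/" || return 1
    chmod u+x "$dest_dir/makeself.sh"
}

# Succeed only if the given directory is already listed in PATH.
dir_in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

<p class="narrow">
A typical use would be <code>install_makeself makeself-2.1.5 "$HOME/bin"</code>, followed by <code>dir_in_path "$HOME/bin" || PATH="$HOME/bin:$PATH"</code> in your shell start-up file.
</p>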

<h3>General Usage</h3>
<p class="narrow">
<strong>Listing 2</strong> shows the usage syntax for <code>makeself.sh</code>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
makeself.sh [OPTIONS] archive_dir file_name label startup_script [SCRIPT_ARGS]
</code>
</pre>
</p>

<p class="narrow">
After the <code>OPTIONS</code>, you need to supply the path and name of the directory that you want to include in the archive. The next argument is the file name of the self-extracting archive that will be created. You can choose any name you want, but for consistency and clarity it&#8217;s recommended that the file have a <code>.run</code> or <code>.sh</code> file name extension. Next, you can specify a label that will act as a short description of the archive and will be displayed during extraction. The final argument to <code>makeself.sh</code> is the name of the script that you want to have run after extraction is complete. In turn, this script can have arguments passed to it that are represented by <code>[SCRIPT_ARGS]</code> in <strong>Listing 2</strong>. It&#8217;s important not to get the arguments to the startup script confused with the arguments to <code>makeself.sh</code>.
</p>
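<p class="narrow">
To show where <code>[SCRIPT_ARGS]</code> end up, here is a sketch of a startup script body that accepts one optional argument. The function name <code>install_to_dir</code> and the idea of passing an install directory are assumptions of mine for illustration. The only thing taken from the usage syntax is that anything after the startup script name is handed to that script.
</p>

```shell
#!/bin/sh
# Hypothetical startup script body that honors one optional SCRIPT_ARGS value.
# install_to_dir is an illustrative name, not part of makeself.

install_to_dir() {
    # The first script argument, if given, overrides the default of ~/bin.
    dest=${1:-$HOME/bin}

    if [ -d "$dest" ]
    then
        cp myprogram "$dest/"
    else
        echo "Skipping install: $dest does not exist" >&2
        return 1
    fi
}

# In a real startup script the last line would be:  install_to_dir "$@"
```

<p class="narrow">
With a script like this, a command such as <code>makeself.sh myprogram/ myprogram.run "The Installer For myprogram" ./post_extract.sh /opt/tools</code> would pass <code>/opt/tools</code> to the startup script, not to <code>makeself.sh</code>.
</p>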

<p class="narrow">
<strong>Listing 3</strong> shows some of the options for use with <code>makeself.sh</code>. You can find a comprehensive list on the <code>makeself.sh</code> <a href="http://megastep.org/makeself/" target="_blank">webpage</a>, but in my own experience I&#8217;m usually only concerned with the options listed here.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 3</strong></em></p>
<code>
<em>--gzip</em> : Use gzip for compression (default setting)
<em>--bzip2</em> : Use bzip2 for better compression. Use the '.bz2.run' file name extension to avoid
        confusion on the compression type.
<em>--header</em> : By default it's assumed that the "makeself-header.sh" header script is stored in the
        same location as makeself.sh. This option can be used to specify a different location
        if it's stored somewhere else.
<em>--nocomp</em> : Do not use any compression, which results in an uncompressed TAR file.
<em>--nomd5</em> : Disable the creation of an MD5 checksum for the archive which speeds up the
        extraction process if you don't need integrity checking.
<em>--nocrc</em> : Same as <em>--nomd5</em> but disables the CRC checksum instead.
</code>
</pre>
</p>
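<p class="narrow">
If you script your builds, the naming convention from <strong>Listing 3</strong> is easy to automate. The helper below is a small sketch with a made-up name (<code>archive_name</code>). It only encodes the <code>.bz2.run</code> suggestion for bzip2 and falls back to a plain <code>.run</code> extension for everything else.
</p>

```shell
#!/bin/sh
# archive_name is a hypothetical helper; it applies the ".bz2.run" convention
# for --bzip2 and uses a plain ".run" extension for any other option.
archive_name() {
    base=$1         # archive name without extension
    comp_opt=$2     # compression option that will be passed to makeself.sh

    case "$comp_opt" in
        --bzip2) echo "$base.bz2.run" ;;
        *)       echo "$base.run" ;;
    esac
}
```

<p class="narrow">
For example, <code>archive_name myprogram --bzip2</code> prints <code>myprogram.bz2.run</code>, which can then be used as the <code>file_name</code> argument.
</p>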

<p class="narrow">
In addition to the options passed to <code>makeself.sh</code> when creating the archive, there are options that you can pass to the archive itself to influence what happens during and after the extraction process. <strong>Listing 4</strong> shows some of these options, but again please have a look at the <code>makeself.sh</code> <a href="http://megastep.org/makeself/" target="_blank">webpage</a> for a full list.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 4</strong></em></p>
<code>
<em>--keep</em> : Do not automatically delete any files that were extracted to a temporary directory.
<em>--target DIR</em> : Set the directory (DIR) to extract the archive to.
<em>--info</em> : Print general information about the archive without extracting it.
<em>--list</em> : List the files in the archive.
<em>--check</em> : Check the archive for integrity.
<em>--noexec</em> : Do not run the embedded script after extraction.
</code>
</pre>
</p>

<h3>Example</h3>
<p class="narrow">
Let&#8217;s go through a practical example using some of the information above. If you had a directory named <code>myprogram</code> within your home directory and you wanted to package it, you could create the archive with the command line at the top of <strong>Listing 5</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 5</strong></em></p>
<code>
$ makeself.sh --bzip2 myprogram/ myprogram.bz2.run "The Installer For myprogram" ./post_extract.sh 
Header is 402 lines long

About to compress 20 KB of data...
Adding files to archive named "myprogram.bz2.run"...
./
./myprogram.c
./post_extract.sh
./myprogram
CRC: 955035546
MD5: 7b74c31f31589ee236dea535cbc11fe4

Self-extractible archive "myprogram.bz2.run" successfully created.
</code>
</pre>
</p>

<p class="narrow">
Notice that I used bzip2 compression via the <code class="optionsonly">--bzip2</code> option rather than using the default of gzip. I couple this with setting the file name extension to <code>.bz2.run</code> so that the end user will have a way of knowing that I used bzip2 compression. After the compression option, I pass an argument requesting that the <code>myprogram</code> directory, which contains a simple C program also called <code>myprogram</code>, be added to the archive. After the file name specification (with the <code>.bz2.run</code> extension), we come to the description label for the archive. This can be a string of your choosing and will be displayed with the output from the extraction process. The last argument is the &#8220;startup script&#8221; that will be run when the archive is extracted. <strong>Listing 6</strong> shows the contents of my simple startup script (<code>post_extract.sh</code>) that installs the <code>myprogram</code> binary in the user&#8217;s <code>bin</code> directory, but only if they have one.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 6</strong></em></p>
<code>
#!/bin/sh
   
#Install to ~/bin if it exists
if [ -d $HOME/bin ]
then
    cp myprogram $HOME/bin/
fi
</code>
</pre>
</p>

<p class="narrow">
Notice that when specifying the startup script, I used the path of <code>./</code> which points to the current directory. This is a reference to the directory after the extraction, not the directory where the script resides when you&#8217;re creating the archive. Your startup script should be inside the directory that you&#8217;re adding to the archive. One other thing to note about the startup script is that you will need to set its execute bit before creating the archive. Otherwise you&#8217;ll get a <code class="warnerror">Permission denied</code> error when the makeself-header script stub tries to execute the script.
</p>
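<p class="narrow">
Both of those requirements can be checked before the archive is built. The function below is a sketch with a name of my own choosing (<code>prepare_startup_script</code>). It is not something makeself provides. It simply confirms that the script sits inside the directory being archived and sets its execute bit.
</p>

```shell
#!/bin/sh
# prepare_startup_script is a hypothetical pre-flight helper, not a makeself feature.
prepare_startup_script() {
    archive_dir=$1      # directory that will be added to the archive
    script=$2           # path relative to archive_dir, e.g. ./post_extract.sh

    if [ ! -f "$archive_dir/$script" ]
    then
        echo "Startup script $script is not inside $archive_dir" >&2
        return 1
    fi

    # Set the execute bit so the stub can run the script after extraction.
    chmod +x "$archive_dir/$script"
}
```

<p class="narrow">
Running <code>prepare_startup_script myprogram ./post_extract.sh</code> before the <code>makeself.sh</code> command in <strong>Listing 5</strong> would catch a missing script early and avoid the <code class="warnerror">Permission denied</code> error.
</p>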

<p class="narrow">
Now we transition to the end user viewpoint, where the self-extracting archive has been downloaded and we&#8217;re getting ready to run it. You can set the execute bit of the archive and run it directly, or execute it with a Bourne-compatible shell the way the <code>makeself.run</code> installer was: <code>sh makeself.run</code>. Before we extract the archive though, let&#8217;s verify its integrity and have a look at the contents (<strong>Listing 7</strong>).
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 7</strong></em></p>
<code>
$ sh myprogram.bz2.run --check
Verifying archive integrity... MD5 checksums are OK. All good.
$
$ sh myprogram.bz2.run --list
Target directory: myprogram
drwxr-xr-x jwright/jwright   0 2011-12-20 13:49 ./
-rw-r--r-- jwright/jwright  66 2011-12-20 11:45 ./myprogram.c
-rw-r--r-- jwright/jwright  99 2011-12-20 11:49 ./post_extract.sh
-rwxr-xr-x jwright/jwright 7135 2011-12-20 11:45 ./myprogram
</code>
</pre>
</p>

<p class="narrow">
We can see from the first command that the archive is intact and that there are no errors. The second command shows us that the archive contains three files. The first is the source file <code>myprogram.c</code>, which I left in the archive directory so that I could have the option of giving the user the source code. The next file is the startup script that will be run after extraction. The last file, of course, is the binary that our end user wants to install. Let&#8217;s go ahead and install <code>myprogram</code> by using the execute bit on the archive (<strong>Listing 8</strong>).
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 8</strong></em></p>
<code>
$ chmod u+x myprogram.bz2.run
$ ./myprogram.bz2.run 
Verifying archive integrity... All good.
Uncompressing The Installer For myprogram....
</code>
</pre>
</p>

<p class="narrow">
Now to test that the installation worked, we can try to run <code>myprogram</code> (<strong>Listing 9</strong>).
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 9</strong></em></p>
<code>
$ myprogram
Hello world!
</code>
</pre>
</p>

<p class="narrow">
I can see that the program is present and did exactly what I expected it to do. Keep in mind that if <code>~/bin</code> is not in your <code>PATH</code> variable you&#8217;ll have to supply the full path to the <code>myprogram</code> binary.
</p>
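<p class="narrow">
The end user steps from <strong>Listing 7</strong> and <strong>Listing 8</strong> can be combined so that the archive only runs when its integrity check passes. The wrapper below is a sketch with a name I made up (<code>check_then_run</code>), and it assumes that the archive exits with a non-zero status when <code class="optionsonly">--check</code> fails.
</p>

```shell
#!/bin/sh
# check_then_run is a hypothetical wrapper around the --check option from Listing 4.
# Assumption: the archive returns a non-zero exit status when --check fails.
check_then_run() {
    archive=$1
    shift           # any remaining arguments are passed on to the archive

    # Refuse to run the archive if its integrity check fails.
    if sh "$archive" --check
    then
        sh "$archive" "$@"
    else
        echo "Integrity check failed for $archive; not running it" >&2
        return 1
    fi
}
```

<p class="narrow">
For example, <code>check_then_run myprogram.bz2.run</code> verifies and then installs, while <code>check_then_run myprogram.bz2.run --noexec --target /tmp/myprogram</code> verifies and then only extracts the files.
</p>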

<h3>Conclusion</h3>
<p class="narrow">
This has been a quick overview of what <code>makeself.sh</code> can do. I&#8217;ve found it to be a very useful script that is also very dependable and easy to use. Through the use of the startup script, along with the full complement of options, <code>makeself.sh</code> offers you a lot of flexibility when creating installers. You can create this type of self-extracting archive manually, but <code>makeself.sh</code> makes it much easier and adds great features like checksum validation.
</p>

<p class="narrow">
Please feel free to leave any comments or questions below, and have a look at <a href="http://www.innovationsts.com" target="_blank">innovationsts.com</a> for other projects, tips, how-tos, and service offerings available from Innovations Technology Solutions. Thanks for reading.
</p>

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
An audio transcript of this post has been provided. Click on a format below to listen in a new window or right click and save the audio to listen later.
</p>
<p>Get An Audio Transcript Of This Post<br />
<a href="/downloads/blog/makeself_post/makeself_Usage_Post.ogg" target="_blank">ogg</a> (4.3 MB) | <a href="/downloads/blog/makeself_post/makeself_Usage_Post.mp3" target="_blank">mp3</a> (6.9 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<ol>
<li><a name="res1" href="http://megastep.org/makeself/"  target="_blank">makeself.sh Homepage</a></li>
<li><a name="res2" href="https://github.com/megastep/makeself"  target="_blank">makeself.sh GitHub Page</a></li>
<li><a name="res3" href="http://www.linuxjournal.com/content/add-binary-payload-your-shell-scripts"  target="_blank">Linux Journal Article on How to Make Self-Extracting Archives Manually</a></li>
</ol>
</div>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=3438</wfw:commentRss>
		<slash:comments>4</slash:comments>
<enclosure url="http://www.innovationsts.com/downloads/blog/makeself_post/makeself_Blog_Post.flv" length="62733569" type="video/x-flv" />
		</item>
		<item>
		<title>Bodhi Linux On a Touchscreen Device</title>
		<link>http://www.innovationsts.com/blog/?p=2959</link>
		<comments>http://www.innovationsts.com/blog/?p=2959#comments</comments>
		<pubDate>Tue, 06 Dec 2011 01:02:28 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=2959</guid>
		<description><![CDATA[[...]]]></description>
			<content:encoded><![CDATA[<br/>
<div style="position:relative;left:-8px;">
<video controls><br />
    <source src="/downloads/blog/projects/Bodhi_Linux_On_A_Touchscreen.webm" type="video/webm"  />
    <source src="/downloads/blog/projects/Bodhi_Linux_On_A_Touchscreen.ogv" type="video/ogg"  />    
<br />
<object id="flowplayer" width="600" height="450" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/projects/Bodhi_Linux_On_A_Touchscreen.flv", "autoPlay":false}}' />             
</object>
</video>
</div>
<br />

<h3>Intro</h3>
<p class="narrow">
Welcome. In this blog post we&#8217;re going to set <a href="http://www.bodhilinux.com/" target="_blank">Bodhi Linux</a> up on a touchscreen device. Since the last post covered touchscreen calibration, I thought I would go one step beyond that by choosing and configuring a distribution to make the touchscreen easy to use (on-screen keyboard, finger scrolling, etc.). This post won&#8217;t be an exhaustive run-through of everything that you can do with Bodhi on a touchscreen system, but my hope is to give you a good start. Please feel free to talk about your own customizations and ways of doing things in the comment section. We&#8217;ll be focusing on desktop touchscreens and Intel-based tablets here, but Bodhi also has an ARM version that&#8217;s currently in alpha. The ARM version of Bodhi will officially support <a href="http://www.archos.com/products/ta/index.html?country=us&#038;lang=en" target="_blank">Archos Gen 8 tablets</a> initially, and then expand support out from there. I&#8217;m using Bodhi because it has a nice <a href="http://www.enlightenment.org" target="_blank">Enlightenment</a> <em>Tablet</em> profile that I think makes using a touchscreen system fairly natural and intuitive. You could of course also use another distro like Ubuntu (Unity) or Fedora (Gnome Shell) with your touchscreen but, as I mentioned, I&#8217;m partial to Bodhi for this use.
</p>

<h3>The Software</h3>
<p class="narrow">
For this post I installed Bodhi 1.2.0 (i386) and used <code>xinput-calibrator</code> as the touchscreen calibration utility. I wrote a Tech Tip on <code>xinput-calibrator</code> last month that you can find <a href="http://www.innovationsts.com/blog/?p=3040" target="_blank">here</a>. If your touchscreen doesn&#8217;t work correctly out of the box, I would suggest following the instructions in that blog post before moving on. If you&#8217;re new to Bodhi Linux, you might want to have a look at their <a href="http://www.bodhilinux.com/wiki/doku.php" target="_blank">wiki</a>. I&#8217;ve also found Lead Bodhi Developer Jeff Hoogland&#8217;s <a href="http://jeffhoogland.blogspot.com" target="_blank">blog</a> to be very informative, especially when I was setting Bodhi up for this post. Jeff and the other users on the Bodhi <a href="http://forums.bodhilinux.com" target="_blank">forum</a> are very nice and helpful if you want to ask questions too.
</p>

<h3>The Hardware</h3>
<p class="narrow">
 My test machine was an Intel-based Lenovo T60 laptop with an attached <a href="http://elotouch.com/Products/LCDs/1515L/default.asp" target="_blank">Elo Touchsystems 1515L</a> 15&#8243; Desktop Touchmonitor. Even if you&#8217;re working with Bodhi Linux on an ARM device though, you&#8217;ll still be able to take a lot of tips away from this post.
</p>

<h3>Installation</h3>
<p class="narrow">
I put a standard installation of Bodhi on the Lenovo T60 by simply following the on-screen instructions. Once I had it installed, I booted the system and ended up at the initial <em>Profile</em> selection screen.
</p>

<p class="narrow" style="text-align:center;">
<a href="/images/blog/projects/bodhi_tablet_profile_screen.png" target="_blank"><img src="/images/blog/projects/bodhi_tablet_profile_screen_thumbnail.png" width="550px"></img></a>
<br />
<strong>The Bodhi Linux Profile Selection Screen</strong>
</p>

<p class="narrow">
Since Bodhi uses Enlightenment for its desktop manager, this profile selection gives you an easy way to customize the Enlightenment UI for the way you&#8217;ll use it. In this case we&#8217;ll be interacting with Bodhi via a touchscreen, so we want to choose the <em>Tablet</em> profile. The next screen is theme selection, and for our purposes it doesn&#8217;t matter which theme you choose.
</p>

<p class="narrow">
Once you&#8217;ve chosen a theme you should be presented with the Bodhi tablet desktop. The first thing that I notice on my machine is that the Y-axis of the touchscreen is inverted. When I touch the bottom of the screen the cursor jumps to the top, and vice versa. In order to fix that we need to get the machine on a network so that we can download and install the screen calibration utility. Bodhi&#8217;s network manager applet is easy to find on the right hand side of the taskbar. After clicking on that and setting up my local wireless network, I&#8217;m ready to download and install my preferred screen calibration utility &#8211; <a href="http://www.freedesktop.org/wiki/Software/xinput_calibrator" target="_blank">xinput-calibrator</a>. As I mentioned, I wrote a <a href="http://www.innovationsts.com/blog/?p=3040" target="_blank">blog post</a> about <code>xinput-calibrator</code> last month.
</p>

<h3>Customization</h3>
<p class="narrow">
Now we can start on the customizations that will make our touchscreen system easier to use. The first thing that I did was install <a href="http://www.mozilla.org/en-US/firefox/new/" target="_blank">Firefox</a>. If you&#8217;re running on a lower power device you might want to stick with Midori, which is Bodhi&#8217;s default browser. If you use Firefox, there&#8217;s a nice add-on called <a href="https://addons.mozilla.org/en-US/firefox/addon/grab-and-drag/" target="_blank">Grab and Drag</a> that allows you to do drag and momentum scrolling. As you&#8217;ll see the first time you run it, Grab and Drag has quite a few settings and I think it&#8217;s worth the time to look through them. One other thing that I like to do with Firefox on a touchscreen device is hide the menu bar, but that&#8217;s just my personal preference.
</p>
<p class="narrow">
If you&#8217;re going to run Midori, you&#8217;re not out of luck on touch and drag scrolling. You can add the environment variable declaration <code>export MIDORI_TOUCHSCREEN=1</code> somewhere like <code>~/.profile</code> to enable touch scrolling. The drawback is that touch scrolling in Midori is not all that easy to use because it doesn&#8217;t distinguish between a touch to scroll, and a touch to drag an image or select text. I&#8217;ve also found that setting the MIDORI_TOUCHSCREEN variable on Bodhi 1.2.0 can be a little finicky, so if all else fails you can prepend <code>MIDORI_TOUCHSCREEN=1</code> to the command in the <code>Exec</code> line of Midori&#8217;s .desktop file. In version 1.2.0, a search for <code>midori.desktop</code> finds this file.
</p>
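<p class="narrow">
Both Midori approaches above can be scripted. The two functions below are a sketch with names of my own (<code>enable_midori_touch_profile</code> and <code>patch_midori_desktop</code>). One deviation from the wording above is labeled as my assumption: the <code>Exec</code> line is prefixed with <code>env MIDORI_TOUCHSCREEN=1</code> rather than the bare assignment, so that it still works with launchers that don&#8217;t pass the <code>Exec</code> line through a shell. Both functions are safe to run more than once.
</p>

```shell
#!/bin/sh
# Hypothetical helpers for enabling Midori touch scrolling; names are illustrative.

# Append the export line to a profile file unless it is already present.
enable_midori_touch_profile() {
    profile=$1      # e.g. $HOME/.profile
    line='export MIDORI_TOUCHSCREEN=1'

    grep -qxF "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}

# Prefix the Exec line of a .desktop file with the variable assignment.
# Assumption: "env VAR=1 command" is used instead of a bare "VAR=1 command".
patch_midori_desktop() {
    desktop=$1      # path to midori.desktop
    tmp_file="$desktop.tmp.$$"

    # Skip Exec lines that were already patched, rewrite the rest.
    sed '/^Exec=env MIDORI_TOUCHSCREEN=1 /!s/^Exec=/Exec=env MIDORI_TOUCHSCREEN=1 /' \
        "$desktop" > "$tmp_file" || return 1

    # Write back in place so the original file's permissions are kept.
    cat "$tmp_file" > "$desktop" && rm -f "$tmp_file"
}
```

<p class="narrow">
You would call these as <code>enable_midori_touch_profile "$HOME/.profile"</code> and <code>patch_midori_desktop /path/to/midori.desktop</code>, using whatever location your search for <code>midori.desktop</code> turns up. Editing a system-wide .desktop file will normally require root privileges.
</p>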

<p class="narrow">
<a href="http://xournal.sourceforge.net/" target="_blank">Xournal</a> is an application that allows you to write notes and sketch directly on the touchscreen. If you want to take notes on your touchscreen device, this is an application that you&#8217;ll want to check out. If you want to see Xournal in action, you can watch the videos below that have sections showing Jeff Hoogland using Xournal and Bodhi&#8217;s <em>Tablet</em> profile. In the videos you&#8217;ll see that Jeff uses his finger which worked okay for me, but to get nicer looking notes on the 1515L I had to switch to a stylus. If you want to install Xournal, just look for references to the <code>xournal</code> package in your package manager or download the latest version from the Xournal <a href="http://xournal.sourceforge.net/" target="_blank">website</a>. 
</p>

<p class="narrow">
Another customization that I make is to set the file manager up to respond to single clicks. Bodhi 1.2.0 uses <a href="http://pcmanfm.sourceforge.net/" target="_blank">PCManFM 0.9.9</a> as its default file manager, so to do this open it and click <em>Edit</em> -> <em>Preferences</em> in the menu. On the <em>General</em> tab make sure that the <em>Open files with single click</em> box is checked. Alternatively, you can use the less complete but more touch friendly <a href="http://trac.enlightenment.org/e/wiki/EFM" target="_blank">EFM</a> (Enlightenment File Manager). To use EFM, you&#8217;ll need to load the <em>EFM (Starter)</em> module under <em>Modules</em> -> <em>Files</em>. Once you&#8217;ve loaded the module, you can launch it by touching the <em>Bodhi</em> menu on the left hand side of the taskbar and then <em>Files</em> -> <em>Home</em>. The first time you use EFM you&#8217;ll need to add the navigation controls by right clicking on the toolbar, clicking <em>toolbar</em> -> <em>Set Toolbar Contents</em>, and then clicking on <em>EFM Navigation</em> followed by a click of the <em>Add Gadget</em> button. Please keep in mind that EFM is a work in progress, so it&#8217;s not feature-complete.
</p>

<p style="text-align:center">
<a href="/images/blog/projects/efm_screenshot.png" target="_blank"><img src="/images/blog/projects/efm_screenshot_thumbnail.png"  width="550px" /></a>
<br />
<strong>The Enlightenment File Manager (EFM)</strong>
</p>

<p class="narrow">
I&#8217;ve got PDF copies of two of the Linux magazines I normally read, so another addition I make is to install Acrobat Reader or an open source PDF reader. It&#8217;s best if you choose a reader with drag to scroll capability like <a href="http://get.adobe.com/reader/" target="_blank">Adobe Reader</a>. If you do use Adobe Reader, make sure that you have the <em>Hand tool</em> selected and use a continuous page view for the easiest scrolling.
</p>

<p class="narrow">
If you&#8217;re going to view images on your touchscreen system, you may want to install <a href="http://trac.enlightenment.org/e/wiki/Ephoto" target="_blank">Ephoto</a> which is a simple image viewer for Enlightenment. On a Bodhi/Ubuntu/Debian based system a search for the <code>ephoto</code> package should find what you need to install.
</p>

<p style="text-align:center">
<a href="/images/blog/projects/ephoto_screenshot.png" target="_blank"><img src="/images/blog/projects/ephoto_screenshot_thumbnail.png"  width="550px" /></a>
<br />
<strong>The Ephoto Image Viewer For Enlightenment</strong>
</p>

<h3>General Usage</h3>
<p class="narrow">
Below are a few tips for when you&#8217;re using your newly set up touchscreen system. So that you can see what&#8217;s possible when running Bodhi&#8217;s <em>Tablet</em> profile, I&#8217;ve included the two embedded videos below from <a href="http://www.youtube.com/user/Jeffhoogland" target="_blank">Jeff Hoogland</a>.
</p>

<p class="narrow">
<ul>
	<li>There is an applications menu button on the right side of the quick launch bar (bottom of the screen). Clicking this button will bring up a set of <em>Applications</em> along with Enlightenment <em>Widgets</em>, and Bodhi 1.2.0 seems to have a placeholder for a <em>Config</em> subset. There is also a more traditional applications menu button on the left end of the taskbar.</li>
	<li>You can touch and hold down on an icon (launcher) in the applications menu until it lets you drag it. You can then drag the launcher to the desktop or the quick launch bar.</li>
	<li>If you touch and hold the desktop, its icons and the icons in the quick launch bar will start to swing and will have red X&#8217;s beside them. If you click on one of the red X&#8217;s you&#8217;ll remove that launcher. Click on the big red X in the lower right-hand corner of the screen to exit this mode.</li>
	<li>To change to another workspace, simply drag your finger from right to left across the screen. There is a set of dots just above the quick launch bar that shows you which workspace you&#8217;re in. Each of the workspace desktops can be customized with their own set of icons, but the taskbar and quick launch bar stay the same.</li>
	<li>You can touch the <em>Scale</em> windows button on the left of the task bar to get a composited window list. Once you have this list, you can close windows simply by touching and dragging them off the screen.</li>
</ul>
</p>
<p style="text-align:center">
<a href="/images/blog/projects/task_bar_scale_btn.png"><img src="/images/blog/projects/task_bar_scale_btn_thumbnail.png" width="550px"></img></a>
<br />
<strong>The Scale Windows Button On The Tablet Profile Taskbar</strong>
</p>

<h3>Bodhi Linux Tablet Usage Videos</h3>
<p style="text-align:center;">
<!--Begin Jeff Hoogland's Dell Duo Video-->
<object width="600" height="320"><param name="movie" value="http://www.youtube.com/v/7qMTCXPybH4&#038;hl=en_US&#038;feature=player_embedded&#038;version=3"></param><param name="allowFullScreen" value="true"></param><param name="allowScriptAccess" value="always"></param><embed src="http://www.youtube.com/v/7qMTCXPybH4&#038;hl=en_US&#038;feature=player_embedded&#038;version=3" type="application/x-shockwave-flash" allowfullscreen="true" allowScriptAccess="always" width="600" height="320"></embed></object>
<!--End Jeff Hooglands Dell Duo Video-->
<strong>Jeff Hoogland Showing Bodhi Linux On A Dell Duo</strong>
<br />
<br />
<!--Start Jeff Hoogland Demonstrating Bodhi Linux ARM-->
<iframe width="600" height="320" src="http://www.youtube.com/embed/7jJNRm3RjTA" frameborder="0" allowfullscreen></iframe>
<!--End Jeff Hoogland Demonstrating Bodhi Linux ARM-->
<strong>Jeff Hoogland Demonstrating Bodhi Linux On An ARM Device</strong>
</p>

<h3>Possible Issues</h3>
<p class="narrow">
Below is a list of things that might cause you some trouble and/or confusion.
<br />
<ul>
	<li>In my experience when the GUI asked for an administrator password, I couldn&#8217;t enter it because the dialog was modal and didn&#8217;t allow me to get to the on-screen keyboard button. A good example of this happens when I try to launch the <em>Synaptic Package Manager</em>.</li>
	<li>If you have trouble closing a window with the Bodhi close button (far right side of the taskbar), try touching the window first to make sure it&#8217;s in focus.</li>
	<li>The on-screen keyboard is not context sensitive and does not do auto-completion. I wasn&#8217;t personally bothered by this, but some avid users of other tablet and smartphone platforms might be.</li>
	<li>Support for screen rotation (from portrait to landscape) will be hit and miss, and depends almost exclusively on community support. Unfortunately, many devices have closed specs so reverse engineering becomes the only solution.</li>
</ul>
</p>

<h3>Conclusion</h3>
<p class="narrow">
That concludes this quick Project. Please feel free to leave any comments or questions below. Before signing off, I&#8217;d like to thank Jeff Hoogland for being so helpful in answering my questions while I was writing this post. A great community has gathered around Bodhi, and I&#8217;m looking forward to seeing where Jeff and his team take the distro in the future. If you haven&#8217;t tried Bodhi yet, I highly encourage you to head over to their <a href="http://www.bodhilinux.org" target="_blank">website</a> and have a look. Also, have a look at innovationsts.com for other projects, tips, how-tos, and service offerings available from Innovations Technology Solutions. Thanks for reading.
</p>

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
An audio transcript of this post has been provided. Click on a format below to listen in a new window or right click and save the audio to listen later.
</p>
<p>Get An Audio Transcript Of This Post<br />
<a href="/downloads/blog/projects/Bodhi_Linux_On_A_Touchscreen.ogg" target="_blank">ogg</a> (6.3 MB) | <a href="/downloads/blog/projects/Bodhi_Linux_On_A_Touchscreen.mp3" target="_blank">mp3</a> (10.0 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<ol>
<li><a name="res1" href="http://jeffhoogland.blogspot.com/2011/07/bodhi-linux-for-arm-alpha-1.html"  target="_blank">Bodhi Linux for ARM Alpha 1 &#8211; Jeff Hoogland</a></li>
<li><a name="res2" href="http://www.bodhilinux.com/forums/index.php?/forum/24-arm/"  target="_blank">ARM Section of Bodhi Linux Forum</a></li>
<li><a name="res3" href="http://www.bodhilinux.com/forums/index.php?/topic/1507-arm-version-of-bodhi/"  target="_blank">Bodhi Linux Forum &#8211; Arm Version of Bodhi Discussion</a></li>
<li><a name="res4" href="http://jeffhoogland.blogspot.com/2011/08/howto-linux-on-dell-inspiron-duo.html"  target="_blank">HOWTO: Linux on the Dell Inspiron Duo</a></li>
<li><a name="res5" href="http://www.bodhilinux.com"  target="_blank">Bodhi Linux Website</a></li>
<li><a name="res6" href="http://jeffhoogland.blogspot.com/"  target="_blank">Lead Bodhi Developer Jeff Hoogland&#8217;s Blog</a></li>
<li><a name="res7" href="http://www.freedesktop.org/wiki/Software/xinput_calibrator"  target="_blank">xinput-calibrator freedesktop.org Page</a></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=2959</wfw:commentRss>
		<slash:comments>8</slash:comments>
<enclosure url="http://www.innovationsts.com/downloads/blog/projects/Bodhi_Linux_On_A_Touchscreen.flv" length="52102309" type="video/x-flv" />
		</item>
		<item>
		<title>Tech Tip &#8211; Touchscreen Calibration In Linux</title>
		<link>http://www.innovationsts.com/blog/?p=3040</link>
		<comments>http://www.innovationsts.com/blog/?p=3040#comments</comments>
		<pubDate>Mon, 03 Oct 2011 15:05:44 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=3040</guid>
		<description><![CDATA[Intro

Welcome, this [...]]]></description>
			<content:encoded><![CDATA[<br/>
<div style="position:relative;left:-8px;">
<video controls><br />
    <source src="/downloads/blog/tech_tip_video/Touchscreen_Tech_Tip.webm" type="video/webm"  />
    <source src="/downloads/blog/tech_tip_video/Touchscreen_Tech_Tip.ogv" type="video/ogg"  />
<br />
<object id="flowplayer" width="600" height="450" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/tech_tip_video/Touchscreen_Tech_Tip.flv", "autoPlay":false}}' />             
</object>
</video>
</div>
<br />

<h3>Intro</h3>
<p class="narrow">
Welcome, this is an Innovations Tech Tip. I recently did some work with an <a href="http://elotouch.com/Products/LCDs/1515L/default.asp" target="_blank">ELO Touchsystems 1515L</a> 15&#8243; LCD Desktop Touchmonitor. I was pleased with the touchmonitor&#8217;s hardware and performance, but in order to make it work properly in Linux I had to find a suitable calibration program. Out of the box on several distributions this touchscreen exhibits Y-axis inversion, where touching the top of the screen moves the cursor to the bottom and vice versa. <a href="http://www.freedesktop.org/wiki/Software/xinput_calibrator" target="_blank">xinput-calibrator</a> is a <a href="http://www.freedesktop.org" target="_blank">freedesktop.org</a> project that worked well for calibration, fixing the Y-axis inversion issue, and as a bonus it works for any standard <a href="http://www.x.org/wiki/" target="_blank">Xorg</a> touchscreen driver.
</p>

<h3>The Software</h3>
<p class="narrow">
For this post I tested on Bodhi Linux 1.2.0 (based on Ubuntu 10.04 LTS), Fedora 15, and Ubuntu 11.04. <code>xinput-calibrator</code>, as I mentioned, was the screen calibration utility.
</p>

<h3>The Hardware</h3>
<p class="narrow">
 My test machine was an Intel-based Lenovo T60 laptop with an attached <a href="http://elotouch.com/Products/LCDs/1515L/default.asp" target="_blank">ELO Touchsystems 1515L</a> 15&#8243; LCD Desktop Touchmonitor.
</p>

<h3>Installation</h3>
<p class="narrow">
Click <a href="http://www.freedesktop.org/wiki/Software/xinput_calibrator" target="_blank">here</a> to go to <code>xinput-calibrator</code>&#8217;s website and choose your package. Be aware that if you&#8217;re using the ARM version of Bodhi (in alpha at the time of this writing) it&#8217;s based on Debian, so you&#8217;ll want to grab the <a href="http://github.com/downloads/tias/xinput_calibrator/xinput-calibrator_0.7.5-1_i386.deb" target="_blank">Debian testing package</a>. You can also add a <a href="https://launchpad.net/~tias/+archive/xinput-calibrator-ppa" target="_blank">PPA</a> if you&#8217;re running Ubuntu, but I had trouble getting that to work during my tests. Last but not least, you can grab the source and compile it yourself by downloading the <a href="http://github.com/downloads/tias/xinput_calibrator/xinput_calibrator-0.7.5.tar.gz" target="_blank">tarball</a> or using <a href="https://github.com/tias/xinput_calibrator" target="_blank">git</a>.
</p>

<p class="narrow">
Before you actually install <code>xinput-calibrator</code> on a freshly installed Debian-based system (including Ubuntu and Bodhi), make sure to update your package management system&#8217;s index or you&#8217;ll get failed dependencies. This is because the package management system doesn&#8217;t yet know what packages are available in the repositories. This isn&#8217;t a problem with Fedora since YUM refreshes its package metadata automatically when you use it. Once you&#8217;ve ensured that the system is or will be updated, you&#8217;ll be ready to install <code>xinput-calibrator</code> via the package that you downloaded or the PPA.
</p>
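
<p class="narrow">
As a rough sketch of that order of operations on a Debian-based system, the script below refreshes the package index, installs the downloaded package, and then asks <code>apt-get</code> to fix up any dependencies. The <code>run</code> and <code>install_deb</code> helpers and the <code>DRY_RUN</code> variable are names I made up for illustration, and the file name assumes that you downloaded the Debian testing package linked above. By default the commands are only printed.
</p>

```shell
#!/bin/sh
# Sketch: refresh the package index before installing a downloaded .deb.
# DRY_RUN is a hypothetical switch; it defaults to 1 so that the commands
# are only printed and nothing is changed on the system.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# $1 is the path to the downloaded package file.
install_deb() {
    run sudo apt-get update        # refresh the list of available packages
    run sudo dpkg -i "$1"          # install the downloaded package
    run sudo apt-get -f install    # fetch any dependencies dpkg could not satisfy
}

install_deb xinput-calibrator_0.7.5-1_i386.deb
```

<p class="narrow">
Setting <code>DRY_RUN=0</code> would run the commands for real, so read through what gets printed first.
</p>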

<h3>Calibration</h3>
<p class="narrow">
Once <code>xinput-calibrator</code> is installed, it should show up in your application menu(s). Look for an item labeled &#8220;Calibrate Touchscreen&#8221;. If you don&#8217;t see it anywhere, you can launch it from the terminal with the <code>xinput_calibrator</code> (note the underscore) command.
</p>

<p class="narrow">
<a href="/images/blog/tech_tips/xinput_calibrator.png" target="_blank"><img id="screenshot" src="/images/blog/tech_tips/xinput_calibrator.png" width="560px" height="428px" alt="xinput-calibrator screenshot" /></a>
<label for="screenshot" style="margin-left:10px;"><strong>Figure 1</strong> &#8211; xinput_calibrator screenshot</label>
</p>

<h3>Using It</h3>
<p class="narrow">
Using <code>xinput-calibrator</code> is very simple. You&#8217;re presented with a full-screen application that asks you to touch a series of four points. The instructions say that you can use a stylus to increase precision, but I find that using my finger works well for the ELO touchscreen. One of the nice features of <code>xinput-calibrator</code> is that it&#8217;s smart enough to know when it encounters an inverted axis. After I run through the calibration the Y-axis inversion problem is fixed, so I&#8217;m ready to start using the touchscreen.
</p>

<h3>Persistent Calibration</h3>
<p class="narrow">
You&#8217;ll probably want your calibration to persist across reboots, so you&#8217;ll need to do a little more work now to make the settings permanent. First you&#8217;ll need to run the <code>xinput_calibrator</code> command from the terminal and then perform the calibration.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
$ xinput_calibrator
Calibrating EVDEV driver for "EloTouchSystems,Inc Elo TouchSystems 2216 AccuTouch® USB Touchmonitor Interface" id=9
    current calibration values (from XInput): min_x=527, max_x=3579 and min_y=3478, max_y=603

Doing dynamic recalibration:
    Setting new calibration data: 527, 3577, 3465, 600


--> Making the calibration permanent <--
  copy the snippet below into '/etc/X11/xorg.conf.d/99-calibration.conf'
Section "InputClass"
    Identifier    "calibration"
    MatchProduct    "EloTouchSystems,Inc Elo TouchSystems 2216 AccuTouch® USB Touchmonitor Interface"
    Option    "Calibration"    "527 3577 3465 600"
EndSection
</code>
</pre>
</p>

<p class="narrow">
Toward the bottom of the output you can see instructions for "Making the calibration permanent". This section will vary depending on what <code>xinput_calibrator</code> detects about your system. In my case under Ubuntu the output was an xorg.conf.d snippet, which I then copied into the xorg.conf.d directory on my distribution. Be aware that even though the output says that <code>xorg.conf.d</code> should be located in <code>/etc/X11</code>, it might actually be located somewhere else like <code>/usr/share/X11</code> on your distribution. Once you've found the <code>xorg.conf.d</code> directory you can use your favorite text editor (with root privileges) to create the <code>99-calibration.conf</code> file inside of it. Now when you reboot, you should see that your calibration has stayed in effect.
</p>
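
<p class="narrow">
If you&#8217;re not sure where <code>xorg.conf.d</code> lives on your distribution, a small check like the one below can help. It&#8217;s only a sketch: the <code>first_existing_dir</code> function is a name I chose for illustration, and it only looks in the two locations mentioned above, so your distribution may need other candidates added.
</p>

```shell
#!/bin/sh
# Print the first directory from the argument list that actually exists.
# Returns non-zero if none of the candidates exist.
first_existing_dir() {
    for candidate in "$@"; do
        if [ -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# The two locations mentioned in this post; other distributions may differ.
if conf_dir="$(first_existing_dir /etc/X11/xorg.conf.d /usr/share/X11/xorg.conf.d)"; then
    echo "Create 99-calibration.conf in: $conf_dir"
else
    echo "No xorg.conf.d directory found in the usual places" >&2
fi
```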

<p class="narrow">
If you have a reason to avoid using an <code>xorg.conf.d</code> file to store your calibrations, you can run <code>xinput_calibrator</code> with the <code class="optionsonly">--output-type xinput</code> option/argument combo.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
$ xinput_calibrator --output-type xinput
Calibrating EVDEV driver for "EloTouchSystems,Inc Elo TouchSystems 2216 AccuTouch® USB Touchmonitor Interface" id=9
    current calibration values (from XInput): min_x=184, max_x=3932 and min_y=184, max_y=3932

Doing dynamic recalibration:
    Setting new calibration data: 524, 3581, 3482, 591


--> Making the calibration permanent <--
  Install the 'xinput' tool and copy the command(s) below in a script that starts with your X session
    xinput set-int-prop "EloTouchSystems,Inc Elo TouchSystems 2216 AccuTouch® USB Touchmonitor Interface" "Evdev Axis Calibration" 32 524 3581 3482 591
</code>
</pre>
</p>

<p class="narrow">
At the bottom of this output you can see that there are instructions for using xinput to make your calibration persistent. If it's not already present, you'll need to install <code>xinput</code> and then copy the command line in <code>xinput_calibrator</code>'s instructions into a script that starts with your X session. You can usually also add it to your desktop manager's startup programs via something like <code>gnome-session-properties</code> if you would prefer.
</p>
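
<p class="narrow">
Here is a minimal sketch of what such a startup script might look like. The device name and calibration values are copied from Listing 2 and will almost certainly differ on your system, and <code>apply_calibration</code> is just a helper name that I picked. The script skips the calibration with a message if <code>xinput</code> isn&#8217;t installed.
</p>

```shell
#!/bin/sh
# Sketch of a script that could be started along with the X session.
# DEVICE and CALIBRATION are copied from Listing 2; replace them with the
# values that xinput_calibrator prints on your own system.
DEVICE='EloTouchSystems,Inc Elo TouchSystems 2216 AccuTouch® USB Touchmonitor Interface'
CALIBRATION='524 3581 3482 591'

# $1 is the device name, $2 is the four calibration values separated by spaces.
apply_calibration() {
    if ! command -v xinput >/dev/null 2>&1; then
        echo "xinput is not installed; skipping calibration" >&2
        return 1
    fi
    # $2 is intentionally unquoted so that it expands to four separate arguments
    xinput set-int-prop "$1" "Evdev Axis Calibration" 32 $2
}

# Don't abort the rest of the session startup if the calibration fails.
apply_calibration "$DEVICE" "$CALIBRATION" || true
```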

<p class="narrow">
Another option that might be of use to you is <code class="optionsonly">-v</code>. The <code class="optionsonly">-v</code> (<code class="optionsonly">--verbose</code>) option displays extra output so that you can see more of what's going on behind the scenes. If you have any trouble getting your calibration to work, this would be a good place to start.
</p>

<p class="narrow">
Your output will probably vary from what I have here depending on what type of hardware you have and which distribution you run. For instance, on Fedora 15 I get the <code>xinput</code> instructions by default instead of an <code>xorg.conf.d</code> snippet. Make sure that you run the above commands yourself, and don't copy the output from my listings.
</p>

<p class="narrow">
If you have a desire or need to redo the calibration periodically, you might want to consider creating a wrapper script to automate the process of making the calibration permanent. Such a script might use <code>sed</code> to strip out the relevant code and then a simple <code>echo</code> statement to dump it into the correct <code>xorg.conf.d</code> file or startup script.
</p>
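
<p class="narrow">
To give you an idea of the <code>sed</code> part of such a wrapper, the sketch below pulls the <code>Section</code> through <code>EndSection</code> block out of the calibrator&#8217;s output. It assumes that the output looks like Listing 1, and it uses a copy of that listing as sample input so that it can be tried without a touchscreen attached. The <code>extract_snippet</code> and <code>sample_output</code> names are mine.
</p>

```shell
#!/bin/sh
# Read xinput_calibrator output on stdin and print only the xorg.conf.d
# snippet, from the Section line through the EndSection line.
extract_snippet() {
    sed -n '/^[[:space:]]*Section "InputClass"/,/^[[:space:]]*EndSection/p'
}

# Sample input copied from Listing 1. A real wrapper would pipe in the
# output of xinput_calibrator instead.
sample_output() {
    cat <<'EOF'
--> Making the calibration permanent <--
  copy the snippet below into '/etc/X11/xorg.conf.d/99-calibration.conf'
Section "InputClass"
    Identifier    "calibration"
    MatchProduct    "EloTouchSystems,Inc Elo TouchSystems 2216 AccuTouch® USB Touchmonitor Interface"
    Option    "Calibration"    "527 3577 3465 600"
EndSection
EOF
}

sample_output | extract_snippet
```

<p class="narrow">
In a real wrapper you would replace <code>sample_output</code> with <code>xinput_calibrator</code> and write the result into your <code>99-calibration.conf</code> file with root privileges.
</p>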

<h3>Wrapping Up</h3>
<p class="narrow">
That concludes this Tech Tip. Have a look at <a href="http://www.innovationsts.com" target="_blank">innovationsts.com</a> for other tips, projects, how-tos, and service offerings available from Innovations Technology Solutions. Thanks, and stay tuned for more from Innovations.
</p>

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
An audio transcript of this post has been provided. Click on a format below to listen in a new window or right click and save the audio to listen later.
</p>
<p>Get An Audio Transcript Of This Post<br />
<a href="/downloads/blog/tech_tip_audio/touchscreen_calibration/Touchscreen_Tech_Tip_10_03_11.ogg" target="_blank">ogg</a> (16.1 MB) | <a href="/downloads/blog/tech_tip_audio/touchscreen_calibration/Touchscreen_Tech_Tip_10_03_11.mp3" target="_blank">mp3</a> (2.9 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<ol>
<li><a name="res1" href="http://www.freedesktop.org/wiki/Software/xinput_calibrator"  target="_blank">xinput-calibrator Page</a></li>
<li><a name="res2" href="https://github.com/tias/xinput_calibrator"  target="_blank">xinput-calibrator On Github</a></li>
<li><a name="res4" href="http://www.freedesktop.org"  target="_blank">freedesktop.org Page</a></li>
<li><a name="res5" href="http://www.bodhilinux.com"  target="_blank">Bodhi Linux Website</a></li>
<li><a name="res6" href="http://www.ubuntu.com"  target="_blank">Ubuntu Linux Website</a></li>
<li><a name="res7" href="http://www.fedoraproject.org"  target="_blank">Fedora Linux Website</a></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=3040</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Video Tip &#8211; Finding Open IP Addresses</title>
		<link>http://www.innovationsts.com/blog/?p=2867</link>
		<comments>http://www.innovationsts.com/blog/?p=2867#comments</comments>
		<pubDate>Mon, 29 Aug 2011 14:03:35 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[How-Tos]]></category>
		<category><![CDATA[System Administration]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=2867</guid>
		<description><![CDATA[[...]]]></description>
			<content:encoded><![CDATA[<br/>
<div style="position:relative;left:-8px;">
<video controls><br />
    <source src="/downloads/blog/vidtips/tech_tip_2_video.webm" type="video/webm"  />
    <source src="/downloads/blog/vidtips/tech_tip_2_video.ogv" type="video/ogg"  />
    <source src="/downloads/blog/vidtips/tech_tip_2_video.mp4" type="video/mp4"  />
<br />
<object id="flowplayer" width="600" height="450" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/vidtips/tech_tip_2_video.flv", "autoPlay":false}}' />             
</object>
</video>
</div>
<br />

<h3>Intro</h3>
<p class="narrow">
Welcome, this is an Innovations Tech Tip. In this tip we&#8217;re going to explore a couple of ways to find open IP (Internet Protocol) addresses on your network. You might need this information if you were going to temporarily set a static IP address for a host. Even after you&#8217;ve found an open IP though, you still need to take care to avoid IP conflicts if your network uses DHCP (Dynamic Host Configuration Protocol). Please also be aware that one of these techniques uses the <code>nmap</code> network scanning program, which may be against policy in some environments. Even if it&#8217;s not against corporate policy, the <code>nmap</code> man page states that &#8220;there are administrators who become upset and may complain when their system is scanned. Thus, it is often advisable to request permission before doing even a light scan of a network.&#8221;<span class="superscript"><a href="#res2">2</a></span>
</p>

<h3>arping</h3>
<p class="narrow">
The first technique that we&#8217;re going to cover is the use of the <code>arping</code> command to tell if a single address is in use. <code>arping</code> uses ARP (Address Resolution Protocol) instead of ICMP (Internet Control Message Protocol) packets. This is significant because many firewalls block ICMP traffic as a security measure. So when using ICMP you&#8217;re never sure whether the host is really down, or if it&#8217;s just blocking your pings. ARP pings will almost always work because ARP packets are used to provide the critical network function of resolving IP addresses to MAC (Media Access Control) addresses. Hosts on an Ethernet network will use these resolved MAC addresses to communicate instead of IPs. Be aware that one case in which ARP pings will not work is when you&#8217;re not on the same subnet as the host you&#8217;re trying to ping. This is because ARP packets are not routed. See <a href="#res3">Resource #3</a> below for more details.
</p>
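
<p class="narrow">
Since ARP pings only work within your own subnet, it can be handy to check whether a target address is on the same subnet as you before reaching for <code>arping</code>. The sketch below does that with plain shell arithmetic. The <code>ip_to_int</code> and <code>same_subnet</code> functions are names I made up, the addresses are just examples, and it assumes IPv4 addresses written without leading zeros.
</p>

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if addresses $1 and $2 are in the same network for prefix length $3.
same_subnet() {
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

if same_subnet 192.168.1.50 192.168.1.76 24; then
    echo "same subnet: an ARP ping can reach this address"
fi
if ! same_subnet 192.168.1.50 192.168.2.76 24; then
    echo "different subnet: ARP packets will not be routed there"
fi
```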

<p class="narrow">
<code>arping</code> has several options, but the three that we&#8217;ll be focusing on here are <code class="optionsonly">-I</code>, <code class="optionsonly">-D</code>, and <code class="optionsonly">-c</code>. The <code class="optionsonly">-I</code> option specifies the network interface that you want to use. In many cases you might use <code>eth0</code> as your interface, but I&#8217;m using a laptop connected via wireless and my interface is <code>wlan0</code>. The <code class="optionsonly">-D</code> option checks the specified address in DAD (Duplicate Address Detection) mode, and the <code class="optionsonly">-c</code> option sets how many requests to send. Let&#8217;s look at an example.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
$ arping -I wlan0 -D 192.168.1.1
ARPING 192.168.1.1 from 0.0.0.0 wlan0
Unicast reply from 192.168.1.1 [D4:4D:D7:64:C6:5F] for 192.168.1.1 [D4:4D:D7:64:C6:5F] 2.094ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
</code>
</pre>
</p>

<p class="narrow">
You can see that I&#8217;m pinging 192.168.1.1 (a known router) with the <code class="optionsonly">-D</code> option. Because a reply came back, that address is clearly in use. If no replies are received, DAD mode is considered to have succeeded, and you can be reasonably sure that the address is free for use. <strong>Listing 2</strong> shows an example of what you would see if the address is not in use.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
$ arping -I wlan0 -c 5 -D 192.168.1.76
ARPING 192.168.1.76 from 0.0.0.0 wlan0
Sent 5 probes (5 broadcast(s))
Received 0 response(s)
</code>
</pre>
</p>

<p class="narrow">
Here I&#8217;ve picked a different network address that I knew would be unused. I&#8217;ve also added the <code class="optionsonly">-c</code> option mentioned above so that I could have <code>arping</code> stop after sending 5 requests. Otherwise <code>arping</code> would keep trying until I interrupted it (possibly via the Ctrl-C key combo).
</p>
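
<p class="narrow">
If you want to make this check from a script, one simple approach is to look for the <code>Received 0 response(s)</code> line in the output. The sketch below does that using a copy of Listing 2 as sample input. The <code>address_looks_free</code> and <code>sample_listing_2</code> names are mine, and the check assumes that your version of <code>arping</code> prints the same summary line as mine does.
</p>

```shell
#!/bin/sh
# Read arping output on stdin; succeed only if no replies were received.
address_looks_free() {
    grep -q '^Received 0 response'
}

# Sample input copied from Listing 2.
sample_listing_2() {
    cat <<'EOF'
ARPING 192.168.1.76 from 0.0.0.0 wlan0
Sent 5 probes (5 broadcast(s))
Received 0 response(s)
EOF
}

if sample_listing_2 | address_looks_free; then
    echo "192.168.1.76 appears to be unused"
else
    echo "192.168.1.76 is in use"
fi
```

<p class="narrow">
On a live network you would pipe the output of the <code>arping</code> command from Listing 2 into <code>address_looks_free</code> instead of using the sample.
</p>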

<p class="narrow">
Armed with this information and a knowledge of any dynamic addressing scheme on my network, I can set a temporary static IP for a host. See <a href="#res1">Resource #1</a> for more information on <code>arping</code>.
</p>

<h3>nmap</h3>
<p class="narrow">
<code>nmap</code>, which stands for &#8220;Network MAPper&#8221;, was &#8220;designed to rapidly scan large networks&#8230;to determine what hosts are available on the network, what services (application name and version) those hosts are offering, what operating systems (and OS versions) they are running, what type of packet filters/firewalls are in use, and dozens of other characteristics.&#8221;<span class="superscript"><a href="#res2">2</a></span> We&#8217;ll be using this to find all of the currently used IP addresses on the network.
</p>

<p class="narrow">
<code>nmap</code> has many options and is a very deep utility, and I highly suggest spending some time reading its <a href="http://www.manpagez.com/man/1/nmap/" target="_blank">man page</a>. Of all these options, the only one that we&#8217;ll be dealing with in this quick tech tip is <code class="optionsonly">-e</code>. The <code class="optionsonly">-e</code> option allows you to specify the interface to use when scanning the network. This is similar to the <code class="optionsonly">-I</code> option of <code>arping</code>. The example below shows a simple usage. 
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 3</strong></em></p>
<code>
$ nmap -e wlan0 192.168.1.0/24

Starting Nmap 5.21 ( http://nmap.org ) at 2011-08-23 11:13 EDT
Nmap scan report for 192.168.1.1
Host is up (0.033s latency).
Not shown: 996 closed ports
PORT     STATE SERVICE
23/tcp   open  telnet
53/tcp   open  domain
80/tcp   open  http
5000/tcp open  upnp

Nmap scan report for 192.168.1.7
Host is up (0.00015s latency).
Not shown: 997 closed ports
PORT     STATE SERVICE
111/tcp  open  rpcbind
5900/tcp open  vnc
8080/tcp open  http-proxy

Nmap scan report for 192.168.1.10
Host is up (0.033s latency).
Not shown: 995 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
111/tcp  open  rpcbind
139/tcp  open  netbios-ssn
445/tcp  open  microsoft-ds
2049/tcp open  nfs

Nmap done: 256 IP addresses (3 hosts up) scanned in 4.22 seconds
</code>
</pre>
</p>

<p class="narrow">
The first thing to notice is the notation that I used to specify the subnet mask (<code>/24</code>). If you&#8217;re unfamiliar with this notation, please see <a href="#res5">Resource #5</a> below. The next thing to notice is that <code>nmap</code> gives us a lot more information than just what IPs are in use. <code>nmap</code> also shows us things like what ports are open on each host, and what service it thinks is running on each port. As a network administrator you can use this information to get a quick overview of your network, or you can dig deeper into <code>nmap</code> to perform in-depth network audits. In our case we&#8217;re just looking for an open IP address to use temporarily, so we can choose one that&#8217;s not listed. Again, care needs to be taken when statically setting IPs on a network with DHCP. Have a look at <a href="#res4">Resource #4</a> for a more comprehensive guide to using <code>nmap</code>.
</p>
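
<p class="narrow">
Picking an unlisted address by eye gets tedious on a busy network, so here is a rough sketch of how the scan output could be turned into a list of candidates. It assumes a <code>/24</code> network and output in the format shown in Listing 3, and the <code>used_ips</code>, <code>free_ips</code>, and <code>sample_scan</code> names are ones that I made up. Keep in mind that an address is only &#8220;free&#8221; from <code>nmap</code>&#8217;s point of view, since a host that is powered off or a DHCP lease can still lay claim to it.
</p>

```shell
#!/bin/sh
# Read nmap output on stdin and print the address from each report line.
# Handles both "for 192.168.1.1" and "for hostname (192.168.1.1)".
used_ips() {
    sed -n 's/^Nmap scan report for //p' | sed 's/.*(\(.*\))$/\1/'
}

# Read used addresses on stdin and print every address from $1.1 to $1.254
# that is not among them. $1 is the first three octets, e.g. 192.168.1
free_ips() {
    used="$(cat)"
    i=1
    while [ "$i" -le 254 ]; do
        ip="$1.$i"
        if ! printf '%s\n' "$used" | grep -qxF "$ip"; then
            echo "$ip"
        fi
        i=$((i + 1))
    done
}

# The report lines from Listing 3, used here as sample input.
sample_scan() {
    cat <<'EOF'
Nmap scan report for 192.168.1.1
Nmap scan report for 192.168.1.7
Nmap scan report for 192.168.1.10
EOF
}

# Show the first three candidate addresses.
sample_scan | used_ips | free_ips 192.168.1 | sed -n '1,3p'
```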

<p class="narrow">
That concludes this Tech Tip. Have a look at innovationsts.com for other tips, tricks, how-tos, and service offerings available from Innovations Technology Solutions. Thanks, and stay tuned for more from Innovations.
</p>

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
An audio transcript of this post has been provided. Click on a format below to listen in a new window or right click and save the audio to listen later.
</p>
<p>Get An Audio Transcript Of This Post<br />
<a href="/downloads/blog/vidtips/tech_tip_2_audio.ogg" target="_blank">ogg</a> (13.4 MB) | <a href="/downloads/blog/vidtips/tech_tip_2_audio.mp3" target="_blank">mp3</a> (2.6 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<ol>
<li><a name="res1" href="http://linux.die.net/man/8/arping"  target="_blank">man arping</a></li>
<li><a name="res2" href="http://www.manpagez.com/man/1/nmap/"  target="_blank">man nmap</a></li>
<li><a name="res3" href="http://www.linux.com/archive/feature/50596"  target="_blank">Linux.com &#8211; Gerard Beekmans &#8211; Ping: ICMP vs. ARP</a></li>
<li><a name="res4" href="http://www.networkuptime.com/nmap/index.shtml"  target="_blank">Network Uptime &#8211; James Messer &#8211; Secrets of Network Cartography: A Comprehensive Guide to nmap</a></li>
<li><a name="res5" href="http://compnetworking.about.com/od/workingwithipaddresses/a/cidr_notation.htm"  target="_blank">About.com &#8211; Bradley Mitchell &#8211; CIDR &#8211; Classless Inter-Domain Routing</a></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=2867</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Video Tip &#8211; Using Pipes With The sudo Command</title>
		<link>http://www.innovationsts.com/blog/?p=2758</link>
		<comments>http://www.innovationsts.com/blog/?p=2758#comments</comments>
		<pubDate>Mon, 18 Jul 2011 13:07:19 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=2758</guid>
		<description><![CDATA[Summary

Here are some formatting conventions that [...]]]></description>
			<content:encoded><![CDATA[<br/>
<div style="position:relative;left:-8px;">
<video controls><br />
    <source src="/downloads/blog/vidtips/tech_tip_1_sudo.ogg" type="video/ogg"  /><br />
<object id="flowplayer" width="600" height="450" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/vidtips/tech_tip_1_sudo.flv", "autoPlay":false}}' />             
</object>
</video>
</div>
<br />

<h3>Summary</h3>
<p class="narrow">
Here are some formatting conventions that I&#8217;ll use in the text.
</p>

<p style="margin-left:40px;">
<code>
    Command Name or Directory Path
</code>
<br />
<code class="warnerror">
    Warning or Error
</code>
<br />
<code class="commandline">
    Command Line Snippet With Commands/Options/Arguments
</code>
<br />
<code class="optionsonly">
    Command Options and Their Arguments Only
</code>
<br />
    <span style="color:#FFAF3E;">Hyperlink</span>
</p>

<p class="narrow">
Welcome, this is an Innovations Tech Tip. In this tip we&#8217;re going to cover how to run a command sequence, such as a pipeline, using <code>sudo</code>, which is sometimes also pronounced &#8220;pseudo&#8221;. It may be tempting to think of the &#8220;su&#8221; in <code>sudo</code> as standing for &#8220;super user&#8221; since, especially if you&#8217;re an Ubuntu user, you normally use <code>sudo</code> to execute things as root. Something that may surprise you though is that you can use the <code class="optionsonly">-u</code> option of <code>sudo</code> to specify a user to run the command as. This is assuming that you have the proper privileges. Have a look at the <code>sudo</code> man and info pages for more interesting options.
</p>

<p class="narrow">
Now, if you&#8217;ve ever tried to use <code>sudo</code> to run a command sequence such as a pipeline, where each step required superuser privileges, you probably got a <code class="warnerror">Permission denied</code> error. This is because <code>sudo</code> only applies to the first command in the sequence and not the others. There are multiple ways to handle this, but there are two that stand out to me. First, you can use <code>sudo</code> to start a shell (such as <code>bash</code>) with root privileges, and then give that shell the command string. This can be done using the <code class="optionsonly">-c</code> option of <code>bash</code>. To illustrate how this works, I&#8217;ll start out using <code>sudo</code> to run <code>cat</code> on a file that I created in the <code>/root</code> directory that I normally wouldn&#8217;t have access to.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
$ cat /root/example.txt
cat: /root/example.txt: Permission denied
$ sudo cat /root/example.txt
[sudo] password for jwright: 
You won't see this text without sudo.
</code>
</pre>
</p>

<p class="narrow">
If I try to use <code>sudo</code> with a pipeline to make a compressed backup of the <code>/root/example.txt</code> file, I again get the <code class="warnerror">Permission denied</code> error.

</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
$ sudo cat /root/example.txt | gzip > /root/example.gz
-bash: /root/example.gz: Permission denied
</code>
</pre>
</p>

<p class="narrow">
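Notice that it&#8217;s the second half of the pipeline that causes the error. <code>sudo</code> only elevates <code>cat</code>, so the <code>gzip</code> command and its output redirection into <code>/root</code> are still handled by my unprivileged shell, which is why the message comes from <code>bash</code> itself. That&#8217;s where our technique of using <code>bash</code> with the <code class="optionsonly">-c</code> option comes in.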
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 3</strong></em></p>
<code>
$ sudo bash -c 'cat /root/example.txt | gzip > /root/example.gz'
$ sudo ls /root/example.gz
/root/example.gz
</code>
</pre>
</p>

<p class="narrow">
We can see from the <code>ls</code> command&#8217;s output that the compressed file creation succeeded.
</p>

<p class="narrow">
The second method is similar to the first in that we&#8217;re passing a command string to <code>bash</code>, but we&#8217;re doing it in a pipeline via <code>sudo</code>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 4</strong></em></p>
<code>
$ sudo rm /root/example.gz
$ echo "cat /root/example.txt | gzip > /root/example.gz" | sudo bash
$ sudo ls /root/example.gz
/root/example.gz
</code>
</pre>
</p>

<p class="narrow">
Either method works; which one you use is just a matter of personal preference.
</p>
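<p class="narrow">
Both methods can be tried without root access. The sketch below is an illustration only: it leaves out <code>sudo</code> and writes to a scratch directory from <code>mktemp</code> instead of <code>/root</code> (both are my assumptions so that it can run anywhere). What it shows is that the whole pipeline, including the redirection, is parsed and run by the inner <code>bash</code>, which is why putting <code>sudo</code> in front of that <code>bash</code> gets around the permission problem.
</p>

```shell
#!/bin/bash
# Illustration only: no sudo, and a scratch directory instead of /root.
SCRATCH=$(mktemp -d) || exit 1
echo "You won't see this text without sudo." > "$SCRATCH/example.txt"

# Method 1: pass the whole pipeline to bash as one string with -c
bash -c "cat '$SCRATCH/example.txt' | gzip > '$SCRATCH/method1.gz'"

# Method 2: feed the same kind of string to bash on standard input
echo "cat '$SCRATCH/example.txt' | gzip > '$SCRATCH/method2.gz'" | bash

# Both archives decompress to the original text
gzip -dc "$SCRATCH/method1.gz"
gzip -dc "$SCRATCH/method2.gz"

# Remove the scratch directory when you are done: rm -rf "$SCRATCH"
```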

<p class="narrow">
That concludes this Tech Tip. Have a look at innovationsts.com for other tips, tricks, how-tos, and service offerings available from Innovations Technology Solutions. Thanks, and stay tuned for more quick tips from Innovations.
</p>

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
I have provided an audio transcript of this post with commentary on each of the listings. Click on a format below to listen in a new window or right click and save the audio to listen to later.
</p>
<p>Get An Audio Transcript Of This Post With Author&#8217;s Commentary On Listings<br />
<a href="/downloads/blog/vidtips/tech_tip_1_sudo_audio.ogg" target="_blank">ogg</a> (11 MB) | <a href="/downloads/blog/vidtips/tech_tip_1_sudo_audio.mp3" target="_blank">mp3</a> (2 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<ol>
<li><a name="appleqs" href="http://linux.die.net/man/1/bash">man bash</a></li>
<li><a href="http://linux.die.net/man/8/sudo">man sudo</a></li>
<li><a href="http://soundunreason.com/InkWell/index.php/2010/06/bash-sudo-through-a-pipe/">The Ink Wells &#8211; James Cook</a></li>
<li><a href="http://www.linuxjournal.com/content/running-complex-commands-sudo">Linux Journal &#8211; Don Marti &#8211; Running Complex Commands with sudo</a></li>
<li><a href="http://www.amazon.com/Bash-Cookbook-Solutions-Examples-Cookbooks/dp/0596526784">bash Cookbook &#8211; Albing, Vossen, Newham</a></li>
</ol><p><a class="a2a_dd addtoany_share_save" href="http://www.addtoany.com/share_save"><img src="http://www.innovationsts.com/blog/wp-content/plugins/add-to-any/share_save_171_16.png" width="171" height="16" alt="Share/Bookmark"/></a> </p>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=2758</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Writing Better Shell Scripts &#8211; Part 3</title>
		<link>http://www.innovationsts.com/blog/?p=2363</link>
		<comments>http://www.innovationsts.com/blog/?p=2363#comments</comments>
		<pubDate>Fri, 24 Sep 2010 20:11:52 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[How-Tos]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=2363</guid>
		<description><![CDATA[
Quick Start

This post doesn&#8217;t really lend itself to being a quick read, but you can have a look at the How-To section of this post and skip the rest if you&#8217;re in a hurry. I would highly recommend reading everything though, since there&#8217;s a lot of information that may serve you well in the future. [...]]]></description>
			<content:encoded><![CDATA[<br />
<h3>Quick Start</h3>
<p class="narrow">
This post doesn&#8217;t really lend itself to being a quick read, but you can have a look at the <a href="#howto">How-To</a> section of this post and skip the rest if you&#8217;re in a hurry. I would highly recommend reading everything though, since there&#8217;s a lot of information that may serve you well in the future. There is also a <a href="#video">Video</a> attached to this post that may be a good quick reference for you. Don&#8217;t forget that the man and info pages of your Linux/Unix installation can be an invaluable resource as well when you&#8217;re trying to learn new concepts and solve problems.
</p>

<p><a name="video"></a></p>
<h3>Video</h3>
<p class="single">
To enhance this post, I&#8217;ve provided a video so that you can see a general overview of the concepts that are presented. Browsers supporting the HTML5 video tag should present you with a <a href="http://theora.org/">Theora</a> (ogg/ogv) video, and browsers lacking that support should give you a Flash substitute via <a href="http://flowplayer.org/">Flowplayer</a>. You can also download the Theora version of the video <a href="/downloads/blog/Better_Scripts_Part_3/Better_Scripts_Part_3_Video.ogg">here</a> by right clicking on the link. 
</p>
<p><video controls><br />
    <source src="http://www.innovationsts.com/downloads/blog/Better_Scripts_Part_3/Better_Scripts_Part_3_Video.ogg" type="video/ogg" style="margin-left:0px;" /><br />
<object id="flowplayer" width="592" height="432" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/Better_Scripts_Part_3/Better_Scripts_Part_3_Video.flv", "autoPlay":false}}' />             
</object>
</video>
<br />

<h3>Preface</h3>
<p class="narrow">
To make things easier on you, all of the black command line and script areas are set up so that you can copy the text from them. This does make using the commands and scripts easier, but if you&#8217;re not already familiar with the concepts presented here, typing the commands/code yourself and working through why you&#8217;re typing them will help you learn more. If you hit problems along the way, take a look at the <a href="#troubleshooting">Troubleshooting</a> section near the end of this post for help.
</p>
<p class="narrow">
There are formatting conventions that are used throughout this post that you should be aware of. The following is a list outlining the color and font formats used.
</p>
<p>
<code>
    Command Name or Directory Path
</code>
<br />
<code class="warnerror">
    Warning or Error
</code>
<br />
<code class="commandline">
    Command Line Snippet With Commands/Options/Arguments
</code>
<br />
<code class="optionsonly">
    Command Options and Their Arguments Only
</code>
<br />
    <span style="color:#FFAF3E;">Hyperlink</span>
</p>

<h3>Overview</h3>
<p class="narrow">
There is no way for me to cover all of the issues surrounding shell script security in a single blog post. My goal with this post is to help you avoid some of the most common security holes that are often found in shell scripts. No script can be un-crackable, but you can make the cracker&#8217;s task more challenging by following a few guidelines. A secondary goal with this post is to make you more savvy about the scripts that you obtain to run on your systems. Because scripts written for the BASH and SH shells are so portable in the Linux/Unix world, it can be easy for a cracker to write malware that will run on many different systems. Having some knowledge about the security issues surrounding shell scripts might just keep you from installing/running a malicious script such as a trojan, which gives the cracker a back door to your system. The <a href="#resources">Resources</a> section holds books and links which will allow you to delve more deeply into this topic if you&#8217;re looking for more comprehensive knowledge. <strong>Listing 1</strong> shows an example script that contains some of the security problems that we&#8217;ll talk about in this post.
</p>

<p>
<pre class="cliwide" style="margin:0px">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
#!/bin/bash
# A SUID root script that demonstrates various security problems

# Prepend the current path onto the PATH variable
PATH=.:${PATH}

#Count the number of lines in a listing of the current directory
ls | wc -l

# Get user input
read USR_INPUT

# Check to see if the user supplied the right password
if [ $USR_INPUT == "mypassword" ];then
    echo "User input was $USR_INPUT and should have matched the string 'mypassword'"
fi

# Create a temp file
touch /tmp/mytempfile

# Set the temp file so that only the owner can read/write/execute the contents
chmod 0700 /tmp/mytempfile

# Save the password that the user supplied to the temp file
echo $USR_INPUT > /tmp/mytempfile
</code>
</pre>
</p>

<h3>Environment Variables</h3>
<p class="single">
Your shell script has little to no chance of running securely if it trusts the environment that it runs in, and that environment has been compromised. You can help protect your script from unintended behavior by not trusting items like environment variables. Whenever possible, assume that input from the external environment has been designed to cause your script problems.
</p>

<p class="single">
The <code>PATH</code> variable is a common source of security holes in scripts. Two of the most common issues are the inclusion of the current directory (via the <code>.</code> character) in the path, and using a <code>PATH</code> variable that&#8217;s been manipulated by a cracker. The reason that you don&#8217;t want the current directory included in the path is that a malicious version of a command like <code>ls</code> could have been placed in your current directory. For example, let&#8217;s say your current directory is <code>/tmp</code>, which is world writable. A cracker has written a script named <code>ls</code> and placed it in <code>/tmp</code> as well. Since you have the current directory at the front of your <code>PATH</code> variable in <strong>Listing 1</strong>, the malicious version of <code>ls</code> will be run instead of the normal system version. If the cracker wanted to help cover their tracks, they could run the real version of <code>ls</code> before or after running their own code. <strong>Listing 2</strong> shows a very simple script that could replace the system&#8217;s <code>ls</code> command in this case.
</p>

<p>
<pre class="cliwide" style="margin:0px">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
#!/bin/bash

# Run the real ls with the original arguments/options to cover our tracks
/bin/ls "$@"

# Run whatever malicious code we want here
echo "Malicious code"
</code>
</pre>
</p>

<p class="single">
There&#8217;s a decent chance that any cracker who planted the fake <code>ls</code> would create it in such a way that it would look like <code>ls</code> was running normally. This is what I&#8217;ve done in <strong>Listing 2</strong> by passing the <code>@</code> variable to the real <code>ls</code> command so that the user doesn&#8217;t suspect anything. This brings up another point besides the use of the current directory in the path. Just because your script seems to be running fine from the user&#8217;s point-of-view doesn&#8217;t mean that it hasn&#8217;t been compromised. A good cracker knows how to cover their tracks, so if a security flaw has been exploited in your script the breach may go undetected for an indefinite period of time.
</p>

<p class="single">
You can see in <strong>Listing 1</strong> that the order of directories in the <code>PATH</code> variable makes a difference. This is important because if a cracker has write access to a directory that&#8217;s earlier in the search order, they can preempt the standard directories like <code>/bin</code> and <code>/usr/bin</code> that may be harder to gain access to. When you try to run the standard command, the malicious version will be found first and run instead. All the cracker has to do is insert a replacement command, like the one in <strong>Listing 2</strong>, earlier in the path search order.
</p>

<p class="single">
The second main problem with the <code>PATH</code> environment variable is that it could have been manipulated by a cracker before, or as your script was run. If this happens, the cracker could point your script to a directory that they created which holds modified versions of the system utilities that your script relies on. Knowing this, it&#8217;s best if you add code to the top of your script to set the <code>PATH</code> variable to the minimal value your script needs to run. You can save the original <code>PATH</code> variable and restore it on exit. <strong>Listing 3</strong> shows <strong>Listing 1</strong> with the current directory removed from the <code>PATH</code> variable, and a minimal path set to lessen the chances of problems. Keep in mind though that a cracker could have compromised the actual system utilities that are in locations such as <code>/bin</code> and <code>/sbin</code>. Ways to detect and combat this occurrence fall more into the system security realm though and won&#8217;t be talked about in this post.
</p>

<p>
<pre class="cliwide" style="margin:0px">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 3</strong></em></p>
<code>
#!/bin/bash
# A SUID root script that demonstrates various security problems

# Save the current path variable to restore it later
OLDPATH=${PATH}

# Set a minimal path for our script to use
PATH=/bin:/usr/bin

#Count the number of lines
ls | wc -l

# Get user input
read USR_INPUT

# Check to see if the user supplied the right password
if [ $USR_INPUT == "mypassword" ];then
    echo "User input was $USR_INPUT and should have matched the string 'mypassword'"
fi

# Create a temp file
touch /tmp/mytempfile

# Set the temp file so that only the owner can read/write/execute the contents
chmod 0700 /tmp/mytempfile

# Save the password that the user supplied to the temp file
echo $USR_INPUT > /tmp/mytempfile

# Reset the PATH variable to its original value
PATH="$OLDPATH"
</code>
</pre>
</p>

<p class="single">
In your own scripts it would probably be best to put the reset of the PATH variable inside of a trap on the exit condition. That way <code>PATH</code> gets reset to the original value even if your script is terminated early. I wrote about traps in the <a href="http://www.innovationsts.com/blog/?p=1896">last post</a> in this series on error handling.
</p>
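<p class="single">
Here is a minimal sketch of what that trap could look like. It is not taken from the earlier post; the function name <code>restore_path</code> is a hypothetical one that I picked for illustration, and the rest mirrors the save/set/restore steps from <strong>Listing 3</strong>.
</p>

```shell
#!/bin/bash -
# Sketch only: restore_path is a hypothetical helper name.

# Save the current path variable to restore it later
OLDPATH=${PATH}

# Put PATH back the way we found it
restore_path() {
    PATH="$OLDPATH"
}

# Run restore_path whenever the script exits, including an early exit
trap restore_path EXIT

# Set a minimal path for our script to use
PATH=/bin:/usr/bin

# The rest of the script runs with the minimal path
ls | wc -l
```

<p class="single">
Registering the trap before changing <code>PATH</code> means there is no point in the script where the minimal path is active without the reset already being in place.
</p>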

<p class="single">
Another, less desirable way of avoiding malicious PATH exploits would be to use the full (absolute) path to the binary your script is trying to run. So, instead of just entering <code>ls</code> by itself, you would enter <code>/bin/ls</code> . This ensures that you&#8217;re running the binary that you want to, but it&#8217;s a more &#8220;brittle&#8221; approach. If your script is run on a system where the binary you are calling is in a different location, your script will break when the command is not found. One approach to help cut down on this drawback is to use the <code>whereis</code> command to locate the command for you. Caution needs to be applied with this approach too, but I&#8217;ve created an example in <strong>Listing 4</strong> that shows how to do this. Remember that if the cracker has somehow compromised the system&#8217;s standard version of the command that you&#8217;re trying to run, this technique won&#8217;t help. That really starts being a system security problem rather than a script security problem at that point though.
</p>

<p>
<pre class="cliwide" style="margin:0px">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 4</strong></em></p>
<code>
#!/bin/bash - 
#File: findcmd.sh

# Attempt to find the command's binary with the whereis command.
# The -s option makes cut print nothing when no path was found.
CMD=$(whereis -b "$1" | cut -s -d " " -f 2)

# Check to make sure that the command was found
if [ -n "$CMD" ];then
    echo "$CMD"
fi
</code>
</pre>
</p>

<p class="single">
The script uses the command name to give the user the full path to the binary, if it can be found. There are of course numerous improvements that you could make to the script in <strong>Listing 4</strong>. My main suggestion would be to rewrite the script as a function, and then put that inside a script that you can source. That way you maximize code reuse throughout the rest of your scripts. I&#8217;ve done this in <strong>Listing 29</strong> via the <code>run_cmd</code> function.
</p>

<p class="single">
Another environment variable that can be problematic is <code>IFS</code>. <code>IFS</code> stands for &#8220;Internal Field Separator&#8221; and is the variable that the shell uses when it breaks strings down into fields, words, and so on. It can actually be a handy variable to manipulate when you&#8217;re doing things like using a for loop to deal with a string that has odd separator characters. If your shell inherits the <code>IFS</code> variable from its environment, a cracker can insert a character or characters that will make your script behave in an unexpected way. For example, suppose I have a few scripts in my <code>~/bin</code> directory that I want to run together (or nearly together). The script in <strong>Listing 5</strong> shows one very simple way of doing this.
</p>

<p>
<pre class="cliwide" style="margin:0px">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 5</strong></em></p>
<code>
#!/bin/bash - 

BINS="/home/jwright/bin/bin1.sh /home/jwright/bin/bin2.sh"

for BIN in $BINS
do
    echo $($BIN)
done
</code>
</pre>
</p>

<p class="single">
When I run the script I get the output from bin1.sh and bin2.sh that I expect. In this case the scripts just output their name and exit. Everything is fine until a cracker comes along and sets the <code>IFS</code> variable to a forward slash (/). Now when I run my script I get the output in <strong>Listing 6</strong>.
</p>

<p>
<pre class="cliwide" style="margin:0px">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 6</strong></em></p>
<code>
$ ./ifscrack.sh 

./ifscrack.sh: line 8: home: command not found

./ifscrack.sh: line 8: jwright: command not found

./ifscrack.sh: line 8: bin: command not found

./ifscrack.sh: line 8: bin1.sh : command not found

./ifscrack.sh: line 8: home: command not found

./ifscrack.sh: line 8: jwright: command not found

./ifscrack.sh: line 8: bin: command not found

bin2.sh executing
</code>
</pre>
</p>

<p class="single">
Notice that since the directory <code>/home/jwright/bin</code> is in my path, the bin1.sh call should have run. If you look closely though you&#8217;ll see that there is a space after the filename, which causes the command to not be found. The <code>IFS</code> variable change has not only broken my script, it has allowed the cracker to open up a significant security hole. If the cracker creates a program or script with any of the names like <code>home</code>, <code>jwright</code>, or <code>bin</code> anywhere in the directories in <code>PATH</code>, their code will be executed with the privileges of my script. Because of the privilege issue, this security hole is an even bigger problem with SUID root scripts.
</p>

<p class="single">
On some Linux distributions, the <code>IFS</code> variable is not inherited by a script and instead a default standard <code>IFS</code> value is used. You can still change the value of <code>IFS</code> within your script though. With this said, it&#8217;s still a good idea to set the <code>IFS</code> variable to a known value at the beginning of your script and restore it before your script exits. This is similar to the change we made in <strong>Listing 3</strong> to store and reset the <code>PATH</code> variable. This is a good idea because even though the distribution that you&#8217;re developing your script on may not allow <code>IFS</code> inheritance, your script may be moved to another distribution that does. It&#8217;s best to be safe and always set <code>IFS</code> to a known value.
</p>
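<p class="single">
As a rough sketch of that advice, the script from <strong>Listing 5</strong> could save the incoming value, set <code>IFS</code> to the usual space, tab, and newline, and restore the saved value at the end. The <code>OLDIFS</code> and <code>COUNT</code> names are just my choices for illustration, and the loop only prints each path rather than running it so that the example is safe to try.
</p>

```shell
#!/bin/bash -
# Sketch only: OLDIFS and COUNT are illustrative names.

# Save whatever IFS value we were started with
OLDIFS="$IFS"

# Set IFS to the standard space, tab, and newline
IFS=$' \t\n'

BINS="/home/jwright/bin/bin1.sh /home/jwright/bin/bin2.sh"

# With a known IFS the list is split on the space, not on slashes
COUNT=0
for BIN in $BINS
do
    COUNT=$((COUNT + 1))
    echo "Would run: $BIN"
done
echo "Number of scripts found: $COUNT"

# Restore the original value before exiting
IFS="$OLDIFS"
```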

<p class="single">
Make sure that you never use the <code>UID</code>, <code>USER</code>, and <code>HOME</code> environment variables to do authentication. It&#8217;s too easy for a cracker to modify the values of these variables to give themselves elevated privileges. Now on the Fedora system that I&#8217;m using to write this blog post the <code>UID</code> variable is readonly, so I can&#8217;t change it. That doesn&#8217;t guarantee that every system that your script runs on will make <code>UID</code> readonly though. Err on the side of caution and use the <code>id</code> command or another mechanism to authenticate users instead of variables. The <code>id</code> command is very useful, and can give you information like effective user ID, real user ID, username, etc. <strong>Listing 7</strong> is a quick reference of some of the <code>id</code> command&#8217;s options.
</p>

<p>
<pre class="cliwide" style="margin:0px">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 7</strong></em></p>
<code>
-g (--group)            Print only the effective group ID
-n (--name)             Print a name instead of a number, for -ugG
-r (--real)             Print the real ID instead of the effective ID, with -ugG
-u (--user)             Print only the effective user ID
-Z (--context)          Print only the security context of the current user 
                           (SELinux)
</code>
</pre>
</p>

<p class="single">
You&#8217;ll need to use the options <code class="optionsonly">-u</code> and <code class="optionsonly">-g</code> with some of the other options (<code class="optionsonly">-r</code> and <code class="optionsonly">-n</code>) so that the <code>id</code> command knows whether you want information on the user or group. For example you would use <code class="commandline">/usr/bin/id -u -n</code> to get the name of the user instead of their user ID.
</p>
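<p class="single">
A short sketch of how a script might put this to use is shown below. The <code>is_root</code> function and the <code>CUR_UID</code>, <code>CUR_USER</code>, and <code>REAL_UID</code> variable names are hypothetical ones chosen for this example; only the <code>id</code> options come from <strong>Listing 7</strong>.
</p>

```shell
#!/bin/bash -
# Sketch only: is_root and the CUR_*/REAL_* names are hypothetical.

# Ask the system who we are instead of trusting UID, USER, or HOME
CUR_UID=$(/usr/bin/id -u)
CUR_USER=$(/usr/bin/id -u -n)
REAL_UID=$(/usr/bin/id -u -r)

# Succeeds only when the effective user ID is 0 (root)
is_root() {
    [ "$(/usr/bin/id -u)" -eq 0 ]
}

if is_root; then
    echo "Running as root ($CUR_USER)"
else
    echo "Running as $CUR_USER (effective UID $CUR_UID, real UID $REAL_UID)"
fi
```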

<p class="single">
The fact that the <code>UID</code> variable is set to readonly on my system gives you a hint at how to protect some variables. There is actually a command named <code>readonly</code> that sets variables to a readonly state. This does protect variables from being changed, but it also keeps you as the &#8220;owner&#8221; of the variable from making any changes to it. You can&#8217;t even unset a readonly variable. To make a variable readonly, you would issue a command line like <code class="commandline">readonly MYVAR</code> . Make sure to carefully evaluate whether or not a variable will ever need to change or be unset before setting it to readonly.
</p>
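<p class="single">
The behavior is easy to see in a small test script. In this sketch <code>MYVAR</code> and its value are placeholders, and the attempts to change or unset the variable are run in subshells purely so that the failures can be reported without affecting the rest of the script.
</p>

```shell
#!/bin/bash -
# Sketch only: MYVAR and its value are placeholders.

MYVAR="original value"
readonly MYVAR

# Try to change the variable in a subshell and report the failure
if ! ( MYVAR="changed" ) 2> /dev/null; then
    echo "Assignment was refused"
fi

# Try to unset the variable in a subshell and report the failure
if ! ( unset MYVAR ) 2> /dev/null; then
    echo "Unset was refused"
fi

# The original value is still intact
echo "$MYVAR"
```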

<p class="single">
There&#8217;s an IBM developerWorks article in the <a href="#resources">Resources</a> section (#20) that mentions security implications for some other environment variables such as LD_LIBRARY_PATH and LD_PRELOAD. That would be a good place to start digging a little deeper on the security issues surrounding environment variables.
</p>

<h3>Symbolic Links</h3>
<p class="single">
You should always check symbolic links to make sure that a cracker is not redirecting you to their modified code. Symbolic links are a transparent part of everyday life for Linux users. Chances are that when you run <code>sh</code> on your favorite Linux distribution, <code>/bin/sh</code> is actually a link to <code>/bin/bash</code> . Go ahead and run <code class="commandline">ls -l /bin/sh</code> if you&#8217;ve never noticed this before. Symbolic link attacks can take a few different forms, one of which is redirection of sensitive data. In one situation, you may think that you&#8217;re caching sensitive data to a file you&#8217;ve created in <code>/tmp</code> with 0700 file permissions. Instead, by exploiting a race condition in your script (we&#8217;ll talk about race conditions later) a cracker creates a symbolic link with the same filename that your script will be writing data into first, thus causing your creation of the temporary file to throw an error. If your script doesn&#8217;t stop on this error, it will begin dumping data into the file at the end of the symbolic link. The endpoint of the link could be on a mounted remote filesystem where the cracker can get easier access to it. There were several mistakes made in this scenario that we&#8217;ll talk more about later, but before that let&#8217;s look at making sure we&#8217;re not writing data to a symbolic link.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 8</strong></em></p>
<code>
#!/bin/bash -
#File: symlink_test.sh

# Poor method of temp file creation
touch /tmp/mytempfile

# Check the new temp file to see if it's a symbolic link
IS_LINK=$(file /tmp/mytempfile | grep -i "symbolic link")

# If the variable is not null, then we've detected a symbolic link
if [ -n "$IS_LINK" ];then
    echo "Possible symbolic link exploit detected. Exiting."
    exit 1 #Exit before we dump the sensitive data into the link
fi

# Dump our sensitive data into the temp file
echo "Sensitive Data" > /tmp/mytempfile
</code>
</pre>
</p>

<p class="single">
If our script sees the string &#8220;symbolic link&#8221; in the output from the <code>file</code> command, it assumes that it&#8217;s looking at an attempted symbolic link exploit. Rather than continuing on and possibly sending data to a cracker, the script chooses to warn the user and exit with an exit status indicating an error. Be aware though that this script doesn&#8217;t protect against the situation where a cracker creates your temp file in place with permissions to give themselves access to the data. In the case that you don&#8217;t expect the temp file to already be there, you would throw an error and exit. This brings up another problem though &#8211; DoS (Denial of Service) attacks. If the cracker simply wants your script to fail, all they have to do is make sure your temp file has already been created so that your script will throw an error and exit. You&#8217;re not handing over sensitive data, but your users are being denied the use of your script. The answer to this is to create temporary files with less-predictable file names.
</p>
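<p class="single">
The two checks described above (is the path a symbolic link, and does a file already exist there) can also be written with the shell&#8217;s built-in file tests instead of parsing the output of <code>file</code>. The sketch below is one possible way to do that; <code>check_temp_path</code> is a hypothetical function name, and the scratch directory is only there so the example can be run safely. Keep in mind that there is still a gap between the check and the later write, so this narrows the problem rather than solving it.
</p>

```shell
#!/bin/bash -
# Sketch only: check_temp_path is a hypothetical helper name.

# Return 0 only if the path is neither a symbolic link nor an
# existing file. Return 1 for a link and 2 for an existing file.
check_temp_path() {
    if [ -L "$1" ]; then
        echo "Possible symbolic link exploit detected: $1" >&2
        return 1
    fi
    if [ -e "$1" ]; then
        echo "Temp file already exists: $1" >&2
        return 2
    fi
    return 0
}

# Try the function out in a private scratch directory
SCRATCH=$(mktemp -d) || exit 1
ln -s /dev/null "$SCRATCH/link"
touch "$SCRATCH/existing"

for NAME in link existing new
do
    if check_temp_path "$SCRATCH/$NAME" 2> /dev/null; then
        echo "$NAME: safe to create"
    else
        echo "$NAME: rejected"
    fi
done

rm -rf "$SCRATCH"
```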

<h3>&#8220;Safe&#8221; Temporary Files</h3>
<p class="single">
In the header for this section, I put the word safe in quotes to denote that it&#8217;s very difficult to make anything completely safe. What you have to do is make things as safe as possible, and then keep an eye out for suspicious activity. In the <a href="http://www.innovationsts.com/blog/?p=1896">last blog post</a> I created a function named create_temp that used a simple, but risky mechanism to create temp files. A snippet of the code from that listing is shown in <strong>Listing 9</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 9</strong></em></p>
<code>
# Function to create "safe" temporary files
function create_temp {
    # Give preference to user tmp directory for security
    if [ -e "$HOME/tmp" ]
    then
        TEMP_DIR="$HOME/tmp"
    else
        TEMP_DIR="/tmp"
    fi

    # Construct a "safe" temp file name
    TEMP_FILE="$TEMP_DIR"/"$PROGNAME".$$.$RANDOM

    # Keep the file in an array to remove it later
    TEMPFILES+=( "$TEMP_FILE" )

    {
        touch $TEMP_FILE &#038;> /dev/null
    } || fatal_err $LINENO "Could not create temp file $TEMP_FILE"
}
</code>
</pre>
</p>

<p class="single">
The problem with this function is that it uses a temporary file name with 2 elements that are easy to predict &#8211; the program name and the process ID. The fact that there is a random number on the end is only an inconvenience for the cracker, because all they have to do is create a file for each possible file name with an ending number between 0 and 32767. They can be sure that you&#8217;ll dump data into one of those files, and it&#8217;s easy to write a script to find out which file holds the data. A slightly better method would be to append multiple sets of random numbers onto the file name, separating each set with periods. This makes it much harder for the cracker to cover all the possible file names. A much better way to handle this situation is to use the <code>mktemp</code> command, which is available on most Linux systems.
</p>

<p class="single">
The <code>mktemp</code> command takes a string template that you supply and creates a unique temporary file name. The form could be something like <code class="commandline">mktemp /tmp/test.XXXXXXXXXXXX</code> which would print the random file name to standard out and create a file with that name and path. Running that command line on a Fedora 13 system once gave me the output <code>/tmp/test.o0mTLAgSWTfX</code> which of course will vary each time you run the command. The more X characters you add to the template, the harder it is for a cracker to predict the file name. From what I&#8217;ve read, 10 or so is the recommended minimum amount. Another nice thing about <code>mktemp</code> is that when it creates a temp file, it makes sure that only the owner has access to it. Some useful options for <code>mktemp</code> are shown in <strong>Listing 10</strong>. You should use <code>mktemp</code> in preference to commands like <code>touch</code> and <code>echo</code> to create temp files.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 10</strong></em></p>
<code>
-d (--directory)          Create a directory, not a file.
-q (--quiet)              Suppress diagnostics about file/dir-creation failure.
--suffix=SUFF             Append SUFF to TEMPLATE. SUFF must not contain slash.
                              This option is implied if TEMPLATE does not end
                              in X.
--tmpdir[=DIR]            Interpret TEMPLATE relative to DIR. If DIR is not 
                              specified, use $TMPDIR if set, else /tmp. With
                              this option, TEMPLATE must not be an absolute name.
                              Unlike with -t, TEMPLATE may contain slashes, but
                              mktemp creates only the final component.
</code>
</pre>
</p>

<p class="single">
There are just a few other miscellaneous facts about <code>mktemp</code> that I want to make sure you&#8217;re aware of.
<ol>
<li>
The man pages for <code>mktemp</code> on both Ubuntu 9.10 and Fedora 13 systems specify that the minimum number of X characters that you can have in a template is three. Even though you can go this low, I wouldn&#8217;t recommend it because it greatly increases the predictability of your file names. Ten or more random alpha-numeric characters is better.
</li>
<li>
<code>mktemp</code> is commonly part of the coreutils package.
</li>
<li>
The default number of X characters that you get when you don&#8217;t specify a template with <code>mktemp</code> is 10. This held true on the Fedora 13 and Ubuntu 9.10 systems that I tested.
</li>
</ol>
</p>
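<p class="single">
Putting those pieces together, a script might use <code>mktemp</code> along the lines of the sketch below. The template prefix <code>test</code> and the <code>TEMP_FILE</code> variable name are placeholders, and the cleanup trap is my own addition rather than something <code>mktemp</code> does for you.
</p>

```shell
#!/bin/bash -
# Sketch only: the "test" prefix and TEMP_FILE name are placeholders.

# Create a temp file with 12 random characters in its name
TEMP_FILE=$(mktemp /tmp/test.XXXXXXXXXXXX) || {
    echo "Could not create temp file" >&2
    exit 1
}

# Remove the temp file when the script exits
trap 'rm -f "$TEMP_FILE"' EXIT

# mktemp has already created the file, so show its permissions
ls -l "$TEMP_FILE"

# Write the data into the file that mktemp created for us
echo "Sensitive Data" > "$TEMP_FILE"
```

<p class="single">
Capturing the name that <code>mktemp</code> prints, instead of building the name yourself and then calling <code>touch</code>, means the file already exists with restrictive permissions by the time your script knows what it is called.
</p>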

<p class="single">
So what happens if you don&#8217;t have <code>mktemp</code> on your system? The LinuxSecurity.com article in the <a href="#resources">Resources</a> section (#17) gives a way to use <code>mkdir</code> to create a temporary directory that only the creator has access to. A script based on the examples in that article is found in <strong>Listing 11</strong>, but should not be used in preference to the <code>mktemp</code> command unless you have a compelling reason.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 11</strong></em></p>
<code>
#!/bin/bash - 
#File safetmp.sh

# Give preference to user tmp directory for security
if [ -e "$HOME/tmp" ];then
    TEMP_DIR="$HOME/tmp"
else
    TEMP_DIR="/tmp"
fi

# Create somewhat secure directory name
TEMP_NAME=${TEMP_DIR}/$$.$RANDOM.$RANDOM.$RANDOM

# Create the directory while at the same time giving
# only the user access to it
(umask 077 &#038;&#038; mkdir $TEMP_NAME) || {
    echo "Error creating the temporary directory."
    exit 1
}
</code>
</pre>
</p>

<p class="single">
Notice that this script does use multiple references to the <code>RANDOM</code> variable separated by periods to make the directory name harder to guess. Also, the umask is set to 077 just before the directory is created so that the end directory permissions are 700. That gives the owner full access to the directory, but none to anyone else. At the top of the script I have reused code from the create_temp function in <strong>Listing 9</strong>. This code gives preference to the user&#8217;s home directory over the system (<code>/tmp</code>) directory. If the temporary file or directory that you are creating can be placed in the user&#8217;s home directory, that&#8217;s just one more layer of protection from prying eyes. I would suggest using the user&#8217;s own <code>tmp</code> directory whenever possible.
</p>
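
<p class="single">
As a rough sketch of how the two approaches can be combined, the function below prefers <code>mktemp -d</code> and only falls back to the umask/<code>mkdir</code> technique from <strong>Listing 11</strong> when <code>mktemp</code> is missing. The function name <code>make_temp_dir</code> and the <code>safetmp.</code> prefix are my own placeholders, and the permission check in the usage note assumes GNU <code>stat</code>.
</p>

```shell
#!/bin/bash -
# Sketch: prefer mktemp -d, fall back to the umask/mkdir approach.
# make_temp_dir and the "safetmp." prefix are hypothetical names.
make_temp_dir() {
    base="/tmp"
    # Give preference to a tmp directory in the user's home
    if [ -d "$HOME/tmp" ]; then
        base="$HOME/tmp"
    fi

    if command -v mktemp >/dev/null 2>&1; then
        # mktemp -d creates the directory so only the owner can use it
        dir=$(mktemp -d "$base/safetmp.XXXXXXXXXX") || return 1
    else
        dir="$base/$$.$RANDOM.$RANDOM.$RANDOM"
        # The subshell keeps the umask change from leaking out
        (umask 077 && mkdir "$dir") || return 1
    fi

    # Hand the new directory name back to the caller
    printf '%s\n' "$dir"
}
```

<p class="single">
A caller would capture the name with something like <code class="commandline">TEMP_NAME=$(make_temp_dir) || exit 1</code> and remove the directory when finished. On a system with GNU <code>stat</code>, <code class="commandline">stat -c %a "$TEMP_NAME"</code> is a quick way to confirm that the permissions came out as 700.
</p>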

<p class="single">
Keep in mind that, as I mentioned above, even though you&#8217;ve protected the data in the temp files, a cracker can still launch a DoS (Denial of Service) attack against your script. In this case, since the cracker probably can&#8217;t guess the temporary file name, they might try to fill the <code>/tmp</code> directory so that there&#8217;s no more space for you to create your file. Things like user disk quotas can help mitigate this type of attack though.
</p>

<p class="single">
Now that you know a little more about temp file safety, I&#8217;ll caution you not to overuse temporary files. When you store or use data in external files you are opening a door into your script that a knowledgeable individual may be able to exploit. Use temp files only when needed, and make sure to consistently follow safe guidelines for their use.
</p>

<h3>Race Conditions</h3>
<p class="single">
A race condition occurs when a cracker has a window of opportunity to preempt and modify your script&#8217;s behavior, usually by exploiting a design flaw in the execution sequence of your script, or in its reliance on an external resource (like a lock file). The example that we&#8217;ve already talked about is creating a symbolic link or a file in place of the script&#8217;s temp file to capture data. The script that I&#8217;ve created in <strong>Listing 12</strong> uses the <code>sleep</code> command to create a larger window for a race condition.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 12</strong></em></p>
<code>
#!/bin/bash - 
#File: race_cond.sh

TEMP_FILE=/tmp/predictable_temp

# Make sure that the temp file doesn't already exist
if [ ! -f $TEMP_FILE ];then
    # Do something here that takes 10 seconds. This
    # creates the race condition and is simulated by 
    # the sleep command
    sleep 10

    # Create the temp file
    touch $TEMP_FILE

    # Make sure only the user can view the contents
    chmod 0700 $TEMP_FILE

    # Dump our sensitive data to the temp file
    echo "secretpassword" > $TEMP_FILE
fi
</code>
</pre>
</p>

<p class="single">
Once the script is run, the cracker has 10 seconds to create the temp file before the script does. The timing is rarely as simple as I have made it out to be in this example, but the 10 second gap between checking for the existence of the file and the creation of it illustrates the point. The two lines in <strong>Listing 13</strong> can be entered as a different user. The <code>chmod</code> command in <strong>Listing 12</strong> will fail because the file is owned by a different user, but the script has another flaw in that it doesn&#8217;t check for that error before writing the data. Because of this the sensitive data is written into a file that is easy for the cracker to read. Checking for an error and making sure that the file you want to create doesn&#8217;t already exist and has the correct permissions would go a long way toward making this script more secure.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 13</strong></em></p>
<code>
touch /tmp/predictable_temp
chmod 0777 /tmp/predictable_temp
</code>
</pre>
</p>

<p class="single">
When the 10 second delay expires in my script I get the error <code class="error">chmod: changing permissions of `/tmp/predictable_temp': Operation not permitted</code> just before the data is written to the file. The temp file is accessible to the cracker using the <code>cat</code> command, and an <code class="commandline">ls -l</code> of the temp file shows that it&#8217;s owned by the user name that the cracker used. There are other race condition exploits, but the moral of the story is to not leave gaps between critical sections of your script. <strong>Listing 11</strong> shows a good example of closing the gap between operations. In that case the permissions are set as the directory is created by setting the umask before the call to <code>mkdir</code>. Race conditions are certainly something to keep in mind as you attempt to increase the security of your scripts.
</p>
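
<p class="single">
One way to close the gap in <strong>Listing 12</strong>, sketched here under the assumption that you can&#8217;t simply switch to <code>mktemp</code>, is to let the shell do the existence check and the file creation in a single step. With the <code>noclobber</code> option set, a <code>&gt;</code> redirection fails if the file already exists. The function name <code>create_private_file</code> is my own placeholder.
</p>

```shell
#!/bin/bash -
# Sketch: check-and-create in one step using noclobber.
# create_private_file is a hypothetical helper name.
create_private_file() {
    # Run in a subshell so umask and noclobber do not leak out
    (
        # New files get mode 600 at creation time, no chmod needed
        umask 077
        # With noclobber set, > fails if the file already exists
        set -o noclobber
        : > "$1"
    ) 2>/dev/null
}
```

<p class="single">
A script would then write its data only when the call succeeds, for example <code class="commandline">create_private_file "$TEMP_FILE" || exit 1</code>. If another user has already planted a file with that name, the function returns an error instead of quietly reusing it.
</p>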

<h3>The Shebang Line</h3>
<p class="single">
You may have noticed before that I put a dash and a space (a bare option) at the end of most of my script shebang lines. This is the same as the double dash (<code class="optionsonly">--</code>) option and signals the end of all options for the shell. Any options that are tacked onto the end of the shebang line will be treated as arguments to the shell, and will most likely throw an error. The reason that this is important is that it prevents option spoofing. On some systems if the cracker can get the shebang line to effectively read <code class="commandline">#!/bin/sh -i</code> they will get an interactive shell with the privileges of the script. It&#8217;s important to note that I was not able to get an interactive shell using a script on a Fedora 13 system, even when I entered the shebang line directly as having the <code class="optionsonly">-i</code> option. Even so, you don&#8217;t always know which systems your script will run on, and it only takes a fraction of a second to add the dash (or double dash) at the end of your shebang line. That&#8217;s a very small price to pay for some added security.
</p>

<h3>User Input</h3>
<p class="single">
As I discussed in the <a href="http://www.innovationsts.com/blog/?p=1896">error handling post</a> of this series, user input should be processed cautiously. Even when there is no malicious intent by a user very serious errors can result from incorrect input. At its worst, user input can give a cracker an open door into your system through things like injection attacks. Keeping this in mind, there are a few guidelines that you can follow to help keep user input from bringing your script down.
</p>

<p class="single">
If you can avoid it, don&#8217;t pass user input to the <code>eval</code> command, or pipe the input into a shell binary. This is a script crash or security problem waiting to happen. <strong>Listing 14</strong> shows the wrong way to handle user input when it&#8217;s captured with the <code>read</code> command.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 14</strong></em></p>
<code>
#!/bin/bash - 
#File: badinput.sh

# Get the input from the user
read USR_INPUT

# Don't use eval like this
eval $USR_INPUT

# Don't pipe input to a shell like this
echo $USR_INPUT | sh
</code>
</pre>
</p>

<p class="single">
It&#8217;s probably pretty easy to agree with me that the script in <strong>Listing 14</strong> is a bad idea. The user can type any command string they want (including <code class="commandline">rm -rf /*</code>) and it will be executed with the privileges of the script. Depending on how much the permissions of the script are elevated, this could do a lot of damage. Another scenario that may seem more harmless is the one in <strong>Listing 15</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 15</strong></em></p>
<code>
#!/bin/bash - 

read USR_INPUT

if [ $USR_INPUT == "test" ];then
    echo "You should only see this if you typed test."
fi
</code>
</pre>
</p>

<p class="single">
Everything works fine until a cracker enters the string <code class="commandline">random == random -o random</code> and hits enter. What this effectively does is change the <code>if</code> statement so that it reads <code class="commandline">if [ random == random -o random == "test" ]</code> where the <code>-o</code> is a logical or. It tells the <code>if</code> statement that it is enough for either the first statement or the second statement to be true. Of course the first statement (<code>random == random</code>) is true, so what&#8217;s inside the <code>if</code> statement executes even though the cracker didn&#8217;t type the correct word or phrase. Depending on what&#8217;s inside the <code>if</code> statement, that security hole could range from a minor to major problem. The way to combat this is to quote your variables (i.e. <code>"$USR_INPUT"</code>) so that they are tested as a whole string. In general quoting your variables is a good idea as you&#8217;ll also head off problems with things like spaces that might otherwise cause your script trouble.
</p>
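
<p class="single">
Here is a minimal sketch of the quoted comparison, wrapped in a function so that it&#8217;s easy to try with different inputs. The function name <code>check_word</code> is my own, and I&#8217;ve used the single <code>=</code> form of the string comparison, which behaves the same way here.
</p>

```shell
#!/bin/bash -
# Sketch: the comparison from Listing 15 with the variable quoted.
# check_word is a hypothetical helper name.
check_word() {
    # Quoted, the whole input is compared as a single string,
    # so an embedded -o is never seen as an operator
    if [ "$1" = "test" ]; then
        echo "match"
    else
        echo "no match"
    fi
}
```

<p class="single">
Calling <code class="commandline">check_word 'random == random -o random'</code> reports no match, because the injected string is just an ordinary (and wrong) value once it&#8217;s quoted.
</p>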

<p class="single">
This is an example of an injection attack where the cracker slips some extra information in with the input to trick your script into running unintended code. This is a very common attack &#8220;vector&#8221; for database and web servers where a cracker carefully crafts a request to cause arbitrary code to execute, or to bring down the web/database service. Another script that can be exploited by an injection attack is found in <strong>Listing 16</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 16</strong></em></p>
<code>
#!/bin/bash - 

read USR_INPUT

#This line contains the fatal security flaw
echo ls $USR_INPUT > test.sh
</code>
</pre>
</p>

<p class="single">
This script isn&#8217;t necessarily something that you would do in the real world, but it&#8217;s a simple way to demonstrate this injection attack. What the script does is take a list of directories from the user and then build a script using the <code>ls</code> command to list the contents of the directories. The injection attack comes when a cracker types <code class="commandline">&#038;&#038; rm randomfile</code> and you find that the resulting script (<code>test.sh</code>) contains a line that will delete files (<strong>Listing 17</strong>). The <code class="commandline">&#038;&#038; rm randomfile</code> line could have just as easily been <code class="commandline">&#038;&#038; rm -rf /*</code> if the cracker wanted.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 17</strong></em></p>
<code>
ls &#038;&#038; rm randomfile
</code>
</pre>
</p>

<p class="single">
The <code>&#038;&#038;</code> operator runs the second command in the sequence if the first command runs successfully (without an error). The <code>ls</code> command is not likely to fail by itself as it just lists the contents of the current directory, so the <code>rm</code> command will most likely run and delete files. The method to deal with this type of attack is similar to the previous method of quoting, except that in this case you escape the quotes around the user input to make sure that it is properly contained. <strong>Listing 18</strong> shows the corrected script.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 18</strong></em></p>
<code>
#!/bin/bash - 

read USR_INPUT

#This line uses escaped quotes to enclose the potentially dangerous input
echo ls "\"$USR_INPUT\"" >> test.sh
</code>
</pre>
</p>
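
<p class="single">
Escaped double quotes stop the <code>&#038;&#038;</code> trick, but input that itself contains a double quote or a <code>$(...)</code> sequence can still change the generated script. A stricter variation, offered only as a sketch, is to wrap the input in single quotes and escape any single quotes inside it. The helper names <code>shell_quote</code> and <code>emit_ls_line</code> are my own.
</p>

```shell
#!/bin/bash -
# Sketch: single-quote user input before writing it into a script.
# shell_quote and emit_ls_line are hypothetical helper names.
shell_quote() {
    # Turn each embedded ' into '\'' and wrap the result in single
    # quotes, so the shell expands nothing inside the value
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

emit_ls_line() {
    # The -- keeps input that starts with a dash from being
    # treated as an option by ls
    printf 'ls -- %s\n' "$(shell_quote "$1")"
}
```

<p class="single">
The generated line can be redirected into <code>test.sh</code> just as in <strong>Listing 18</strong>, for example with <code class="commandline">emit_ls_line "$USR_INPUT" &gt; test.sh</code> . BASH users can get a similar effect from the <code>%q</code> format of the built-in <code>printf</code>.
</p>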

<p class="single">
Along with quoting, it&#8217;s a good idea to search user input for unacceptable entries like meta or escape characters. You can search the user input for these undesirable characters and replace them with something harmless to your script like a blank character or underscore. When doing this, it may be easier to search for the characters that are acceptable instead of trying to cover every single character that&#8217;s not acceptable. The set of acceptable characters is almost always smaller, and it&#8217;s hard to anticipate every bad character that might be passed to your script. <strong>Listing 19</strong> shows a simple way of cleaning the input using the <code>tr</code> command.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 19</strong></em></p>
<code>
#!/bin/bash - 
#File: scrubinput.sh

# Grab the user's input
read USR_INPUT

# Remove all characters that aren't alphanumeric or newline
USR_INPUT=$(echo "$USR_INPUT" | tr -cd '[:alnum:]\n')
</code>
</pre>
</p>

<p class="single">
This script takes the user input using the <code>read</code> command as before, but then pipes the value directly into the <code>tr</code> command. The <code>tr</code> command&#8217;s <code class="optionsonly">-c</code> (<code class="optionsonly">--complement</code>) and <code class="optionsonly">-d</code> (<code class="optionsonly">--delete</code>) options are used to cause <code>tr</code> to look for and delete the unmatched characters. So, anything that&#8217;s not an alphanumeric character (via the <code>alnum</code> character class) or a newline character will be deleted. It&#8217;s not hard to adapt the <code>tr</code> statement to your situation, maybe even replacing the characters instead of deleting them. 
</p>
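
<p class="single">
Two small variations on this idea are sketched below. The first rejects bad input outright instead of silently changing it, and the second replaces unacceptable characters with an underscore rather than deleting them. Both function names are my own placeholders.
</p>

```shell
#!/bin/bash -
# Sketch: two ways to enforce a list of acceptable characters.
# is_alnum and scrub_replace are hypothetical helper names.
is_alnum() {
    case "$1" in
        # Empty input, or any character outside the accepted set
        ''|*[!A-Za-z0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

scrub_replace() {
    # -c complements the set, so every character that is not
    # alphanumeric or a newline becomes an underscore
    printf '%s\n' "$1" | tr -c '[:alnum:]\n' '_'
}
```

<p class="single">
For example, <code class="commandline">scrub_replace 'a b;c'</code> prints <code>a_b_c</code>, while <code class="commandline">is_alnum 'a b;c'</code> returns a non-zero status that the script can use to ask the user to try again.
</p>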

<p class="single">
As with the other topics in this post I&#8217;m scratching the surface, but hopefully you can see how important it is to check user input before doing anything with it. The inability of a script or program to handle improper input is a common bug in the software world. Whether the user has malicious intent or not, bad user input is something that you must plan for.
</p>

<h3>SUID, SGID, and Scripts</h3>
<p class="single">
Several of the above scenarios may not cause that much harm on their own because the user running the script has restricted permissions. This can all change with a script that has its SUID and/or SGID bit set though. The SUID and SGID bits show up in the first of four digits in the octal representation of a file&#8217;s permissions. The SUID bit has a value of 4 and the SGID bit has a value of 2. If both bits are set you get a value of 6, which is similar to how normal permission bits can be added. The other place that you normally see the SUID and SGID bits is in the symbolic permission string. There they show up as the character &#8220;s&#8221; in either the user execute permission space, or in the group execution space respectively. For example, if only the SUID bit was set on a script and the file had read/write/execute permissions of <code>755</code>, the full permissions for the script would be <code>4755</code>. The symbolic representation of this would be <code>-rwsr-xr-x</code> .
</p>
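
<p class="single">
If you want to see both representations side by side, a small helper like the one sketched below can be handy. It assumes the GNU version of <code>stat</code>, and the name <code>show_perms</code> is my own.
</p>

```shell
#!/bin/bash -
# Sketch: show a file's permissions in octal and symbolic form.
# Assumes GNU stat; show_perms is a hypothetical helper name.
show_perms() {
    # %a is the octal form, %A is the symbolic permission string
    stat -c '%a %A' "$1"
}
```

<p class="single">
On a scratch file that has had <code class="commandline">chmod 4755</code> applied to it, <code class="commandline">show_perms FILENAME</code> prints <code>4755 -rwsr-xr-x</code> .
</p>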

<p class="single">
When the SUID bit is set on an executable, the file is run using the privileges of the file&#8217;s owner. In the same way if the SGID bit is set on an executable, it will be run with the rights of the file&#8217;s group. Typically a command/script executes using the real user ID (and rights), but when the SUID or SGID bits are set the script executes with the effective user ID of the file owner instead. A common use is to have the SUID bit set on a file that is owned by root so that a user can access files and resources that they normally wouldn&#8217;t have access to. The <code>passwd</code> command is a good example of this. In order to change a user&#8217;s password, <code>passwd</code> has to access protected files such as <code>/etc/passwd</code> and <code>/etc/shadow</code> . If a normal user is running the <code>passwd</code> command, they would need elevated privileges to change the files since they are only writable by root (and <code>/etc/shadow</code> is typically not readable by normal users either). This is very handy, and as you&#8217;ve seen, sometimes required on Linux systems but is something that you should avoid doing with your scripts whenever possible. The problem with an SUID root script is that if a cracker compromises that script, they have superuser privileges that could be used to run commands like <code class="commandline">rm -rf /*</code> . As a programmer and/or system administrator, you need to guard against the tendency to take the easiest route to a solution rather than the most secure one. All too many admins will set a script to be SUID root when with some thought the script could have been designed to run without superuser privileges. With that said, you may run into situations where you have to use SUID and SGID. Just make sure that it&#8217;s a true &#8220;have to&#8221; situation. Always follow the Rule of Least Privilege which says that you should never give a user or a program any more rights than you have to.
</p>

<p class="single">
If you really need to use the SUID and SGID bits, you can set them with the <code class="commandline">chmod u+s FILENAME</code> and <code class="commandline">chmod g+s FILENAME</code> command lines respectively. Keep in mind that there are Linux distributions and Unix variants that do not honor the SUID bit when it is set on a script. You&#8217;ll need to check the documentation for your Linux distribution to be sure that setting the SUID bit will work.
</p>

<p class="single">
You can use the <code>find</code> command to search for files on your system with the SUID and SGID bits set. You can use this as a security auditing tool to search for SUID/SGID scripts that look out of place. <strong>Listing 20</strong> shows a quick and simple way to search for out of place SUID/SGID shell scripts that are on your system.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 20</strong></em></p>
<code>
$ sudo find / -type f -user root -perm /4000 2> /dev/null | xargs file | grep "shell script"
/usr/bin/malscript.sh:                                                   setuid Bourne-Again shell script text executable
</code>
</pre>
</p>

<p class="single">
Let&#8217;s take the command line from <strong>Listing 20</strong> one step at a time. The first section is the actual <code>find</code> command (<code class="commandline">find / -type f -user root -perm /4000</code>). The <code>find</code> command searches for a file of type regular file (<code class="optionsonly">-type f</code>) and not a directory, it checks to make sure that the file is owned by root (<code class="optionsonly">-user root</code>), and that it has the SUID bit set (<code class="optionsonly">-perm /4000</code>). The next short section of <code class="commandline">2> /dev/null</code> redirects any errors to the null device so that they are thrown away. This effectively suppresses errors resulting from <code>find</code> trying to access things like Gnome&#8217;s virtual file system. The <code>file</code> command deciphers which type of file is being looked at. This command is not perfect, but will work for a quick and dirty security audit. The <code>file</code> command needs the file names as arguments rather than on standard input, so I use the <code>xargs</code> command to take the lines of output from <code>find</code> and pass them to <code>file</code> as arguments. Be aware that file names containing spaces will confuse this simple pipeline. I could have also used the <code class="optionsonly">-exec</code> option of <code>find</code> in the following way: <code class="commandline">-exec file '{}' \;</code> . The command line up to this point gives me output telling what each type of file is, but I really only care about shell scripts. That&#8217;s where the <code>grep</code> statement comes in. I use <code>grep</code> to keep only the lines that mention a &#8220;shell script&#8221;. 
</p>
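
<p class="single">
If the <code>file</code> command isn&#8217;t available, or you want something that copes with spaces in file names, a variation along the following lines may help. It&#8217;s only a sketch: it treats any SUID file that starts with <code>#!</code> as a script, it uses <code class="optionsonly">-perm -4000</code> (which matches the same single bit as <code class="optionsonly">-perm /4000</code>), and it leaves out the <code class="optionsonly">-user root</code> test so that it can be tried on any directory. The function name <code>find_suid_scripts</code> is my own.
</p>

```shell
#!/bin/bash -
# Sketch: list SUID files that begin with a shebang line.
# find_suid_scripts is a hypothetical helper name.
find_suid_scripts() {
    # $1 is the directory to search
    find "$1" -type f -perm -4000 -exec sh -c '
        for f in "$@"; do
            # A script starts with the two characters "#!"
            if [ "$(head -c 2 "$f" 2>/dev/null)" = "#!" ]; then
                printf "%s\n" "$f"
            fi
        done
    ' sh {} + 2>/dev/null
}
```

<p class="single">
For a real audit you would run it as root against <code>/</code> and add <code class="optionsonly">-user root</code> back into the <code>find</code> expression.
</p>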

<p class="single">
As you can see in the output of the command line, there is a suspicious file called <code>malscript.sh</code> in <code>/usr/bin</code> . Searching in this way made a file that normally would be overlooked stand out by itself. In this case I created that script and put it in <code>/usr/bin</code> myself so that I would have something to find, but it simulates something that you might find in the field. You could just as easily have searched for SGID scripts (<code class="optionsonly">-perm /2000</code>), SUID/SGID combo scripts (<code class="optionsonly">-perm /6000</code>), SUID root binaries, and much more. Be aware that if the owner execution bit is not set on a directory then it is not searchable. This would cause the <code>find</code> command to skip over the directory, possibly causing you to miss a suspicious file.
</p>

<p class="single">
The SUID root mechanism can be especially dangerous if a cracker manages to make a copy of a shell binary and sets it to be SUID root. Some shells such as BASH will automatically relinquish their privileges if they&#8217;re being run this way. Keep an eye out for extra copies of shell binaries that are set SUID, as they could be part of an attack by a cracker. The shell binary could have been copied and modified using several of the security flaws that we&#8217;ve talked about above. You could use the script in <strong>Listing 20</strong> to help you search for SUID root copies of shell binaries.
</p>

<p class="single">
When running scripts manually as a system administrator, you should run scripts with temporary elevated privileges through a mechanism like <code>sudo</code> whenever possible, rather than setting a script to be SUID root. Even with <code>sudo</code> though you still need to make sure your script is as secure as possible because <code>sudo</code> is still granting your script root privileges, and it doesn&#8217;t take much time to do a lot of damage. Item #16 in the <a href="#resources">Resources</a> section touches on many of the security aspects that we&#8217;ve talked about here from the perspective of proper <code>sudo</code> usage.
</p>

<p class="single">
In some cases a user may install or use your script improperly, running it as SUID root or with <code>sudo</code>. If you never want your script run as root, you could use the <code>id</code> command along with some text manipulation to warn the user and then exit. The script in <strong>Listing 21</strong> shows one way of doing this.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 21</strong></em></p>
<code>
#!/bin/bash - 
#File: droproot.sh

# Check to see if we're running as root via sudo
if [ $(/usr/bin/id -ur) -eq 0 ];then
    echo "This script cannot be run with sudo"
    exit 1
fi

# Get the listing on this script
INFO=$(ls -l $0)
# Grab the permission at the SUID position
PERM=$(echo "$INFO" | cut -d " " -f 1 | cut -c 4)
# Grab the owner
OWNER=$(echo "$INFO" | cut -d " " -f 3)

# Check for the SUID bit and the owner of root
if [ "$PERM" == "s" -a "$OWNER" == "root" ];then
    echo "This script cannot be run as SUID root"
    exit 1
fi
</code>
</pre>
</p>

<p class="single">
The script uses the <code>id</code> command to check the real user ID of the user, and if it&#8217;s 0 (root) then the script warns the user that the script is not supposed to be run with <code>sudo</code> or as root and exits. To check for the SUID root condition, I&#8217;ve taken a slightly more complicated route. I run the command line <code class="commandline">ls -l $0</code> which gives me a long listing for the script name (represented by <code>$0</code>) showing the symbolic permission string and the owner. I then extract the character in the permission string that would represent the SUID bit as an &#8220;s&#8221; if present so that I can check it. This is done with the <code class="commandline">cut -c 4</code> command line which extracts the fourth character. Once I have the SUID bit and the user, I just use an <code>if</code> statement to check to see if both the SUID bit is set and that the script is owned by root. If both of those conditions are true, I warn the user that the script can&#8217;t be SUID root and exit.
</p>
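
<p class="single">
The same check can be written without parsing the output of <code>ls</code>. The sketch below uses the <code>-u</code> file test, which is true when the SUID bit is set, and compares the numeric owner ID reported by GNU <code>stat</code>. The function name <code>is_suid_owned_by_uid</code> is my own, and the reliance on GNU <code>stat</code> is an assumption about your system.
</p>

```shell
#!/bin/bash -
# Sketch: SUID and owner check without parsing ls output.
# Assumes GNU stat; is_suid_owned_by_uid is a hypothetical name.
is_suid_owned_by_uid() {
    # $1 is the file to check, $2 is the numeric user ID (0 for root)
    # -u is true when the file has its SUID bit set
    [ -u "$1" ] && [ "$(stat -c %u "$1")" -eq "$2" ]
}
```

<p class="single">
Inside a script the call would look something like <code class="commandline">is_suid_owned_by_uid "$0" 0</code>, followed by the same warning and <code>exit 1</code> used in <strong>Listing 21</strong> when it returns true.
</p>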

<p class="single">
One of the nice things about the BASH shell is that if it detects that it has been run under the SUID root condition, it will automatically drop its superuser privileges. This is nice because even if an attacker is able to make a copy of the <code>bash</code> binary and set it as SUID root, it will not allow them to gain additional access to the system. Unfortunately, most crackers are going to know this and will try to make a copy of another shell like <code>sh</code> that doesn&#8217;t have this feature.
</p>

<p class="single">
The last thing that I&#8217;ll mention about SUID root scripts is that I have seen it suggested by several system administrators that you should use Perl or C whenever you must use SUID root. There have been arguments for and against using Perl or C in place of shell scripting, and ultimately you must decide which you feel safer with. I&#8217;m not going to argue the point, but I will say that if you use unsafe practices when writing your Perl scripts or C programs, you&#8217;re going to end up no better off anyway. Take your time and make sure the code you write is as secure as you can make it. This is a rule to live by no matter what language you&#8217;re using.
</p>

<h3>Storing Sensitive Data In Scripts</h3>
<p class="single">
This is just a bad idea, so do your best to avoid it. If you store passwords in a script they&#8217;re just waiting to be found. Even if you set the permissions to 0700, the passwords will still be compromised if a cracker compromises your account. There&#8217;s also the risk that you might accidentally send the script to another user, and forget to scrub the passwords from it.
</p>

<p class="single">
You should also not echo passwords as a user types them. Shoulder surfers could see the password as the user enters it if you have the shell set to echo user input. To avoid this in your script, you can use <code class="commandline">stty -echo</code> as I have in the very simple example in <strong>Listing 22</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 22</strong></em></p>
<code>
#!/bin/bash - 

# Turn echoing off
stty -echo

# Read the password from the user
echo "Please enter a password: "
read PASSWD

# Turn echoing back on
stty echo
</code>
</pre>
</p>

<p class="single">
Notice that only what the user types is suppressed and not the output from the <code>echo</code> command itself. This of course doesn&#8217;t protect the user from somebody watching what their fingers press on the keyboard, but there&#8217;s nothing that you as a programmer can do about that.
</p>
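
<p class="single">
One weakness of turning echo off by hand is that the terminal stays silent if the script is interrupted before echo is turned back on. The sketch below saves the terminal settings first and restores them from a trap. It also skips the <code>stty</code> calls when the input isn&#8217;t coming from a terminal. The function name <code>read_password</code> and the <code>SAVED_TTY</code> variable are my own placeholders.
</p>

```shell
#!/bin/bash -
# Sketch: read a password without echo and always restore the terminal.
# read_password and SAVED_TTY are hypothetical names; the password
# is left in the PASSWD variable as in Listing 22.
read_password() {
    if [ -t 0 ]; then
        # Remember the current settings and restore them even if
        # the script is interrupted while echo is off
        SAVED_TTY=$(stty -g)
        trap 'stty "$SAVED_TTY"; exit 1' INT TERM
        stty -echo
    fi

    # Send the prompt to stderr so it is not captured with the output
    printf 'Please enter a password: ' >&2
    IFS= read -r PASSWD

    if [ -t 0 ]; then
        stty "$SAVED_TTY"
        trap - INT TERM
        # The Enter key was not echoed, so finish the line
        echo >&2
    fi
}
```

<p class="single">
BASH users can also look at the <code class="optionsonly">-s</code> option of the built-in <code>read</code> command, which suppresses echo for a single read.
</p>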

<p class="single">
If you do end up storing passwords in your script or in files on your system, it would be a good idea to store a hash of the password rather than the password itself. You can hash passwords using the <code>md5sum</code> or sha*sum commands, though MD5 is now considered weak and a SHA-2 command such as <code>sha512sum</code> is the better choice. You can pipe the password string straight into the command as with the line <code class="commandline">echo "secretpassword" | sha512sum</code> . I would suggest writing a script that takes the password without echoing the input and converts it into a hash. A hash is one-way, so the stored value is never decrypted; you just hash the password given by the user and compare that to the stored password hash. That way the password is not out in clear text for a cracker to find. Granted, it&#8217;s still possible to crack a hash by guessing, but remember that no system is bulletproof and the goal is to make the cracker&#8217;s life as difficult as possible. 
</p>
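
<p class="single">
The compare-the-hashes idea can be sketched in a few lines. This version uses <code>sha512sum</code> and, to keep the example short, leaves out refinements such as salting that a production system would want. The function names <code>hash_password</code> and <code>password_matches</code> are my own.
</p>

```shell
#!/bin/bash -
# Sketch: store and compare a password hash instead of the password.
# hash_password and password_matches are hypothetical helper names.
hash_password() {
    # printf avoids the trailing newline that echo would add,
    # and cut keeps only the hash from the sha512sum output
    printf '%s' "$1" | sha512sum | cut -d ' ' -f 1
}

password_matches() {
    # $1 is the password the user typed, $2 is the stored hash
    [ "$(hash_password "$1")" = "$2" ]
}
```

<p class="single">
The stored value would be generated once with <code class="commandline">hash_password "secretpassword"</code> and saved, and the script would later call <code class="commandline">password_matches "$PASSWD" "$STORED_HASH"</code> to check what the user typed.
</p>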

<p class="single">
One habit that you should encourage with your users (and any system admins under you) is picking long and complex passwords. To ease the strain of having to remember a convoluted password, have users build passwords based on first letters and punctuation from a random phrase. For instance, the phrase &#8220;This is 1 fairly strong password, don&#8217;t you think Jeremy?&#8221; would reduce to &#8220;Ti1fsp,dytJ?&#8221;. The specific phrase doesn&#8217;t matter, but it should include a mix of numbers, letters (upper and lowercase), and symbols to be the most secure. Make sure that all of the symbols being used are acceptable for the system you&#8217;re choosing the password for though.
</p>

<h3>The shc Utility</h3>
<p class="single">
The <code>shc</code> utility compiles a script in order to make it harder for a cracker to read its contents. This is especially useful if you find that you have to store passwords or other sensitive information inside of a script. Take note that I said &#8220;harder&#8221; and not &#8220;impossible&#8221; for a cracker to read. It&#8217;s been shown that <code>shc</code> compiled scripts can be reverse engineered to gain access to the contents. Remember that you should strive to make sure that your protection mechanisms are multi-layered. If you use <code>shc</code> to compile a script with passwords in it, hash the passwords with the <code>md5sum</code> or sha*sum commands, and set the access permissions to be as restrictive as possible. That way you&#8217;re not just relying on <code>shc</code> to keep your data safe. Some of the options for the <code>shc</code> utility are shown in <strong>Listing 23</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 23</strong></em></p>
<code>
-e date            The date after which the script will refuse to run (dd/mm/yyyy)
-f script_name     The file name of the script to compile
-m message         The message that will be displayed after the expiration date
-T                 Allow the binary form of the script to be traceable
-v                 Verbose output
</code>
</pre>
</p>

<p class="single">
Using these options I compiled a sample script via the command line in <strong>Listing 24</strong>, looked at what files were created, and then tried to run the resulting binary. The version of <code>shc</code> that I used was 3.8.7 which I compiled from source. I then copied the <code>shc</code> binary to my <code>~/bin</code> directory so that I could run it more conveniently.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 24</strong></em></p>
<code>
$ shc -e 08/09/2010 -m "Please contact your administrator" -v -f test.sh 
shc shll=bash
shc [-i]=-c
shc [-x]=exec '%s' "$@"
shc [-l]=
shc opts=- : No real one. Removing opts
shc opts=
shc: cc  test.sh.x.c -o test.sh.x
shc: strip test.sh.x
shc: chmod go-r test.sh.x
$ ls
test.sh  test.sh.x  test.sh.x.c
$ ./test.sh.x 
./test.sh.x: has expired!
Please contact your administrator
</code>
</pre>
</p>

<p class="single">
You can see in <strong>Listing 24</strong> that I&#8217;ve set an expiration date of September 8th, 2010, which is earlier than the date that I&#8217;m writing this. I supply the expiration message of &#8220;Please contact your administrator&#8221;, I ask <code>shc</code> for verbose output, and then I give it the script that I want it to compile (<code>test.sh</code>). When I list the files in the directory I see <code>test.sh</code>, <code>test.sh.x</code>, and <code>test.sh.x.c</code> . <code>test.sh.x</code> is the compiled binary that <code>shc</code> creates from my original script. <code>test.sh.x.c</code> is the C source code that is generated for <code>test.sh</code> . Be careful to keep this file in a safe place as it gives critical information that will compromise your compiled script. In <strong>Listing 24</strong> I get an error when I try to run the compiled script (<code>test.sh.x</code>), but this is expected as I used an expiration date in the past. I did this just to show you how the compiled script would react once the expiration date has passed. You don&#8217;t have to specify the expiration date, but it can be handy if you only want to give a user access to a script&#8217;s capabilities for a few days or weeks.
</p>

<p class="single">
Overall <code>shc</code> is a nice tool to have at your disposal, but as I mentioned above don&#8217;t count on it for foolproof protection. The Linux Journal article in the <a href="#resources">Resources</a> section (#5) talks about how <code>shc</code> compiled scripts can be cracked. Additional features have been added to newer versions of <code>shc</code>, such as the removal of group and other read permissions by default, to make the compiled scripts harder to get at. Even so, make sure that you have multiple layers of security surrounding your scripts as we&#8217;ve talked about earlier.
</p>

<p><a name="howto"></a></p>
<h3>How-To</h3>
<p class="single">
At this point, let&#8217;s take what we&#8217;ve discussed so far and apply it to the script in <strong>Listing 1</strong>. I&#8217;ve already removed the current directory from the <code>PATH</code> variable, and made sure that we start off with a clean path by resetting the variable as in <strong>Listing 3</strong>. <strong>Listing 25</strong> shows the script that we&#8217;ll be starting with.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 25</strong></em></p>
<code>
#!/bin/bash
# A SUID root script that demonstrates various security problems

# Save the current path variable to restore it later
OLDPATH=${PATH}

# Set a minimal path for our script to use
PATH=/bin:/usr/bin

#Count the number of lines
ls | wc -l

# Get user input
read USR_INPUT

# Check to see if the user supplied the right password
if [ $USR_INPUT == "mypassword" ];then
    echo "User input was $USR_INPUT and should have matched the string 'mypassword'"
fi

# Create a temp file
touch /tmp/mytempfile

# Set the temp file so that only we can read/write the contents
chmod 0700 /tmp/mytempfile

# Save the password that the user supplied to the temp file
echo $USR_INPUT > /tmp/mytempfile

# Reset the PATH variable to its original value
PATH="$OLDPATH"
</code>
</pre>
</p>

<p class="single">
Now that we have a minimal and known <code>PATH</code> variable set, we can feel a little better about running the <code class="commandline">ls | wc -l</code> command line. As stated before, we could use absolute paths for each command but that could lead to a portability issue on some systems where the binaries are stored in different locations.
</p>

<p class="single">
The next step is to deal with the user input. I&#8217;m first going to put quotes around the variable to help ensure that it&#8217;s treated as a string, and not a part of the statement. Also, just after the <code>read</code> line I&#8217;m going to scrub the input to make sure there aren&#8217;t any inappropriate characters contained within it. <strong>Listing 26</strong> shows the script with these changes.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 26</strong></em></p>
<code>
#!/bin/bash
# A SUID root script that demonstrates various security problems

# Save the current path variable to restore it later
OLDPATH=${PATH}

# Set a minimal path for our script to use
PATH=/bin:/usr/bin

#Count the number of lines
ls | wc -l

# Get user input
read USR_INPUT

# Remove all characters that aren't alphanumeric or newline
USR_INPUT=$(echo "$USR_INPUT" | tr -cd '[:alnum:]\n')

# Check to see if the user supplied the right password
if [ "$USR_INPUT" == "mypassword" ];then
    echo "User input was $USR_INPUT and should have matched the string 'mypassword'"
fi

# Create a temp file
touch /tmp/mytempfile

# Set the temp file so that only we can read/write the contents
chmod 0700 /tmp/mytempfile

# Save the password that the user supplied to the temp file
echo "$USR_INPUT" > /tmp/mytempfile

# Reset the PATH variable to its original value
PATH="$OLDPATH"
</code>
</pre>
</p>

<p class="single">
The section of code that scrubs the user input is taken from <strong>Listing 19</strong>, and a full explanation of the process can be found in the paragraphs following that listing. In short, the user input is echoed into the <code>tr</code> command so that all characters except alphanumeric and newline characters are deleted.
</p>
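<p class="single">
As a quick check of the scrubbing step described above, the following sketch runs a sample string through the same kind of <code>tr</code> filter. The <code>scrub</code> function name and the sample input are made up for illustration. Note that this sketch writes the set as <code>'[:alnum:]\n'</code>, since <code>tr</code> treats an extra outer pair of square brackets as literal characters to keep.
</p>

```shell
#!/bin/bash
# scrub is a hypothetical helper name used only for this illustration.
# It deletes (-d) every character that is NOT (-c) alphanumeric or a newline.
scrub() {
    printf '%s\n' "$1" | tr -cd '[:alnum:]\n'
}

# Shell metacharacters, spaces, and punctuation are all removed
CLEAN=$(scrub '?blog; rm -rf /')
echo "$CLEAN"
```

<p class="single">
Keep in mind that a filter this strict also removes spaces and punctuation, so it only suits input such as simple names or tokens.
</p>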

<p class="single">
Of course as I mentioned above, you wouldn&#8217;t want to store any password information in a script unless you have to. If it becomes necessary to store a password inside a script, it&#8217;s best to store only a hash of the password, generated with a command like <code>md5sum</code>, rather than the password itself. Strictly speaking this is one-way hashing rather than encryption, and MD5 is a weak choice by modern standards, so treat it as an obstacle rather than real protection. Think about this decision carefully because there is almost always a way to avoid storing a password inside of a script. For the purpose of this example, I&#8217;ve decided to leave the password in the file and use <code>md5sum</code> to hash it. <strong>Listing 27</strong> shows the results of adding the password hash.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 27</strong></em></p>
<code>
#!/bin/bash
# A SUID root script that demonstrates various security problems

# Save the current path variable to restore it later
OLDPATH=${PATH}

# Set a minimal path for our script to use
PATH=/bin:/usr/bin

#Count the number of lines
ls | wc -l

# Make sure that nobody can see the password as it's entered
stty -echo

# Get user input
read USR_INPUT

# Re-enable echoing of typed input
stty echo

# Remove all characters that aren't alphanumeric or newline
USR_INPUT=$(echo "$USR_INPUT" | tr -cd '[:alnum:]\n')

# Check to see if the user supplied the right password by comparing MD5 hashes
if [ "$(echo "$USR_INPUT" | md5sum | cut -d " " -f 1)" == "d84c7934a7a786d26da3d34d5f7c6c86" ];then
    # Don't echo the user's password, just tell them it worked
    echo "Password Accepted."
fi

# Create a temp file
touch /tmp/mytempfile

# Set the temp file so that only we can read/write the contents
chmod 0700 /tmp/mytempfile

# Save the password that the user supplied to the temp file
echo "$USR_INPUT" > /tmp/mytempfile

# Reset the PATH variable to its original value
PATH="$OLDPATH"
</code>
</pre>
</p>
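<p class="single">
The stored hash in a check like the one above has to come from somewhere. The sketch below shows one way to generate it and compare against it, assuming GNU <code>md5sum</code> is available. The function names <code>hash_input</code> and <code>check_password</code> are hypothetical. The trailing newline is included in the hashed data on purpose, because <code>echo</code> adds one in the listing as well.
</p>

```shell
#!/bin/bash
# hash_input and check_password are hypothetical names for this sketch.

# Print the MD5 hash of the first argument plus a trailing newline,
# which is the same data that echo "$USR_INPUT" | md5sum hashes
hash_input() {
    printf '%s\n' "$1" | md5sum | cut -d ' ' -f 1
}

# Generate the value once, then paste it into the script in place of
# the plain-text password (here it is simply kept in a variable)
STORED_HASH=$(hash_input "mypassword")

# Succeed only when the hash of the supplied input matches the stored hash
check_password() {
    [ "$(hash_input "$1")" = "$STORED_HASH" ]
}

if check_password "mypassword"; then
    echo "Password Accepted."
fi
```

<p class="single">
Because only the hash is compared, the plain-text password never has to appear in the script itself.
</p>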

<p class="single">
Next, we start getting into the temporary file section of the script. I had created a function for this in the last blog post, but we&#8217;ll write the function from scratch here applying what we&#8217;ve learned so far. <strong>Listing 28</strong> shows the new function and its implementation within the script.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 28</strong></em></p>
<code>
#!/bin/bash
# A SUID root script that demonstrates various security problems

# Create the array that will keep the list of temp files
TEMPFILES=( )

# Function to create "safe" temporary files.
function create_temp {
    # Give preference to user tmp directory for security
    if [ -d "$HOME/tmp" ]
    then
        TEMP_DIR="$HOME/tmp"
    else
        TEMP_DIR="/tmp"
    fi

    # Construct a "safe" temp file using mktemp
    TEMP_FILE=$(mktemp --tmpdir="$TEMP_DIR" XXXXXXXXXX)

    # Keep the file in an array to remove it later
    TEMPFILES+=( "$TEMP_FILE" )   
}

# Save the current path variable to restore it later
OLDPATH=${PATH}

# Set a minimal path for our script to use
PATH=/bin:/usr/bin

#Count the number of lines
ls | wc -l

# Make sure that nobody can see the password as it's entered
stty -echo

# Get user input
read USR_INPUT

# Re-enable echoing of typed input
stty echo

# Remove all characters that aren't alphanumeric or newline
USR_INPUT=$(echo "$USR_INPUT" | tr -cd '[:alnum:]\n')

# Check to see if the user supplied the right password by comparing MD5 hashes
if [ "$(echo "$USR_INPUT" | md5sum | cut -d " " -f 1)" == "d84c7934a7a786d26da3d34d5f7c6c86" ];then
    # Don't echo the user's password, just tell them it worked
    echo "Password Accepted."
fi

# Call the function that will create a "safe" temp file for us
create_temp

# Make sure that the temp file/name was added to the array
echo "${TEMPFILES[0]}"

# Reset the PATH variable to its original value
PATH="$OLDPATH"
</code>
</pre>
</p>

<p class="single">
Within the <code>create_temp</code> function, I use the <code>TEMPFILES</code> array to hold the paths of the temporary files that I create so that I can remove them when the script is finished. Normally I would add a trap to handle the removal, which I talked about in the <a href="http://www.innovationsts.com/blog/?p=1896">last blog post</a> on error handling, but I left the trap out of <strong>Listing 28</strong> to keep the example a little bit shorter. When the <code>create_temp</code> function is called, it first checks to see if the user has their own <code>tmp</code> directory. If they do, it is used in preference to the main <code>/tmp</code> directory, because <code>/tmp</code> is world writable. Once the directory has been selected, it is passed to the <code>mktemp</code> command using the <code class="optionsonly">--tmpdir</code> option. <code>mktemp</code> creates the temp file, and the pathname of the new file is stored in the <code>TEMP_FILE</code> variable. According to our error handling knowledge, I should check that the temp file was created without errors, but I&#8217;ve left this check out to keep the script more streamlined. In your own use of this code you&#8217;ll want to apply the error handling techniques that we talked about in the <a href="http://www.innovationsts.com/blog/?p=1896">last post</a>. The path stored in the variable is then added to the <code>TEMPFILES</code> array to be dealt with later. Once that&#8217;s done, the temp file is ready for use. Normally you would redirect data into the temp file, but here I just echo its path instead.
</p>
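<p class="single">
For reference, here is a minimal sketch of what the temp file function might look like with the error check and the trap put back in. It assumes GNU <code>mktemp</code> (for the <code class="optionsonly">--tmpdir</code> option). The writability test on the user&#8217;s <code>tmp</code> directory, the local variable names, and the error message are my own additions for illustration.
</p>

```shell
#!/bin/bash
# Sketch only: create_temp with an error check and a cleanup trap added.

# Array that keeps the list of temp files
TEMPFILES=( )

# Remove every temp file that was recorded in the array
clean_up() {
    local tmp_file
    for tmp_file in "${TEMPFILES[@]}"; do
        if [ -e "$tmp_file" ]; then
            rm -f -- "$tmp_file"
        fi
    done
}

# Run the cleanup whenever the script exits
trap clean_up EXIT

create_temp() {
    local temp_dir="/tmp"

    # Prefer the user's own tmp directory when it exists and is writable
    if [ -d "$HOME/tmp" ] && [ -w "$HOME/tmp" ]; then
        temp_dir="$HOME/tmp"
    fi

    # mktemp returns non-zero on failure, so check it right away
    if ! TEMP_FILE=$(mktemp --tmpdir="$temp_dir" XXXXXXXXXX); then
        echo "Could not create a temp file in $temp_dir" >&2
        return 1
    fi

    # Keep the file in the array so that clean_up can remove it later
    TEMPFILES+=( "$TEMP_FILE" )
}

# Stop if the temp file could not be created, otherwise use it
create_temp || exit 1
echo "some data" > "$TEMP_FILE"
```

<p class="single">
Returning a non-zero status from the function, rather than exiting inside it, leaves the decision about how to react to the calling script.
</p>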

<p class="single">
The last thing that I do is restore the <code>PATH</code> variable using the value saved in the <code>OLDPATH</code> variable. This undoes the change that we made at the beginning of the script, which helped us run system commands more safely.
</p>

<p class="single">
There are still improvements that can be made to this script based on what has been discussed in previous posts. Please add your ideas about the script in the comments on this post.
</p>

<h3>Tips and Tricks</h3>
<ul>
<li>Never copy commands or code from forums, blogs, and the like without checking them to make sure they&#8217;re safe. <a href="#resources">Resource</a> #2 has a list of malicious commands that have been given out by problem users in the Ubuntu Forums. Your best defense is to review the commands/code thoroughly yourself, or find someone who can review it for you before you execute it. You can also post the code to other forums and ask the users there if it&#8217;s safe.</li>
<li>Always be suspicious of external inputs to your script whether they be variables, user input, or anything else. We talked about validating user input in the <a href="http://www.innovationsts.com/blog/?p=1896">last post</a> on error handling as well as this one. It&#8217;s important to remember that incorrect input is not always the doing of a cracker. Many times users make honest mistakes, and your script needs to be able to handle that eventuality.</li>
<li>Make sure that your script is writable only by the owner. That makes direct code injection attacks harder for a cracker to accomplish.</li>
<li>Use the <code>cd</code> command to change into a trusted directory when your script starts. This way you have a known starting point.</li>
<li>When you&#8217;re writing a script, always assume that it will be installed and run incorrectly. If it&#8217;s designed to be in a directory that&#8217;s only readable/writable by the owner, and it holds sensitive information, assume that it&#8217;s going to be placed in a world writable directory with full permissions for everyone. Don&#8217;t hard code an installation directory into your script unless you have to.</li>
<li>Don&#8217;t assume that your script is always going to be run as a regular user, or just as the super user. You need to understand what your script will do when run by unprivileged and privileged users.</li>
<li>Attempt to keep your scripts and files out of world writable directories like <code>/tmp</code> as much as possible.</li>
<li>Don&#8217;t give users access to programs with shell escapes (like <code>vi</code> and <code>vim</code>) from your scripts, especially when elevated privileges are involved.</li>
<li>Do not rely only on one security technique to protect your script and your users. Putting all your faith in a method like &#8220;security through obscurity&#8221; (such as password encryption) while ignoring all of the other security tools in your box is asking for trouble. Some security methods can give you a false sense of security, and you need to be vigilant. Remember, try to make the cracker&#8217;s life as difficult as you possibly can. This involves a multi-tiered script security strategy.</li>
<li>Use secure versions of commands in your scripts whenever possible. For instance, use <code>ssh</code> and <code>scp</code> instead of <code>telnet</code> and <code>rcp</code>, or the <code>slocate</code> command rather than the <code>locate</code> command. The man page for the base command will sometimes point you toward the more secure versions.</li>
<li>Have other coders look over your script to check it for problems and security holes. You can even post your script to various forums and ask them to try to break it for you.</li>
<li>Make sure that any startup and configuration scripts that you add to your system are as secure and bug free as possible. Don&#8217;t add a script to the system&#8217;s init or Upstart mechanism without testing it thoroughly.</li>
<li>When using information like passwords within your script, try not to store the information within environment variables. Instead use pipes and redirection. The data will be harder for a cracker to access.</li>
<li>When creating and running scripts you should follow the Rule Of Least Privilege by only giving the minimal set of privileges that the script needs to do its job. Also, make sure that you&#8217;ve designed the script well so that it doesn&#8217;t need elevated privileges unnecessarily. For instance, if a script works well with ownership of <code>nobody</code> and a permission string of 0700, don&#8217;t set the script to be owned by root and have permissions of 4777.</li>
<li>In the appropriate context, use options for commands that tend to enhance security and resistance to bad input. For instance, the <code>find</code> command has an option <code class="optionsonly">-print0</code> that causes the output to be null terminated instead of newline terminated. The <code>xargs</code> command has a similar option (<code class="optionsonly">-0</code>). These options can help ensure that input containing things like newlines won&#8217;t break your script. This requires extra study of what can go wrong with your script, and how to use the available commands to avoid anything going wrong.</li>
<li>If you have scripts shared via something like a download repository, consider giving your users MD5 and/or SHA-1 checksum values so that they can check the integrity of a script they download. If you&#8217;re emailing a script, you might want to use GPG so that you can ensure that the contents of the script have not been tampered with, and that a third party cannot read the contents of the script in transit.</li>
</ul>
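<p class="single">
To illustrate the tip about <code class="optionsonly">-print0</code> and <code class="optionsonly">-0</code>, here is a small sketch that works in a scratch directory. The directory and file names are made up for the demonstration, and <code>mktemp -d</code> is assumed to be available.
</p>

```shell
#!/bin/bash
# Sketch only: the directory and file names are made up for the demonstration.

# Work in a private scratch directory that is removed on exit
WORK_DIR=$(mktemp -d) || exit 1
trap 'rm -rf -- "$WORK_DIR"' EXIT

# Create three files, one of which has a newline in its name
touch "$WORK_DIR/plain.txt" "$WORK_DIR/with space.txt" "$WORK_DIR/new"$'\n'"line.txt"

# Newline-terminated output makes three files look like four entries
NAIVE_COUNT=$(find "$WORK_DIR" -type f | wc -l)

# Null-terminated output hands each file name to rm intact
find "$WORK_DIR" -type f -print0 | xargs -0 rm -f --

# Count what is left to confirm that every file was removed
REMAINING=$(find "$WORK_DIR" -type f | wc -l)
echo "naive count: $NAIVE_COUNT, files left after removal: $REMAINING"
```

<p class="single">
The same pairing works with any command that <code>xargs</code> runs, not just <code>rm</code>.
</p>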

<h3>Scripting</h3>
<p class="single">
These scripts are somewhat simplified and in most cases could be done other ways too, but they will work to illustrate the concepts. If you use these scripts, make sure you adapt them to your situation. Never run a script or command without understanding what it will do to your system.
</p>

<p class="single">
This first script (<strong>Listing 29</strong>) is a compilation of the shell script code that I&#8217;ve demonstrated throughout this post. The code has been organized into functions and placed in a separate script that can be sourced to add security-specific code to your own scripts. Keep in mind though that the functions in this script don&#8217;t give you comprehensive coverage. Once again, we&#8217;re barely scratching the surface.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 29</strong></em></p>
<code>
#!/bin/bash - 
# File: security_src.sh
# Script that you can source to add a few security features to 
# your own scripts.

#Variables to store the old values of the IFS and PATH variables
OLD_IFS=""
OLD_PATH=""

# Function to create "safe" temporary files.
function create_temp {
    # Give preference to user tmp directory for security
    if [ -d "$HOME/tmp" ]
    then
        TEMP_DIR="$HOME/tmp"
    else
        TEMP_DIR="/tmp"
    fi

    # Construct a "safe" temp file using mktemp
    TEMP_FILE=$(mktemp --tmpdir="$TEMP_DIR" XXXXXXXXXX)

    # Keep the file in an array to remove it later
    TEMPFILES+=( "$TEMP_FILE" )   
}

# Function that will keep this script from being run with any kind
# of root privileges.
function drop_root {
    # Check to see if we're running as root via sudo
    if [ $(/usr/bin/id -ur) -eq 0 ];then
        echo "This script cannot be run with sudo"
        exit 1
    fi

    # Get the listing on this script
    INFO=$(ls -l "$0")
    # Grab the permission at the SUID position
    PERM=$(echo "$INFO" | cut -d " " -f 1 | cut -c 4)
    # Grab the owner
    OWNER=$(echo "$INFO" | cut -d " " -f 3)

    # Check for the SUID bit and the owner of root
    if [ "$PERM" == "s" -a "$OWNER" == "root" ];then
        echo "This script cannot be run as SUID root"
        exit 1
    fi
}

# Function that will get and scrub user input to make it safer to use.
function scrub_input {
    # Grab the user's input
    read USR_INPUT

    # Remove all characters that aren't alphanumeric or newline
    USR_INPUT=$(echo "$USR_INPUT" | tr -cd '[:alnum:]\n')
}

# Function that sets certain environment variables to known values
function clear_vars {
    # Save the old variables so that they can be restored
    OLD_IFS="$IFS"
    OLD_PATH="$PATH"

    # Set the variables to known safer values
    IFS=$' \t\n' #Set IFS to include whitespace characters
    PATH='/bin:/usr/bin' #Assumed safe paths
}

# Function that restores environment variables to what they were at the
# start of the script.
function restore_vars {
    IFS="$OLD_IFS"
    PATH="$OLD_PATH"
}

# Function that attempts to run a command safely via the whereis command.
function run_cmd {
    # Attempt to find the command with the whereis command
    CMD=$(whereis "$1" | cut -d " " -f 2)

    # Check to make sure that the command was found
    if [ -f "$CMD" ];then
        eval "$CMD"
    else
        echo "The command $CMD was not found"
        exit 127
    fi
}
</code>
</pre>
</p>

<p class="single">
This script starts out with our new and improved function which creates relatively safe temp files for us (<code>create_temp</code>). This was taken directly from <strong>Listing 28</strong> which we&#8217;ve already discussed. After that, there&#8217;s the <code>drop_root</code> function that encapsulates the functionality from <strong>Listing 21</strong>. We can just call this function at the beginning of the script to make sure that we&#8217;re not being run with <code>sudo</code> and that the script is not SUID root. This function merely warns the user and exits; it does not give up its root privileges like BASH does. The next function (<code>scrub_input</code>) reads input from the user and then removes everything but alphanumeric characters and the newline character. This is taken from <strong>Listing 19</strong>. The next two functions deal with environment variables. The first (<code>clear_vars</code>) saves the old variable values for both <code>IFS</code> and <code>PATH</code>, and then sets new values for each. The <code>restore_vars</code> function uses the saved values to reset the variables back to their original condition. This is the same concept as what we talked about in <strong>Listing 3</strong>, just enclosed in functions. The last function (<code>run_cmd</code>) is similar to <strong>Listing 4</strong>, but I&#8217;ve expanded it a little bit to check whether a file with the name of the command exists before trying to run it. If the command exists, it is run via the <code>eval</code> command. If the command does not exist, we warn the user and exit.
</p>
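<p class="single">
One limitation of <code>run_cmd</code> as written is that <code class="commandline">eval "$CMD"</code> runs the command without any of the other arguments that were passed to the function. The following is an alternative sketch, not the version used in this post. It looks for the command in a fixed list of trusted directories (the list <code>/bin</code> and <code>/usr/bin</code> is an assumption borrowed from the <code>PATH</code> value used earlier) instead of calling <code>whereis</code>, runs the command directly instead of through <code>eval</code>, and returns 127 instead of exiting.
</p>

```shell
#!/bin/bash
# Sketch only: an alternative run_cmd that forwards arguments to the command.
# The trusted directory list below is an assumption; adjust it for your system.

run_cmd() {
    local name="$1"
    shift

    # Refuse names that contain a slash so that callers cannot pass a path
    case "$name" in
        */*)
            echo "The command $name is not a plain command name" >&2
            return 127
            ;;
    esac

    # Look for an executable regular file in each trusted directory
    local dir
    for dir in /bin /usr/bin; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            # Run the command by its full path with the remaining arguments
            "$dir/$name" "$@"
            return
        fi
    done

    echo "The command $name was not found" >&2
    return 127
}

# Example call: the arguments after the command name are passed through
run_cmd ls -d /
```

<p class="single">
Returning a status code lets the calling script decide whether a missing command should be fatal.
</p>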

<p class="single">
<strong>Listing 30</strong> shows a simple script where I use the collection of security-specific functions from <strong>Listing 29</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 30</strong></em></p>
<code>
#!/bin/bash - 
# File: security_src_test.sh
# Script to test the sourcable script security_src.sh

# Function to clean up after ourselves
function clean_up {
    # Step through and delete all of the temp files
    for TMP_FILE in "${TEMPFILES[@]}"
    do
        # Make sure that the tempfile exists
        if [ -e "$TMP_FILE" ]; then
            echo "Temp file: $TMP_FILE"
            rm "$TMP_FILE"
        fi
    done

    # Reset the variables to their original values
    restore_vars
}

# Source the script that holds the security functions
. security_src.sh

# Make sure that we delete the temp files when we exit
trap 'clean_up' EXIT

# Array to hold the temporary files
TEMPFILES=( )

# Variable to hold the user's input
USR_INPUT=""

# Make sure that we're not running with root privileges
drop_root

# Make sure that we have safe variables to work with
clear_vars

# Call the function that will create a temp file for us
create_temp

# Check to make sure that the temp file was created
echo "${TEMPFILES[0]}"

# Let the user know that input is expected
printf "Please enter your input: "

# Get and scrub the user input
scrub_input

# Test the user input
echo $USR_INPUT

# Try to safely run a command that exists
run_cmd ls > /dev/null

# Try to safely run a command that does not exist
run_cmd foo
</code>
</pre>
</p>

<p class="single">
At the very top of the script I create a <code>clean_up</code> function that removes any temporary files and calls the sourced function that restores the <code>IFS</code> and <code>PATH</code> variables to their original values. This function is used in the <code>trap</code> statement so that it will be called whenever the script exits. Just above the <code>trap</code> statement, the script sources <code>security_src.sh</code>, which gives us access to the security related functions. Continuing on down the script you see that I&#8217;ve created a couple of variables to hold the temporary file names and the user input. The names of these variables (<code>TEMPFILES</code> and <code>USR_INPUT</code>) come from the sourced script, whose functions read and write them. The sourced function <code>drop_root</code> ensures that the script is not being run with root privileges, and then <code>clear_vars</code> is called to make sure that <code>IFS</code> and <code>PATH</code> are safer to use. After that I call the <code>create_temp</code> function to set up a temporary file for me, and then immediately echo the path of the file by accessing the first element of the <code>TEMPFILES</code> array (<code class="commandline">echo "${TEMPFILES[0]}"</code>).
</p>

<p class="single">
I prompt the user for input with a <code>printf</code> statement next, but instead of putting the <code>read</code> command directly in my script I call the <code>scrub_input</code> function and let it handle the task of getting the input from the user. When I ran the script I tried inputting several symbols that should not be allowed in the user input, and upon hitting enter I saw via the <code class="commandline">echo $USR_INPUT</code> statement that the symbols were properly scrubbed from the input. The last thing that I do is try to run two commands via the <code>run_cmd</code> function. The first time that I use the function I run the <code>ls</code> command, which I would expect to succeed. I use the <code class="commandline">&gt; /dev/null</code> section of the line to suppress the output from the <code>ls</code> command so that the output of the script doesn&#8217;t get too cluttered. The second command that I try to run with the <code>run_cmd</code> function is <code>foo</code>. I would not expect this command to be found, and have added it to show what the function does in that case. <strong>Listing 31</strong> shows the output that I get when I run the script in <strong>Listing 30</strong>.
</p>

<p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 31</strong></em></p>
<code>
$ ./security_src_test.sh 
/home/jwright/tmp/mEAPJhqgyb
Please enter your input: ?blog
blog
The command foo: was not found
Temp file: /home/jwright/tmp/mEAPJhqgyb
</code>
</pre>
</p>

<p class="single">
When I check the <code>/home/jwright/tmp</code> folder for the temporary file, I see that it was properly deleted by the script. I also see that the <code>ls</code> command was found since there is no error, but the <code>foo</code> command was not. This is exactly what was expected. The example script in <strong>Listing 30</strong> is not a real world script by any means, but works to show you how you would use the sourced script, and what order you might want to call the sourced functions in. As always, I welcome any input on corrections, additions, and tweaks that you think should be added to these scripts or any scripts in this post. Tell me what you think in the comments section.
</p>

<p><a name="troubleshooting"></a></p>
<h3>Troubleshooting</h3>
<p class="single">
If you get any capital letters in the symbolic permission string for a file, it means that something is wrong. Usually if you get a capital &#8220;S&#8221; in the string, it means that you need to set execute rights for the owner or the group. A capital &#8220;T&#8221; means that you set the sticky bit without setting the execute permission for other/world on the file or directory. 
</p>
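<p class="single">
The capital letters are easy to reproduce on scratch files. The sketch below assumes a Linux system with <code>mktemp -d</code> available, and the file name is made up for the demonstration. It sets the SUID bit on a file and the sticky bit on a directory without the matching execute permission, and then adds the execute permission to show the letter change to lowercase.
</p>

```shell
#!/bin/bash
# Sketch only: works on scratch files so nothing on the system is changed.

SCRATCH_DIR=$(mktemp -d) || exit 1
trap 'rm -rf -- "$SCRATCH_DIR"' EXIT

DEMO_FILE="$SCRATCH_DIR/demo.sh"
touch "$DEMO_FILE"

# SUID bit set, but no execute permission for the owner: capital S
chmod 4644 "$DEMO_FILE"
BROKEN_FILE=$(ls -l "$DEMO_FILE" | cut -c 1-10)

# Adding execute permission for the owner turns it into a lowercase s
chmod u+x "$DEMO_FILE"
FIXED_FILE=$(ls -l "$DEMO_FILE" | cut -c 1-10)

# Sticky bit on a directory without execute for other: capital T
chmod 1770 "$SCRATCH_DIR"
BROKEN_DIR=$(ls -ld "$SCRATCH_DIR" | cut -c 1-10)

# Adding execute permission for other turns it into a lowercase t
chmod o+x "$SCRATCH_DIR"
FIXED_DIR=$(ls -ld "$SCRATCH_DIR" | cut -c 1-10)

echo "$BROKEN_FILE -> $FIXED_FILE"
echo "$BROKEN_DIR -> $FIXED_DIR"
```

<p class="single">
The <code>cut -c 1-10</code> part keeps only the file type and the nine permission characters, which drops any trailing ACL or SELinux marker that <code>ls</code> may add.
</p>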

<h3>Conclusion</h3>
<p class="single">
As I stated when we started this post, I haven&#8217;t been able to cover every aspect of shell script security, and for the most part I avoided the issue of system security as that&#8217;s an even larger (but related) subject. It&#8217;s simply been my hope that I&#8217;ve given you a good starting point to plug some of the common security holes in your own scripts. Using this as a starting point, have a look at the <a href="#resources">Resources</a> section for more information, and make sure to take opportunities to continue your learning on script, program, and system security whenever they arise.
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<br />

<h4>Links</h4>
<ol>
<li><a href="http://www.cerias.purdue.edu/">Purdue University&#8217;s Center for Education and Research in Information Assurance and Security</a></li>
<li><a href="http://ubuntuforums.org/announcement.php?a=54">Ubuntu Forums Announcement A Few Malicious Commands To Avoid In Forums/Posts/Lists</a></li>
<li><a href="http://www.linuxsecurity.com/content/view/117920/171/">LinuxSecurity.com Article On The shc Utility That Encrypts Shell Scripts</a></li>
<li><a href="http://www.datsi.fi.upm.es/~frosal/">shc Utility Homepage</a></li>
<li><a href="http://www.linuxjournal.com/article/8256">Linux Journal Article On Security Concerns When Using shc</a></li>
<li><a href="http://searchenterpriselinux.techtarget.com/tip/Seven-tips-for-optimizing-shell-script-security">7 Tips On Script Security By James Turnbull (requires registration)</a></li>
<li><a href="http://tldp.org/LDP/abs/html/securityissues.html">TLDP Advanced Bash-Scripting Guide: Chapter 35 &#8211; Miscellany</a></li>
<li><a href="http://developer.apple.com/mac/library/documentation/OpenSource/Conceptual/ShellScripting/ShellScriptSecurity/ShellScriptSecurity.html">Mac OS X Article On Shell Script Security That Gives Examples Of Attacks</a></li>
<li><a href="http://www.faqs.org/faqs/unix-faq/faq/part4/section-7.html">Article From faqs.org On SUID Shell Scripts</a></li>
<li><a href="http://docstore.mik.ua/orelly/networking/puis/ch05_05.htm">Practical Unix &amp; Internet Security &#8211; Chapter 5 &#8211; Section 5.5 (SUID)</a></li>
<li><a href="http://docstore.mik.ua/orelly/networking/puis/ch23_01.htm">Practical Unix &amp; Internet Security &#8211; Chapter 23 &#8211; Writing Secure SUID and Network Programs</a></li>
<li><a href="http://www.net-security.org/malware_news.php?id=55">Help Net Security Article On Unix Shell Scripting Malware</a></li>
<li><a href="http://www.tech-faq.com/security-problems-with-suid.html">More SUID Vulnerability Information</a></li>
<li><a href="http://www.linuxtopia.org/online_books/advanced_bash_scripting_guide/securityissues.html">Short Article On Linuxtopia About The Dangers Of Running Untrusted Shell Scripts</a></li>
<li><a href="http://etutorials.org/Linux+systems/how+linux+works/Chapter+7+Introduction+to+Shell+Scripts/7.10+Important+Shell+Script+Utilities/">etutorials.org Article On Useful Shell Utilities For Scripts</a></li>
<li><a href="http://www.kramse.dk/projects/unix/security-sudo-script_en.html">Examples Of Risky Scripts To Use With sudo</a></li>
<li><a href="http://www.linuxsecurity.com/content/view/115462/151/">Very Good LinuxSecurity.com Article On Creating Safe Temporary Files</a></li>
<li><a href="http://www.ibm.com/developerworks/linux/library/l-sp1.html">IBM developerWorks Article: Secure programmer: Developing secure programs</a></li>
<li><a href="http://www.ibm.com/developerworks/library/l-sp2.html">IBM developerWorks Article: Secure programmer: Validating input</a></li>
<li><a href="http://www.ibm.com/developerworks/linux/library/l-sp3.html">IBM developerWorks Article: Secure programmer: Keep an eye on inputs</a></li>
<li><a href="http://bashscript.blogspot.com/2010/03/unixlinux-advanced-file-permissions.html">Article On SUID, SGID, And Sticky Bits In Linux And Unix</a></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=2363</wfw:commentRss>
		<slash:comments>4</slash:comments>
<enclosure url="http://www.innovationsts.com/downloads/blog/Better_Scripts_Part_3/Better_Scripts_Part_3_Video.ogg" length="20652973" type="audio/ogg" />
<enclosure url="http://www.innovationsts.com/downloads/blog/Better_Scripts_Part_3/Better_Scripts_Part_3_Video.flv" length="18951910" type="video/x-flv" />
		</item>
		<item>
		<title>Writing Better Shell Scripts – Part 2</title>
		<link>http://www.innovationsts.com/blog/?p=1896</link>
		<comments>http://www.innovationsts.com/blog/?p=1896#comments</comments>
		<pubDate>Mon, 26 Jul 2010 12:56:16 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[How-Tos]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=1896</guid>
		<description><![CDATA[

Quick Start

As with Part 1 of this series, this information does not lend itself to having a &#8220;Quick Start&#8221; section. With that said, you can read the How-To section of this post for a quick general overview. I would highly recommend reading everything though, as a good understanding of the concepts and commands outlined here [...]]]></description>
			<content:encoded><![CDATA[<!-- google_ad_section_start -->
<br />
<h3>Quick Start</h3>
<p class="narrow">
As with <a href="http://www.innovationsts.com/blog/?p=1395">Part 1</a> of this series, this information does not lend itself to having a &#8220;Quick Start&#8221; section. With that said, you can read the <a href="#howto">How-To</a> section of this post for a quick general overview. I would highly recommend reading everything though, as a good understanding of the concepts and commands outlined here will serve you well in the future. <a href="#video">Video</a> and <a href="#audio">Audio</a> are also included with this post which may work as a quick reference for you. Don&#8217;t forget that the man and info pages of your Linux/Unix system can be an invaluable resource as well when you&#8217;re learning commands and solving problems.
</p>

<h3>Preface</h3>
<p class="narrow">
To make things easier on you, all of the black command line and script areas are set up so that you can copy the text from them. This does make using the commands easier, but if you&#8217;re not already familiar with the concepts presented here, typing the commands yourself and working through why you&#8217;re typing them will help you learn more. If you hit problems along the way, take a look at the <a href="#troubleshooting">Troubleshooting</a> section near the end of this post for help.
</p>
<p class="narrow">
There are formatting conventions that are used throughout this post that you should be aware of. The following is a list outlining the color and font formats used.
</p>
<p>
<code>
    Command Name or Directory Path
</code>
<br />
<code class="warnerror">
    Warning or Error
</code>
<br />
<code class="commandline">
    Command Line Snippet With Commands/Options/Arguments
</code>
<br />
<code class="optionsonly">
    Command Options and Their Arguments Only
</code>
<br />
    <span style="color:#FFAF3E;">Hyperlink</span>
</p>

<h3>Overview</h3>
<p class="narrow">
This post is the second in a series on shell script debugging, error handling, and security. The content of this post will be geared mainly toward BASH users, but there will be information that&#8217;s suitable for users of other shells as well. Information such as techniques and methodologies may transfer very well, but BASH specific constructs and commands will not. The users of other shells (CSH, KSH, etc) will have to do some homework to see what transfers and what does not. 
</p>
<p class="narrow">
There are a lot of opinions about how error handling should be done, which range from doing nothing to implementing comprehensive solutions. In this post, as well as in my professional work, I try to err on the side of in-depth solutions. Some people will argue that you don&#8217;t need to go through the trouble of providing error handling on small single-user scripts, but useful scripts have a way of growing past their original intent and user group. If you&#8217;re a system administrator, you need to be especially careful with error handling in your scripts. If you or an admin under you gets careless, someday you may end up getting a call from one of your users complaining that they just deleted the contents of their home directory &#8211; with one of your scripts. It&#8217;s easier to do than you might think when precautions are not taken. All you need are a couple of lines in your script like those in <strong>Listing 1</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
#!/bin/bash

cd $1
rm -rf *
</code>
</pre>

<p class="narrow">
So what happens if a user forgets to supply a command line argument to <strong>Listing 1</strong>? The <code>cd</code> command changes into the user&#8217;s home directory, and the <code>rm</code> command deletes all of their files and directories without prompting. That has the makings of a bad day for both you and your user. In this post I&#8217;ll cover some ways to avoid this kind of headache.
</p>

<p class="narrow">
To help ease the extra burden of making your scripts safer with error handling, we&#8217;ll talk about separating error handling code out into reusable modules which can be sourced. Once you do this and become familiar with a few error handling techniques, you&#8217;ll be able to implement robust error handling in your scripts with less effort.
</p>

<p class="narrow">
The intent of this post is to give you the information you need to make good judgments about error handling within your own scripts. Both proactive and reactive error handling techniques will be covered so that you can make the decision on when to try to head off errors before they happen, and when to try to catch them after they happen. With those things in mind, let&#8217;s start off with some of the core elements of error handling.
</p>

<h3>BASH Options</h3>
<p class="single">
There are several BASH command line options that can help you avoid some errors in your scripts. The first two are ones that we already covered in <a href="http://www.innovationsts.com/blog/?p=1395">Part 1</a> of this series. The <code class="optionsonly">-e</code> option, which is the same as <code class="commandline">set -o errexit</code>, causes BASH to exit as soon as it detects an error. While a significant number of people promote setting the <code class="optionsonly">-e</code> option for all of your scripts, doing so can prevent you from using some of the other error handling techniques that we&#8217;ll be talking about shortly. The next option, <code class="optionsonly">-u</code>, which is the same as <code class="commandline">set -o nounset</code>, causes the shell to throw an error whenever a variable is used before its value has been set. This is a simple way to prevent the risky behavior of <strong>Listing 1</strong>. If the user does not provide an argument to the script, the shell will see it as the <code>1</code> variable not being set and complain. This is usually a good option to use in your scripts. 
</p>

<p class="single">
<code class="commandline">set -o pipefail</code> is something that we&#8217;ll touch on in the <a href="#cmdsequences">Command Sequences</a> section and causes a whole command pipeline to error out if just one of the sections has an error. The last shell option that I want to touch on is <code class="commandline">set -o noclobber</code> (or the <code class="optionsonly">-C</code> option) which helps you because it prevents the overwriting of files with redirection. You will just get an error similar to <code class="error">cannot overwrite existing file</code>. This can save you when you&#8217;re working with system configuration files, as overwriting one of them could result in any number of big problems. <strong>Listing 2</strong> holds a quick reference list of these options.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
errexit   (-e)      Causes the script to exit whenever there is an error.
noclobber (-C)      Prevents the overwriting of files when using redirection.
nounset   (-u)      Causes the shell to throw an error whenever an unset 
                    variable is used.
pipefail            Causes a pipeline to error out if any section has an error.
</code>
</pre>
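<p class="single">
As a rough illustration of two of the options in <strong>Listing 2</strong>, the following sketch triggers <code>nounset</code> and <code>noclobber</code> inside subshells so that the failures can be observed without ending the outer script. The variable and file names (<code>target_dir</code>, <code>demo_file</code>, <code>kept</code>) are made up for this example.
</p>

```shell
#!/bin/bash -

# nounset: expanding an unset variable is a fatal error. The subshell
# contains the failure so that this demonstration can keep running.
unset_status=0
( set -o nounset; unset target_dir; cd "$target_dir" ) 2>/dev/null || unset_status=$?
echo "nounset status: $unset_status"

# noclobber: redirection with > refuses to overwrite an existing file.
demo_file=$(mktemp)              # hypothetical scratch file
echo "original" > "$demo_file"
clobber_status=0
( set -o noclobber; echo "replacement" > "$demo_file" ) 2>/dev/null || clobber_status=$?
kept=$(cat "$demo_file")
echo "noclobber status: $clobber_status"
echo "file still contains: $kept"

# Housekeeping
rm -f "$demo_file"
```

<p class="single">
Both status values should be non-zero, and the scratch file should still hold its original contents, because the second redirection was refused.
</p>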

<h3>Exit Status</h3>
<p class="single">
Exit status is the 8-bit integer that is returned to a parent process when a subprocess exits (whether it exits normally or is forced to exit). Typically, an exit status of 0 means that the process completed successfully, and a greater than 0 exit status means that there was a problem. This may seem counterintuitive to C/C++ programmers who are used to true being 1 (non-zero) and false being 0. There are exceptions to the shell&#8217;s exit status standard, so it&#8217;s always best to understand how the distribution/shell/command combo you&#8217;re using will handle the exit status. An example of a command that acts differently is <code>diff</code>. When you run <code>diff</code> on two files, it will return 0 if the files are the same, 1 if the files are different, and some number greater than 1 if there was an error. So if you checked the exit status of <code>diff</code> expecting it to behave &#8220;normally&#8221;, you would think that the command failed when it was really telling you that the files are different.
</p>
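<p class="single">
To make the <code>diff</code> case concrete, here is a minimal sketch that maps its three kinds of exit status to separate messages. The function name <code>compare_files</code> and the scratch file names are hypothetical and exist only for this illustration.
</p>

```shell
#!/bin/bash -

# compare_files is a hypothetical helper name used only for this sketch.
compare_files() {
    local status=0
    diff "$1" "$2" > /dev/null 2>&1 || status=$?
    case $status in
        0) echo "same" ;;        # files are identical
        1) echo "different" ;;   # files differ, which is not a failure of diff
        *) echo "error" ;;       # diff itself had a problem
    esac
}

workdir=$(mktemp -d)             # scratch directory for the demonstration
printf 'one\n' > "$workdir/a"
printf 'one\n' > "$workdir/b"
printf 'two\n' > "$workdir/c"

compare_files "$workdir/a" "$workdir/b"        # identical contents
compare_files "$workdir/a" "$workdir/c"        # differing contents
compare_files "$workdir/a" "$workdir/missing"  # second file does not exist

rm -rf "$workdir"
```

<p class="single">
Treating status 1 as its own case keeps a script from reporting a failure when <code>diff</code> is only saying that the files differ.
</p>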

<p class="single">
Probably the easiest way to begin experimenting with exit status is to use the BASH shell&#8217;s built-in <code>?</code> variable. The <code>?</code> variable holds the exit status of the last command that was run. <strong>Listing 3</strong> shows an example where I check the exit status of the <code>true</code> command which always gives an exit status of 0 (success), and of the <code>false</code> command which always gives an exit status of 1 (failure). Credit goes to <a href="http://gd.tuwien.ac.at/linuxcommand.org/wss0150.php">William Shotts, Jr.</a> whose straightforward use of <code>true</code> and <code>false</code> in his examples on this topic inspired some of the examples in this post.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 3</strong></em></p>
<code>
$true
$echo $?
0
$false
$echo $?
1
</code>
</pre>

<p class="single">
In this case the <code>true</code> and <code>false</code> commands follow the 0 = success, non-zero = failure standard, so we can be certain whether or not the command succeeded. As stated above though, the meaning of the exit status is not always so clear. I check the man page for any unfamiliar commands to see what their exit statuses mean, and I suggest you do the same with the commands you use. <strong>Listing 4</strong> lists some of the standard exit statuses and their usual meanings.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 4</strong></em></p>
<code>
0           Command completed successfully.
1-125       Command did not complete successfully. Check the command's man page 
                for the meaning of the status.
126         Command was found, but couldn't be executed.
127         Command was not found.
128-254     Command died due to receiving a signal. The signal code is added to 
                128 (128 + SIGNAL) to get the status.
130         Command exited due to Ctrl-C being pressed.
255         Exit status is out of range.
</code>
</pre>

<p class="single">
For statuses 128 through 254, you see that the signal that caused the command to exit is added to the base status of 128. This allows you to subtract 128 from the given exit status later to see which signal was the culprit. Some of the signals that can be added to the base of 128 are shown in <strong>Listing 5</strong> and were obtained from the signal man page via <code class="commandline">man 7 signal</code> . Note that <code>SIGKILL</code> and <code>SIGSTOP</code> cannot be caught, blocked, or ignored because those signals are handled at the kernel level. You may see all of these signals at one time or another, but the most common are <code>SIGHUP</code>, <code>SIGINT</code>, <code>SIGQUIT</code>, <code>SIGKILL</code>, <code>SIGTERM</code>, and <code>SIGSTOP</code>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 5</strong></em></p>
<code>
       Signal     Value     Action   Comment
       ──────────────────────────────────────────────────────────────────────
       SIGHUP        1       Term    Hangup detected on controlling terminal
                                     or death of controlling process
       SIGINT        2       Term    Interrupt from keyboard
       SIGQUIT       3       Core    Quit from keyboard
       SIGILL        4       Core    Illegal Instruction
       SIGABRT       6       Core    Abort signal from abort(3)
       SIGFPE        8       Core    Floating point exception
       SIGKILL       9       Term    Kill signal
       SIGSEGV      11       Core    Invalid memory reference
       SIGPIPE      13       Term    Broken pipe: write to pipe with no
                                     readers
       SIGALRM      14       Term    Timer signal from alarm(2)
       SIGTERM      15       Term    Termination signal
       SIGUSR1   30,10,16    Term    User-defined signal 1
       SIGUSR2   31,12,17    Term    User-defined signal 2
       SIGCHLD   20,17,18    Ign     Child stopped or terminated
       SIGCONT   19,18,25    Cont    Continue if stopped
       SIGSTOP   17,19,23    Stop    Stop process
       SIGTSTP   18,20,24    Stop    Stop typed at tty
       SIGTTIN   21,21,26    Stop    tty input for background process
       SIGTTOU   22,22,27    Stop    tty output for background process
</code>
</pre>

<p class="single">
A listing of signals which only shows the symbolic and numeric representations without the descriptions can be obtained with either <code class="commandline">kill -l</code> or <code class="commandline">trap -l</code> .
</p>
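<p class="single">
The 128 + signal arithmetic can be tried out directly. In this sketch a child shell terminates itself with <code>SIGTERM</code>, and the parent recovers the signal number and name from the exit status. The variable names are made up for the example, and the sketch assumes that <code>kill -l</code> accepts a signal number and prints its name, which is the case in BASH.
</p>

```shell
#!/bin/bash -

# Start a child shell that terminates itself with SIGTERM and
# capture the exit status that the parent sees.
status=0
sh -c 'kill -TERM $$' || status=$?

if [ "$status" -gt 128 ]; then
    # Subtract the base of 128 to recover the signal number
    signal_number=$((status - 128))
    # kill -l translates a signal number into its name
    signal_name=$(kill -l "$signal_number")
    echo "Exit status $status means signal $signal_number ($signal_name)"
fi
```

<p class="single">
On a system where <code>SIGTERM</code> is signal 15, as in <strong>Listing 5</strong>, the reported exit status is 143.
</p>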

<p class="single">
You can explicitly pass the exit status of the last command executed back to the parent process (most likely the shell) with a line like <code class="commandline">exit $?</code> . You can do the same thing implicitly by calling the <code>exit</code> command without an argument. This works fine if you want to exit immediately, but if you want to do some other things with the exit status first you&#8217;ll need to store it in a variable. This is because every command that runs, including one that reads <code>?</code> such as <code class="commandline">echo $?</code>, replaces the value with its own exit status. <strong>Listing 6</strong> shows one way of using an if statement to pass the exit status back to the parent after implementing your own error handling functionality.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 6</strong></em></p>
<code>
#!/bin/bash - 

# Run the command(s)
false

# Save the exit status (the next command that runs will overwrite $?)
EXITSTAT=$?

# If the command has a non-zero exit status
if [ "$EXITSTAT" -gt 0 ]
then
    echo "There was an error."
    exit $EXITSTAT #Pass the exit status back to parent
fi
</code>
</pre>

<p class="single">
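You can also use an <code>if</code> statement to directly test the exit status of a command as in <strong>Listing 7</strong>. Notice that when you test a command this way, its original exit status is not available afterward: inside the <code>then</code> block, the <code>?</code> variable holds the status of the negated test (0), not the status of the command itself.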
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 7</strong></em></p>
<code>
#!/bin/bash - 

# If the command has a non-zero exit status
if ! false
then
    echo "There was an error."    
    exit 1
fi
</code>
</pre>

<p class="single">
The <code class="commandline">if ! false</code> statement is the key here. What&#8217;s inside of the <code>if</code> statement will be executed if the command (in this case <code>false</code>) returns a non-zero exit status. Using this type of statement can give you a chance to warn the user of what&#8217;s going on and take any actions that are needed before the script exits. 
</p>

<p class="single">
You can also use the <code>if</code> and <code>test</code> combination in more complex ways. For instance, according to its man page, the <code>ls</code> command uses an exit status of 0 for no errors, 1 for minor errors like not being able to access a subdirectory, and 2 for major errors like not being able to access a file/directory specified on the command line. With this in mind, take a look at <strong>Listing 8</strong> to see how you could differentiate between the &#8220;no error&#8221;, &#8220;minor error&#8221;, and &#8220;major error&#8221; conditions.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 8</strong></em></p>
<code>
#!/bin/bash - 

function testex {
    # Every command overwrites $?, so the caller passes it in as $1
    exitstat=$1

    # See which condition we have
    if test $exitstat -eq 0; then
        echo "No error detected"
    elif test $exitstat -eq 1; then
        echo "Minor error detected"
    elif test $exitstat -eq 2; then
        echo "Major error detected"
    fi
}

# Try a listing that should succeed
echo "- 'ls ~/*' Executing"
ls ~/* &#038;> /dev/null

# Check the success/failure of the ls command
testex $?

# Try a listing that should not succeed
echo "- 'ls doesnotexist' Executing"
ls doesnotexist &#038;> /dev/null

testex $?
</code>
</pre>

<p class="single">
Inside the <code>testex</code> function I have placed code that looks for specific exit statuses and then tells the user what was found. Normally you wouldn&#8217;t worry about handling the situation where there&#8217;s no error (exit status 0), but doing so helps clarify the concept in our example. The output that you would get from running this script is shown in <strong>Listing 9</strong>. 
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 9</strong></em></p>
<code>
$ ./testex.sh 
- 'ls ~/*' Executing
No error detected
- 'ls doesnotexist' Executing
Major error detected
</code>
</pre>

<p class="single">
There are a couple of final things to be aware of when you&#8217;re using the <code>?</code> variable. First, remember that every command you run from the command line or in a script replaces its value, including a command that only reads it. If you need to use the <code>?</code> variable more than once in your script, you&#8217;ll want to store its value in another variable and use that. The second is that <code>?</code> becomes much less useful when you are using the <code class="optionsonly">-e</code> option or the line <code class="commandline">set -o errexit</code>. The reason for this is that the script will exit as soon as an unhandled error is detected, so in most cases you never get a chance to check the <code>?</code> variable. 
</p>
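<p class="single">
If you do want <code>errexit</code> and still need to inspect a status, one workaround is sketched below. It relies on the fact that BASH does not apply <code>errexit</code> to a command that is followed by <code>&#124;&#124;</code>. The variable name <code>status</code> is just an example.
</p>

```shell
#!/bin/bash -

set -o errexit

# A command on the left-hand side of || does not trigger errexit,
# so its status can be captured before the script decides what to do.
status=0
false || status=$?

echo "The command exited with status $status"

# Turn errexit back off so that later code in this sketch is unaffected
set +o errexit
```

<p class="single">
Initializing <code>status</code> to 0 first matters: when the command succeeds, the assignment after <code>&#124;&#124;</code> never runs.
</p>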

<h3>The command_not_found_handle Function</h3>
<p class="single">
As of BASH 4.0, the provision for a <code>command_not_found_handle</code> function has been added. This function makes it possible to display user friendly messages when a command the user types is not found. BASH searches for the command and if it&#8217;s not found anywhere, BASH looks to see if you have the <code>command_not_found_handle</code> function defined. If you do, that function is invoked passing it the attempted command and its arguments so that a useful message can be displayed. If you use a Debian or Ubuntu system you&#8217;ve probably seen this in action as they&#8217;ve had this feature for a while. <strong>Listing 10</strong> shows an example of the <code>command_not_found_handle</code> function output on an Ubuntu 9.10 system.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 10</strong></em></p>
<code>
$cat2
No command 'cat2' found, did you mean:
 Command 'cat' from package 'coreutils' (main)
cat2: command not found
</code>
</pre>

<p class="single">
You can implement/override the behavior of the <code>command_not_found_handle</code> function to provide your own functionality. <strong>Listing 11</strong> shows an implementation of the <code>command_not_found_handle</code> function inside of a stand-alone script. In most cases you would want to add it to your BASH configuration file(s) so that you can make use of the function anytime that you&#8217;re at the shell prompt.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 11</strong></em></p>
<code>
#!/bin/bash - 
# File: cmdnf.sh

function command_not_found_handle {
    echo "The command ($1) is not valid."
    exit 127 #The command not found status
}

cat2
</code>
</pre>

<p class="single">
You would access the arguments to the original (not found) command via <code>$2</code>, <code>$3</code> and so on. Notice that I used the <code>exit</code> command and passed it the code of 127, which is the command not found exit status. The exit status of the whole script is the exit status of the <code>command_not_found_handle</code> function. If you don&#8217;t set the exit status explicitly the script will end up returning 0 (success), thus preventing a user or script from using the exit status to determine what type of error occurred. Propagation of the exit status and terminating signal (which we&#8217;ll talk about later) is a good thing to do to prevent your users from missing important information and/or having problems. When run, the script in <strong>Listing 11</strong> gives you the following output in <strong>Listing 12</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 12</strong></em></p>
<code>
$./cmdnf.sh 
The command (cat2) is not valid.
$echo $?
127
</code>
</pre>
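<p class="single">
As a variation on <strong>Listing 11</strong>, the sketch below also reports the arguments of the failed command and hands back 127 with <code>return</code> instead of <code>exit</code>. It assumes BASH 4.0 or later, and <code>nosuchcmd_demo</code> is a made-up command name that is assumed not to exist on your system.
</p>

```shell
#!/bin/bash -

# Requires BASH 4.0 or later. BASH passes the attempted command as $1
# and that command's own arguments as $2, $3, and so on.
command_not_found_handle() {
    local attempted=$1
    shift    # after the shift, "$*" holds only the command's arguments
    echo "The command ($attempted) is not valid. Arguments given: $*" >&2
    return 127   # the conventional "command not found" status
}

# nosuchcmd_demo is a made-up name that is assumed not to exist
nosuchcmd_demo --verbose file.txt || echo "Exit status: $?"
```

<p class="single">
Sending the message to stderr keeps it out of any output that a calling script might be capturing from stdout.
</p>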

<p><a name="cmdsequences"></a></p>
<h3>Command Sequences</h3>
<p class="single">
Command sequences are multiple commands that are linked by pipes or logical short-circuit operators. Two logical short-circuits are the double ampersand (<code>&amp;&amp;</code>) and double pipe (<code>&#124;&#124;</code>) operators. The <code>&amp;&amp;</code> only allows the command that comes after it in the series to be executed if the previous command exited with a status of 0. The <code>&#124;&#124;</code> operator does the opposite by only allowing the next command to be executed if the previous one returned a non-zero exit status. <strong>Listing 13</strong> shows examples of how each of these work.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 13</strong></em></p>
<code>
$true &#038;&#038; echo 'Hello World!'
Hello World!
$false &#038;&#038; echo 'Hello World!'
$true || echo 'Hello World!'
$false || echo 'Hello World!'
Hello World!
</code>
</pre>

<p class="single">
So, one of the many ways to solve the unset variable problem we see in <strong>Listing 1</strong> is the example shown in <strong>Listing 14</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 14</strong></em></p>
<code>
#!/bin/bash

#Make sure the user provided a command line argument
[ -n "$1" ] || { echo "Please provide a command line argument."; exit 1; }

#Change to the directory and delete the files and dirs
cd "$1" &#038;&#038; rm -rf *
</code>
</pre>

<p class="single">
In the first line of interest, we check to make sure that the value of <code>$1</code> is not null. If that <code>test</code> command fails, it means that <code>$1</code> is unset or empty and that the user did not provide a usable command line argument. Since the <code>&#124;&#124;</code> operator only allows the next command to run if the previous one fails, our code block warns the user of their mistake and exits with a non-zero status. If a command line argument was supplied, the script continues on. In the second interesting line we use the <code>&amp;&amp;</code> operator to run the <code>rm</code> command if, and only if, the <code>cd</code> command succeeds. This keeps us from accidentally deleting all of the files and directories in the user&#8217;s/script&#8217;s current working directory if the <code>cd</code> command fails for some reason.
</p>
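<p class="single">
Another of the many ways to guard against the missing argument, sketched here as an alternative rather than as part of the listings in this post, is the <code>${1:?message}</code> parameter expansion, which stops with an error when <code>$1</code> is unset or empty. The function name <code>clean_dir</code> and the scratch directory are hypothetical.
</p>

```shell
#!/bin/bash -

# clean_dir is a hypothetical name used only for this sketch.
clean_dir() {
    # The parentheses run the body in a subshell, so the caller's working
    # directory is unchanged and a failed expansion only ends the subshell.
    (
        # ${1:?message} stops with the message if $1 is unset or empty
        cd "${1:?Please provide a directory.}" &&
            rm -rf -- ./*
    )
}

scratch=$(mktemp -d)                 # throwaway directory for the demonstration
touch "$scratch/file1" "$scratch/file2"

clean_dir "$scratch" && echo "Emptied $scratch"
clean_dir 2>/dev/null || echo "Refused to run without an argument"

rmdir "$scratch"
```

<p class="single">
The <code>&amp;&amp;</code> between <code>cd</code> and <code>rm</code> is still doing the important work here: nothing is deleted unless the directory change succeeded.
</p>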

<p class="single">
The next type of command sequence that we&#8217;re going to cover is a pipeline. When commands are piped together, only the last return code will be looked at by the shell. If you have a series of pipes like the one in <strong>Listing 15</strong>, you would expect it to show a non-zero exit status, but instead it&#8217;s 0.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 15</strong></em></p>
<code>
$true | false | true
$echo $?
0
</code>
</pre>

<p class="single">
To change the shell&#8217;s behavior so that it will return a non-zero value for a pipeline if any of its elements have a non-zero exit status, use the <code class="commandline">set -o pipefail</code> line in your script. The result of using <code>pipefail</code> is shown in <strong>Listing 16</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 16</strong></em></p>
<code>
$set -o pipefail
$true | false | true
$echo $?
1
</code>
</pre>

<p class="single">
This method doesn&#8217;t give you any insight into where in the pipeline your error occurred though. In many cases I prefer to use the BASH array variable <code>PIPESTATUS</code> to check pipelines. It gives you the ability to tell where in the pipeline the error occurred, so that your script can more intelligently adapt to or warn about the error. <strong>Listing 17</strong> gives an example.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 17</strong></em></p>
<code>
$true | false | true
$echo ${PIPESTATUS[0]} ${PIPESTATUS[1]} ${PIPESTATUS[2]}
0 1 0
</code>
</pre>

<p class="single">
To keep things clean inside your script, you might put the code to check the PIPESTATUS array into a function and use a loop to process the array elements. This way you have reusable code that will automatically adjust to the number of commands that are in your pipe. One of the scripts in the <a href="#scripting">Scripting</a> section shows this technique.
</p>
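<p class="single">
A minimal sketch of such a function might look like the following. The name <code>check_pipeline</code> is hypothetical, and this is not the script from the Scripting section. The caller passes in a copy of the <code>PIPESTATUS</code> array, and the function reports every stage that failed.
</p>

```shell
#!/bin/bash -

# check_pipeline is a hypothetical helper. Pass it a copy of PIPESTATUS;
# it reports every failed stage and returns non-zero if any stage failed.
check_pipeline() {
    local stage=0 failed=0 status
    for status in "$@"; do
        if [ "$status" -ne 0 ]; then
            echo "Pipeline stage $stage exited with status $status" >&2
            failed=1
        fi
        stage=$((stage + 1))
    done
    return "$failed"
}

true | false | true
# Expand PIPESTATUS on the very next line, before another command replaces it
check_pipeline "${PIPESTATUS[@]}" || echo "At least one stage failed"
```

<p class="single">
Stages are numbered from 0 to match the array indexes, so the failing <code>false</code> command above is reported as stage 1. Because the loop walks over whatever arguments it is given, the same function works for a pipeline of any length.
</p>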

<p class="single">
If you&#8217;re running a version of BASH prior to 3.1, a potential problem with using pipes is the <code class="error">Broken pipe</code> warning. If a reader in a pipeline finishes before its writer completes, the writer command will get a <code>SIGPIPE</code> signal which causes the <code class="error">Broken pipe</code> warning to be thrown. It may be a non-issue for you, but it doesn&#8217;t hurt to be aware of it. If you&#8217;re running a version of BASH that&#8217;s 3.1 or higher, you can use the <code>PIPESTATUS</code> variable to see if there&#8217;s been a pipe error. I&#8217;ve done this in <strong>Listing 18</strong> where I&#8217;ve written two scripts that will cause the pipeline to break. The code inside the scripts doesn&#8217;t really matter in this case, just the end result.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 18</strong></em></p>
<code>
$./pipeerr2.sh | ./pipeerr.sh
test
test
test
$echo ${PIPESTATUS[0]} ${PIPESTATUS[1]}
141 0
</code>
</pre>

<p class="single">
You can see that the pipe exit status for the first script (or pipeline section) is 141. This number actually results from the addition of a base exit status and the signal code, which I&#8217;ve mentioned before. The base status is 128, which the shell uses to signify that a command stopped due to receiving a signal rather than exiting normally. Added to that is the code of the signal that caused the termination, which in this case is 13 (SIGPIPE) on my system. This technique embeds the signal code in the exit status in a way that makes it easy to retrieve. Since the status is built by adding 128 and 13, all I have to do is use arithmetic expansion to extract the signal code from <strong>Listing 18</strong>: <code class="commandline">echo $((${PIPESTATUS[0]}-128))</code> . This gives me output showing the value of <code>13</code>, which is what we expect. Keep in mind that the <code>PIPESTATUS</code> array variable is like the <code>?</code> variable in that the very next command you run replaces its contents, so read or copy it immediately after the pipeline.
</p>

<p class="single">
As stated in <a href="http://www.innovationsts.com/blog/?p=1395">Part 1</a> of this series, you can replace pipes with temporary files. This will eliminate the SIGPIPE and exit status pitfalls of pipes, but as stated before temp files are much slower than pipes and require you to clean them up after you&#8217;re done with them. In general, I would suggest staying away from temp files unless you have a compelling reason to use them. A compromise between temp files and pipes might be named pipes. On modern Linux systems you use the <code>mkfifo</code> command to create a named pipe, which you can then use with redirection. On older systems you may have to use <code>mknod</code> instead to create the pipe. In <strong>Listing 19</strong> you can see that I&#8217;ve used named pipes instead of regular pipes, and that this technique allows me to check each of the sections of the pipeline as they&#8217;re used. Keep in mind that I&#8217;m reading from the named pipe in another terminal with <code class="commandline">cat &lt; pipe1</code> since a line like <code class="commandline">true &gt; pipe1</code> will block until the pipe has been read from. Also notice that I use the <code>rm</code> command to delete the named pipe after I&#8217;m done with it. I do this as a housekeeping measure, since I don&#8217;t want to leave named pipes lying around that I don&#8217;t need.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 19</strong></em></p>
<code>
$mkfifo pipe1
$true > pipe1
$echo $?
0
$false > pipe1
$echo $?
1
$rm pipe1
</code>
</pre>
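<p class="single">
Inside a script there is no second terminal to read from the pipe, so the reader has to run in the background. The following is a sketch of that arrangement, with made-up variable names and a scratch directory from <code>mktemp</code>. It shows that the writer and the reader each have their own exit status that can be checked separately.
</p>

```shell
#!/bin/bash -

workdir=$(mktemp -d)          # scratch directory for this sketch
fifo="$workdir/pipe1"
mkfifo "$fifo"

# Start the reader in the background so the writer below does not block
cat < "$fifo" > "$workdir/output" &
reader_pid=$!

# The writer's exit status is available immediately, just like any command
echo "hello through the pipe" > "$fifo"
writer_status=$?

# wait returns the exit status of the background reader
wait "$reader_pid"
reader_status=$?

received=$(cat "$workdir/output")
echo "writer: $writer_status, reader: $reader_status"
echo "received: $received"

# Housekeeping: remove the named pipe and the scratch directory
rm -rf "$workdir"
```

<p class="single">
Creating the pipe inside a private scratch directory also avoids name clashes if two copies of the script run at the same time.
</p>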

<h3>Wrapper Functions</h3>
<p class="single">
If there&#8217;s a command that you&#8217;re using multiple times in your script and that command requires some error handling, you might want to think about creating a wrapper function. For instance, in <strong>Listing 1</strong> the <code>cd</code> command has the unwanted side effect of switching to the user&#8217;s home directory if the user hasn&#8217;t supplied a command line argument. If you&#8217;re using <code>cd</code> multiple times throughout the script, you could write a function that extends <code>cd</code>&#8217;s functionality.  <strong>Listing 20</strong> shows an example of this.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 20</strong></em></p>
<code>
#!/bin/bash - 

function cdext {
    # We want to make sure that the user gave an argument
    if [ $# -eq 1 ]
    then
        cd "$1"
    else
        echo "You must supply a directory to change to."
        exit 1
    fi
}

# This should succeed
cdext /tmp

# Make sure that it did succeed
pwd

# This should fail with our warning
cdext
</code>
</pre>

<p class="single">
I first use the shell&#8217;s built-in <code>#</code> variable to make sure that the user has specified a single argument. It would probably also be a good idea to add a separate <code>elif</code> branch to warn the user when they supply too many arguments. If the user supplied the single argument, the function uses <code>cd</code> to change to that directory and we make sure it worked correctly with the <code>pwd</code> command. If the user didn&#8217;t supply a command line argument, we warn them of their error and exit the script. This simple function adds an extra restriction to the <code>cd</code> command&#8217;s usage to help make your script safer.
</p>
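
<p class="single">
As a rough sketch of the extra check mentioned above, the variant below adds an <code>elif</code> branch so that &#8220;no argument&#8221; and &#8220;too many arguments&#8221; get separate messages. It also quotes <code>$1</code> and uses <code>return</code> instead of <code>exit</code> so that the caller decides what happens next. Those two changes are assumptions about what you might want and are not part of <strong>Listing 20</strong>.
</p>

```shell
#!/bin/bash -

# Hypothetical variant of cdext that tells "no argument" apart from
# "too many arguments" and returns a status instead of exiting.
function cdext {
    if [ $# -eq 1 ]
    then
        # Quote the argument so that paths with spaces survive
        cd "$1"
    elif [ $# -eq 0 ]
    then
        echo "You must supply a directory to change to."
        return 1
    else
        echo "Too many arguments; supply exactly one directory."
        return 1
    fi
}

# This should succeed
cdext /tmp

# Both of these should fail and leave the current directory alone
cdext || echo "cdext failed with status $?"
cdext /tmp /var || echo "cdext failed with status $?"
```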

<p class="single">
To make the most of this technique you need to understand what types of things can go wrong with a command. Make sure that you&#8217;ve learned enough about the command, through resources like the man page, to handle the potential errors properly.
</p>

<h3>&#8220;Scrubbing&#8221; Error Output</h3>
<p class="single">
What I mean by scrubbing in this instance is searching through the error output from a command looking for patterns. That pattern could be something like &#8220;file not found&#8221; or &#8220;file or directory does not exist&#8221;. Essentially what you&#8217;re doing is looking through the command&#8217;s output trying to find a string that will give you specific information about what error occurred. This method tends to be very brittle, meaning that the slightest change in the output can break your script. For this reason I don&#8217;t recommend this method, but in some cases it may be your only choice to gather more specific information about a command&#8217;s error condition. One method to make this technique slightly more robust would be to use regular expressions and case insensitivity. In <strong>Listing 21</strong> I&#8217;ve provided a very simple example of output scrubbing.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 21</strong></em></p>
<code>
$ls doesnotexist 2>&#038;1 | grep -i "file not found"
$ls doesnotexist 2>&#038;1 | grep -i "no such"
ls: cannot access doesnotexist: No such file or directory
</code>
</pre>

<p class="single">
Notice that I&#8217;m using the <code class="optionsonly">-i</code> option of <code>grep</code> to make it case insensitive. I&#8217;m also redirecting both stdout and stderr into the pipe with the <code>2>&#038;1</code> statement. That way I can search all of the command&#8217;s messages, errors, and warnings looking for the pattern of interest. In the first search statement I look for the pattern &#8220;file not found&#8221;, which is not a statement found in the <code>ls</code> command&#8217;s output. When I search for the statement &#8220;no such&#8221;, I get the line of output that contains the error. You could push this example a lot further with the use of regular expressions, but even if you&#8217;re very careful a simple change to the command&#8217;s output by the developer could leave your script broken. I would suggest filing this technique away in your memory and using it only when you&#8217;re sure there&#8217;s not a better way to solve the problem.
</p>
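
<p class="single">
If you do have to scrub output, a slightly sturdier sketch is shown below. The <code>classify_error</code> function name and its patterns are made up for illustration, and setting <code>LC_ALL=C</code> assumes that the command honors the locale settings when it prints its messages.
</p>

```shell
#!/bin/bash -

# Hypothetical helper: classify the error text that a command printed.
# Takes the captured output as its only argument.
function classify_error {
    # -q only sets the exit status, -i ignores case, and -E enables
    # extended regular expressions so one pattern covers several wordings
    if echo "$1" | grep -qiE "no such file|not found|does not exist"
    then
        echo "missing"
    elif echo "$1" | grep -qiE "permission denied"
    then
        echo "permission"
    else
        echo "unknown"
    fi
}

# LC_ALL=C asks for untranslated messages, which removes one source of
# variation in the text being searched
OUTPUT=$(LC_ALL=C ls doesnotexist 2>&1)
classify_error "$OUTPUT"
```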

<h3>Being A Good Linux/UNIX Citizen</h3>
<p class="single">
There are some signals that we need to take extra care in dealing with, such as <code>SIGINT</code>. With <code>SIGINT</code> all processes in the foreground see the signal, but the innermost (foremost) child process decides what will be done with the signal. The problem with this is that if the innermost process just absorbs the <code>SIGINT</code> signal and doesn&#8217;t act on it and/or send it on up to its parent, the user will be unable to exit the program/script with the Ctrl-C key combination. There are a few applications that trap this signal intentionally, which is fine, but doing this on your own can lead to unpredictable behavior and is what I would consider to be an undesirable practice. Try to avoid this in your own scripts unless you have a compelling reason to do otherwise and understand the consequences. To get around this issue we&#8217;ll propagate signals like <code>SIGINT</code> up the process stack to give the parent(s) a chance to react to them.
</p>

<p class="single">
One way of handling signal propagation is shown in <strong>Listing 22</strong> where I&#8217;ve assumed that the shell is the direct parent of the script.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 22</strong></em></p>
<code>
#!/bin/bash - 

function int_handler {
    echo "SIGINT Caught"

    # Re-send the signal ($$ is the PID of the shell running this script)
    kill -s SIGINT $$

    # 130 is the exit status from Ctrl-C/SIGINT
    exit 130
}

# Our trap to handle SIGINT/Ctrl-C
trap 'int_handler' INT

while true
do
    :
done
</code>
</pre>

<p class="single">
First of all, don&#8217;t get caught up in the <code>trap</code> statement if you don&#8217;t already know what it is. We&#8217;ll talk about traps shortly. This script busy waits in a while loop until the user presses Ctrl-C or something else sends it the <code>SIGINT</code> signal. When this happens the handler uses the <code>kill</code> command to re-send <code>SIGINT</code> (note that <code>$$</code> in the line <code class="commandline">kill -s SIGINT $$</code> expands to the process ID of the shell that is running the script, not the ID of its parent), and then exits with the exit status that conventionally indicates a forced exit due to <code>SIGINT</code>. This way the parent shell can examine the exit status of our script to see what happened and decide what it wants to do. Our script handles the signal and then gives everyone above it the chance to do the same.
</p>

<h3>Error Handling Functions</h3>
<p class="single">
Since you&#8217;re most likely going to be using error handling code in multiple places in your script, it can be helpful to separate it out into a function. This keeps your script clean and free of duplicate code. <strong>Listing 23</strong> shows one of the many ways of using a function to encapsulate some simple error handling functionality.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 23</strong></em></p>
<code>
#!/bin/bash - 

function err_handler {
    # Check to see which error code we were given
    if [ $1 -eq 1001 ]; then
        echo "Non-Fatal Error #1 Has Occurred"
        # We don't need to exit here
    elif [ $1 -eq 1002 ]; then
        echo "Fatal Error #2 Has Occurred"
        exit 1 # Error was fatal so exit with non-zero status
    fi
}

# Notice that I'm using my own made up error codes (1001, 1002)
err_handler 1001
err_handler 1002
</code>
</pre>

<p class="single">
Notice that I made up my own error codes (1001 and 1002). These have no correlation to any exit status of any of the commands that my script would use; they&#8217;re just for my own use. Using codes in this way keeps me from having to pass long error description strings to my function, and thus saves typing, space, and clutter in my code. The drawback is that someone modifying the script later (maybe years later) can&#8217;t just glance at a line of code (<code class="commandline">err_handler 1001</code>) and know what error it is referring to. You could help lessen this problem by placing error code descriptions in the comments at the top of your script. When I run the script in <strong>Listing 23</strong> I get the output in <strong>Listing 24</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 24</strong></em></p>
<code>
$./err_handler.sh 
Non-Fatal Error #1 Has Occurred
Fatal Error #2 Has Occurred
$
</code>
</pre>
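
<p class="single">
Another way to soften the &#8220;what does 1001 mean?&#8221; drawback is to give each code a name and keep the messages in a single lookup function. The sketch below is one possible arrangement. The names <code>E_NONFATAL_DEMO</code>, <code>E_FATAL_DEMO</code>, and <code>err_message</code> are made up for the example, and the handler returns a status instead of exiting so that the caller decides what to do.
</p>

```shell
#!/bin/bash -

# Hypothetical names for the made-up codes so call sites explain themselves
E_NONFATAL_DEMO=1001
E_FATAL_DEMO=1002

# Translate an error code into a message in one place
function err_message {
    case "$1" in
        "$E_NONFATAL_DEMO") echo "Non-Fatal Error #1 Has Occurred" ;;
        "$E_FATAL_DEMO")    echo "Fatal Error #2 Has Occurred" ;;
        *)                  echo "Unknown Error ($1) Has Occurred" ;;
    esac
}

# Report the error and return non-zero only for the fatal code
function err_handler {
    err_message "$1"
    [ "$1" -ne "$E_FATAL_DEMO" ]
}

err_handler "$E_NONFATAL_DEMO"
err_handler "$E_FATAL_DEMO" || echo "Caller would exit with status 1 here"
```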

<h3>Introducing The trap Command</h3>
<p class="single">
The <code>trap</code> command allows you to associate a section of code with a particular signal (see <strong>Listing 5</strong>), so that when the signal is seen by the shell the code is run. The shell essentially sets up a signal handler for the signal associated with the trap. This can be very handy to allow you to correct for errors, log what happened, or remove things like temporary files before your script exits. These things highlight one of the downsides to using <code>kill -9</code> because <code>SIGKILL</code> is one of the two signals that can&#8217;t be trapped. If you use <code>SIGKILL</code>, the process that you&#8217;re killing won&#8217;t get a chance to clean up after itself before exiting. That could leave things like temporary files and stale file locks around to cause problems later. It&#8217;s better to use <code>SIGTERM</code> to end a process because it gives the process a chance to clean up.
</p>
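
<p class="single">
As a small illustration of the clean-up case, the sketch below removes a scratch file from an <code>EXIT</code> trap. The <code>SCRATCH</code> variable and <code>remove_scratch</code> function are made-up names, and the example assumes that the <code>mktemp</code> utility is installed. The trap runs when the script exits, but it cannot run after a <code>kill -9</code>.
</p>

```shell
#!/bin/bash -

# Create a scratch file (mktemp is assumed to be available)
SCRATCH=$(mktemp)

# Remove the scratch file; -f keeps rm quiet if it is already gone
function remove_scratch {
    rm -f "$SCRATCH"
}

# Run the clean up whenever the script exits
trap 'remove_scratch' EXIT

echo "working data" > "$SCRATCH"
echo "Scratch file in use: $SCRATCH"
```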

<p class="single">
<strong>Listing 25</strong> shows a couple of ways to use the <code>trap</code> command in a script.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 25</strong></em></p>
<code>
#!/bin/bash - 

function exit_handler {
    echo "Script Exiting"
}

trap "echo Ctrl-C Caught; exit 0" int

trap 'exit_handler' EXIT

while true
do
    :
done
</code>
</pre>

<p class="single">
Notice that I first use a semi-colon separated list of commands with <code>trap</code> to catch the <code>SIGINT</code> (Ctrl-C) signal. While this particular implementation is bad design because it doesn&#8217;t propagate <code>SIGINT</code>, it allows me to keep the example simple. The <code class="commandline">exit 0</code> statement is what causes the second trap that&#8217;s watching for the <code>EXIT</code> condition to be triggered. This second trap uses a function instead of a semi-colon separated list of commands. This is a cleaner way to handle traps that promotes code reuse, and except in simple cases should probably be your preferred method. Notice the form of the <code>SIGINT</code> specifier that I use at the end of the first trap statement. I use <code>int</code> because the prefix <code>SIG</code> is not required, and the signal declaration is not case sensitive. The same applies when using signals with commands like <code>kill</code> as well. You&#8217;re also not limited to specifying one signal per trap. You can append a list of signal specifiers onto the end of the <code>trap</code> statement and each one will use the error handling code specified within the trap. 
</p>
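
<p class="single">
To illustrate that last point, the sketch below attaches one handler to three signals with a single <code>trap</code> statement and then lists what was registered. The <code>on_signal</code> function is a made-up name for the example.
</p>

```shell
#!/bin/bash -

function on_signal {
    echo "Termination requested, cleaning up"
}

# One statement, three signals. Lower case and the optional SIG prefix
# are both accepted.
trap 'on_signal' int SIGTERM HUP

# Show what is now registered for each of the three signals
trap -p INT TERM HUP
```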

<p class="single">
One tip to be aware of is that you can specify the signals by their numeric representation, but I would advise against it. Using their symbolic representation tells anyone looking at your script (which could even be you years from now) at a glance which signal you&#8217;re using. There&#8217;s no chance for misinterpretation, and symbolic signals are more portable than just specifying a signal number since numbers tend to vary more by platform. 
</p>

<p class="single">
The output from running the script in <strong>Listing 25</strong> and hitting Ctrl-C is shown in <strong>Listing 26</strong>. Notice that the <code>SIGINT</code> trap is processed before the <code>EXIT</code> trap. This is the expected behavior because the traps for all other signals should be processed before the <code>EXIT</code> trap.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 26</strong></em></p>
<code>
$./trapuse.sh
^CCtrl-C Caught
Script Exiting
$
</code>
</pre>

<p class="single">
There are four signal specifiers that you&#8217;re probably going to be most interested in when using traps and they are <code>INT</code>, <code>TERM</code>, <code>EXIT</code>, and <code>ERR</code>. All of these have been touched on so far except for <code>ERR</code>. If you remember from above, you could use <code class="commandline">set -o errexit</code> to cause the shell to exit on an error. This was great from the standpoint that it kept your script from running after a potentially dangerous error had occurred, but kept you from handling the error yourself. Setting a trap using the <code>ERR</code> signal specifier takes care of this shortcoming. <code>ERR</code> is not a real signal; the shell triggers it under the same conditions that cause an exit with <code>errexit</code>, so you can use a <code>trap</code> statement to do any clean up or error correction before exiting. <code>ERR</code> does have the limitation that the trap is not run if the failing command is part of an <code>&amp;&amp;</code> or <code>&#124;&#124;</code> list (other than the last command in the list), is anywhere but last in a pipeline, is part of an <code>if</code> statement test, is part of the test of a <code>while</code> or <code>until</code> statement, or has its exit status inverted by an <code>!</code>. On older versions of BASH command substitutions <code>$(...)</code> that fail may not be caught by a <code>trap</code> statement either.
</p>
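
<p class="single">
The short sketch below shows an <code>ERR</code> trap along with a few of the cases where it stays silent. The <code>err_report</code> function and <code>ERR_COUNT</code> variable are made-up names, and the script assumes that <code>errexit</code> is not turned on.
</p>

```shell
#!/bin/bash -

ERR_COUNT=0

# Record and report each failure that the ERR trap sees
function err_report {
    ERR_COUNT=$((ERR_COUNT + 1))
    echo "Command failed near line $1 with status $2"
}

# Single quotes delay the expansion of LINENO and ? until the trap runs
trap 'err_report $LINENO $?' ERR

false                   # plain failure: the ERR trap runs
if false; then :; fi    # failure in an if test: no trap
false || true           # failure that is not last in a || list: no trap
! false                 # exit status inverted with !: no trap

echo "ERR trap ran $ERR_COUNT time(s)"
```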

<p class="single">
You can reset traps back to their original conditions before they were associated with commands using the <code>-</code> command specifier. For example, in the script in <strong>Listing 25</strong> you could add the line <code class="commandline">trap - SIGINT</code> after which the code for the <code>SIGINT</code> trap would no longer be called when the user hits Ctrl-C. You can also cause the shell to ignore signals by passing a null string as a signal specification as in <code class="commandline">trap ""  SIGINT</code> . This would cause the shell to ignore the user whenever they press the Ctrl-C key combination. This is not recommended though as it makes it harder for the user to terminate the process. It&#8217;s a better practice to do our clean up and then propagate the signal in the way that we talked about earlier. A handy trick is that you can simulate the functionality of the <code>nohup</code> command with a line like <code class="commandline">trap "" SIGHUP</code> . What this does is cause your script to ignore the HUP (Hangup) signal so that it will keep running even after you&#8217;ve logged out.
</p>
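
<p class="single">
Here is a minimal sketch of both forms. It ignores <code>SIGHUP</code>, sends that signal to itself to show that nothing happens, and then sets and resets a <code>SIGINT</code> trap. Sending the signal with <code>kill</code> is just a stand-in for a real hangup.
</p>

```shell
#!/bin/bash -

# Ignore SIGHUP from here on, much like running under nohup
trap "" HUP

# This would normally end the script; now the signal is dropped
kill -s HUP $$
echo "Still running after SIGHUP"

# Install a handler for SIGINT, then put the original behavior back
trap 'echo "Ctrl-C Caught"' INT
trap - INT

# The INT handler is gone, but the ignored HUP is still listed
trap -p INT HUP
```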

<p class="single">
If you run <code>trap</code> by itself without any arguments, it outputs the traps that are currently set. Using the <code class="optionsonly">-p</code> option with <code>trap</code> causes the same behavior. You can also supply signal specifications (<code class="commandline">trap -p INT EXIT</code>) and <code>trap</code> will output only the commands associated with those signals. This output can be redirected and stored, and with a little bit of work read back into a script to reinstate the traps later. <strong>Listing 27</strong> shows two lines of output from the addition of the line <code class="commandline">trap -p</code> to the script in <strong>Listing 25</strong> just before the while loop.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 27</strong></em></p>
<code>
trap -- 'exit_handler' EXIT
trap -- 'echo Ctrl-C Caught; exit 0' SIGINT
</code>
</pre>
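
<p class="single">
Because each line of that output is itself a valid <code>trap</code> command, one way to store and reinstate a trap is to capture the line in a variable and run it again later with <code>eval</code>. The sketch below assumes that a trap is already set for <code>INT</code>, and the <code>SAVED_TRAP</code> variable is a made-up name.
</p>

```shell
#!/bin/bash -

trap 'echo "Ctrl-C Caught"' INT

# Save the current INT trap as a ready-to-run trap command
SAVED_TRAP=$(trap -p INT)

# Temporarily ignore Ctrl-C around a critical step
trap "" INT
echo "Critical section: Ctrl-C is ignored here"

# Reinstate the saved trap by running the stored "trap -- ..." line.
# If no trap had been set, SAVED_TRAP would be empty and this would
# leave the signal ignored.
eval "$SAVED_TRAP"
trap -p INT
```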

<p class="single">
Even with all the information that I&#8217;ve given you on the <code>trap</code> command, there&#8217;s still more information to be had. I&#8217;ve tried to hit the highlights that I think will be most useful to you. You can open the BASH man page and search for &#8220;trap&#8221; if you want to dig deeper.
</p>

<p><a name="howto"></a></p>
<h3>How-To</h3>
<p class="single">
In this section I&#8217;m going to use a few of the different methods that we&#8217;ve discussed to fix the script in <strong>Listing 1</strong>. The goal is to protect the user from unexpected behavior such as having everything in their home directory deleted. I won&#8217;t cover every single way of solving the problem; instead I&#8217;ll be integrating a few of the topics we&#8217;ve covered into one script to show some practical applications. It&#8217;s my hope that by this point in the post you&#8217;re starting to see your own solutions and will be able to build on (and/or simplify) what I do here.
</p>

<p class="single">
If you look at <strong>Listing 28</strong> I&#8217;ve added the <code class="optionsonly">-u</code> option to the shebang line of the script, and also added a check to make sure that the directory exists before changing to it.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 28</strong></em></p>
<code>
#!/bin/bash -u

if [ ! -d "$1" ];then
    echo "Please provide a valid directory."
    exit 1
fi

cd "$1"
rm -rf *
</code>
</pre>

<p class="single">
<strong>Listing 29</strong> shows what happens when I make a couple of attempts at running the script incorrectly.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 29</strong></em></p>
<code>
$./l1cor_1.sh 
./l1cor_1.sh: line 3: $1: unbound variable
$./l1cor_1.sh /doesnotexist
Please provide a valid directory.
</code>
</pre>

<p class="single">
The <code class="optionsonly">-u</code> option causes the <code class="error">unbound variable</code> error because <code>$1</code> will not be set if the user doesn&#8217;t supply at least one command line argument. The <code>if</code>/<code>test</code> statement declares that if the directory does not exist we will give the user an error message and then exit. There are also other checks that you could add to <strong>Listing 28</strong> including one to make sure that the directory is writable by the current user. Ultimately you decide which checks are necessary, but the end goal with this particular example is to make sure that any dangerous behavior is avoided.
</p>
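
<p class="single">
As a sketch of how those extra checks might be gathered in one place, the function below tests the argument count, that the argument is a directory, and that the current user can enter and modify it. The <code>check_dir</code> name and the wording of the messages are assumptions made for the example.
</p>

```shell
#!/bin/bash -

# Hypothetical helper that collects the checks discussed above.
# Returns 0 only if exactly one argument names a directory that the
# current user can enter and modify.
function check_dir {
    if [ $# -ne 1 ]
    then
        echo "Please provide exactly one directory."
        return 1
    elif [ ! -d "$1" ]
    then
        echo "Please provide a valid directory."
        return 1
    elif [ ! -w "$1" ] || [ ! -x "$1" ]
    then
        echo "You do not have permission to modify $1."
        return 1
    fi
}

check_dir /doesnotexist || echo "Check failed, nothing was deleted"
```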

<p class="single">
<strong>Listing 28</strong> still has a problem because the <code>rm</code> command will run even if the <code>cd</code> command has thrown an error (like <code class="error">Permission denied</code>). To fix this I&#8217;m going to rearrange the <code>cd</code> and <code>rm</code> commands into a command sequence using the <code>&amp;&amp;</code> operator, and then check the exit status of the sequence. You can see these changes in <strong>Listing 30</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 30</strong></em></p>
<code>
#!/bin/bash -u

if [ ! -d "$1" ];then
    echo "Please provide a valid directory."
    exit 1
fi

cd "$1" &#038;&#038; rm -rf *

if [ $? -gt 0 ];then
    echo "An error occurred during the cd/rm process."
    exit 1
fi
</code>
</pre>

<p class="single">
The double ampersand (<code>&amp;&amp;</code>) will cause the command sequence to stop if the <code>cd</code> command fails, so the <code>rm</code> command is never run. I do this to catch any of the other errors that can occur with the <code>cd</code> command. If there&#8217;s an unknown error with the <code>cd</code> command, we don&#8217;t want <code>rm</code> to delete all of the files/directories in the current directory. Remember that <code>$?</code> only holds the exit status of the last command that ran in the sequence, which doesn&#8217;t tell me whether it was <code>cd</code> or <code>rm</code> that failed. As a workaround I&#8217;ll check to see if the <code>rm</code> command succeeded in the next step where I set a trap on the <code>EXIT</code> signal. I&#8217;ve added the <code>trap</code> statement and a function to use with the trap in <strong>Listing 31</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 31</strong></em></p>
<code>
#!/bin/bash -u

# A final check to let the user know if this script failed 
# to perform its primary function - deleting files
function exit_handler {
    # Count the number of lines (files/dirs) in the directory
    DIR_ENTRIES=$(ls "$1" | wc -l)

    # If there are still files in there throw an error message
    if [ $DIR_ENTRIES -gt 0 ];then
        echo "Some files/directories were not deleted"
        exit 1
    fi
}

# We want to check one last thing before exiting
trap 'exit_handler $1' EXIT

# If the directory doesn't exist, warn the user
if [ ! -d "$1" ];then
    echo "Please provide a valid directory."
    exit 1
fi

# Don't execute rm unless cd succeeds and suppress messages
cd "$1" &#038;> /dev/null &#038;&#038; rm -rf * &#038;> /dev/null

# If there was an error with cd or rm, warn the user
if [ $? -gt 0 ];then
    echo "An error occurred during the cd/rm process."
    exit 1
fi
</code>
</pre>

<p class="single">
I&#8217;m not saying that this is the most efficient way to solve this problem, but it does show you some interesting uses of the techniques we&#8217;ve talked about. I went ahead and suppressed the messages from <code>cd</code> and <code>rm</code> so that I could substitute my own. This is done with the <code class="commandline">&amp;&gt; /dev/null</code> additions to the command sequence. I also added the <code class="commandline">trap 'exit_handler $1' EXIT</code> line to the script, which sets a trap for the <code>EXIT</code> signal and uses the <code>exit_handler</code> function to handle the event. Notice the use of single quotes around the <code class="commandline">'exit_handler $1'</code> argument to <code>trap</code>. This keeps the <code>$1</code> variable reference from being expanded until the trap is called. We need that variable so that our exit handler can check the directory to make sure that all the files and directories were deleted. For our purposes the example script is now complete and does a reasonable job of protecting the user, but there is plenty of room for improvement. Tell us how you would change <strong>Listing 31</strong> to make it better and/or simpler in the comments section of this post.
</p>
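
<p class="single">
To get that discussion started, here is one possible simplification, offered as a sketch and not as the definitive fix. It folds the checks into a single function, runs <code>cd</code> and <code>rm</code> in a subshell so that the caller&#8217;s working directory never changes, and does the final check right away instead of in an <code>EXIT</code> trap. The <code>empty_dir</code> and <code>DEMO_DIR</code> names are made up, the demonstration assumes that <code>mktemp</code> is installed, and like the original it leaves hidden files alone.
</p>

```shell
#!/bin/bash -

# Hypothetical single-function version of the script's logic.
# Empties the directory given as the only argument.
function empty_dir {
    if [ $# -ne 1 ] || [ ! -d "$1" ]
    then
        echo "Please provide a valid directory."
        return 1
    fi

    # The subshell keeps the caller's working directory unchanged, and
    # rm is never reached unless cd succeeded
    (
        cd "$1" 2> /dev/null || exit 1
        rm -rf -- ./* 2> /dev/null
    ) || {
        echo "An error occurred during the cd/rm process."
        return 1
    }

    # Final check, as in the exit handler: is anything still listed?
    if [ -n "$(ls "$1")" ]
    then
        echo "Some files/directories were not deleted"
        return 1
    fi
}

# Try it on a throwaway directory
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/one" "$DEMO_DIR/two"
empty_dir "$DEMO_DIR"
rmdir "$DEMO_DIR"
```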

<h3>Tips and Tricks</h3>
<ul>
<li>You can sometimes use options with your commands to make them more fault tolerant. For instance the <code class="optionsonly">-p</code> option of <code>mkdir</code> automatically creates the parents of the directory you specify if they don&#8217;t already exist. This keeps you from getting a <code class="error">No such file or directory</code> error. Just make sure the options you use don&#8217;t introduce their own new problems.</li>
<li>It&#8217;s usually a good idea to enclose variables in quotation marks, especially the <code>@</code> variable. Doing this ensures that your script can better handle spaces in filenames, paths, and arguments. So, doing something like <code class="commandline">echo "$@"</code> instead of <code class="commandline">echo $@</code> can save you some trouble.</li>
<li>You can lessen your chances of leaving a file (like a system configuration file) in an inconsistent state if you make changes to a copy of the file and then use the <code>mv</code> command to put the altered file in place. Since <code>mv</code> typically only changes the directory information for the file and doesn&#8217;t move any bits when the copy and the original are on the same filesystem, the changeover is much faster so it&#8217;s less likely that another program will try to access the file in the time the change is being made. There are a few subtle issues to be aware of when using this method though. Have a look at David Pashley&#8217;s article (link #2) in the <a href="#resources">Resources</a> section for more details.</li>
<li>You can use parameter expansion (<code>${...}</code>) to avoid the null/unset variable problem that you see in <strong>Listing 1</strong>. Using a line like <code class="commandline">cd ${1:?"A directory to change to is required"}</code> would display the phrase &#8220;A directory to change to is required&#8221; and exit the script if the user didn&#8217;t provide the command line argument represented by <code>$1</code>. When used inside a script, the line gives you error output similar to <code class="error">./expansion.sh: line 3: 1: A directory to change to is required</code>.</li>
<li>When you&#8217;re accepting input from a user, you can make your script more forgiving by using regular expressions and the case insensitive options of your commands. For instance, use the <code class="optionsonly">-i</code> option of <code>grep</code> so that your script will not care whether it matches &#8220;Yes&#8221; or &#8220;yes&#8221;. With a regular expression, you could be as vague as <code>^[yY].*</code> to match &#8220;y&#8221;, &#8220;Y&#8221;, &#8220;ya&#8221;, &#8220;Ya&#8221;, &#8220;Yeah&#8221;, &#8220;yeah&#8221;, &#8220;yes&#8221;, &#8220;Yes&#8221; and many other entries that begin with an upper/lower case &#8220;y&#8221; and have 0 or more letters that come after it.</li>
<li>Always check to make sure that you got the expected number of command line arguments before going any further in your script. If possible, also check the arguments to make sure that they&#8217;re what you expect (i.e. that a phone number wasn&#8217;t given for a directory name).</li>
<li>To avoid introducing portability errors when writing scripts for the Bourne Shell (<code>sh</code>), you can use the <code>checkbashisms</code> program from the <code>devscripts</code> package. This program will check to make sure that you don&#8217;t have any BASH specific statements in your Bourne Shell script.</li>
<li>Don&#8217;t catch an error on a low level inside your script and not pass it back up the stack to the parent. This can cause your program to behave in a non-standard (non-Unix) way.</li>
<li>If you have a script that runs in the background, it can create a predefined file and redirect output to it so that you can see what/when/how/why your script exited.</li>
<li>If you use file locks in your scripts, you&#8217;ll want to check for dead/stale file locks each time your script starts. This is because a user may have issued a <code class="commandline">kill -9</code> (<code>SIGKILL</code>) command on your script, which doesn&#8217;t give your script a chance to clean up its lock files. If you don&#8217;t check for stale/dead locks, your user could end up having to remove the locks themselves manually, which is definitely not ideal.</li>
<li>When you have a script that is processing a large amount of data/files, you can use <code>trap</code> to keep track of where your script was in the event of an unexpected exit. One way to do this would be to <code>echo</code> a filename into a predefined file when the trap is triggered. You can then read the start location back into the script when it starts up again and resume where you left off. If there&#8217;s a really large amount of data and you need to make sure your script keeps its place, you should probably already be continuously tracking the progress as part of the processing loop and using the trap(s) as a fallback.</li>
</ul>
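
<p class="single">
A few of the tips above are easier to see in action. The sketch below shows what quoting <code>$@</code> changes, a forgiving &#8220;yes&#8221; match, and a check on the number of arguments. All of the function names are made up for the example.
</p>

```shell
#!/bin/bash -

# Report how many arguments were received
function count_args {
    echo $#
}

# Pass the arguments along with and without quotes around $@
function forward_quoted {
    count_args "$@"
}
function forward_unquoted {
    count_args $@
}

# Succeed if the reply starts with an upper or lower case "y"
function is_yes {
    echo "$1" | grep -qi "^y"
}

# Succeed only when the expected number of arguments was given.
# The first argument is the expected count, the rest are the arguments.
function have_args {
    local EXPECTED=$1
    shift
    [ $# -eq "$EXPECTED" ]
}

forward_quoted "my file.txt" notes.txt      # prints 2
forward_unquoted "my file.txt" notes.txt    # prints 3

is_yes "Yeah" || echo "That was not a yes"
have_args 1 /tmp /var || echo "Expected exactly 1 argument"
```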

<p><a name="scripting"></a></p>
<h3>Scripting</h3>
<p class="single">
In this scripting section I&#8217;m going to create a script that we can source to add ready-made error handling functions to other scripts. You will also see a couple of conceptual additions such as the use of code blocks in an attempt to streamline sections of code. <strong>Listing 32</strong> shows the modular script that you can source, and <strong>Listing 33</strong> shows it in use.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 32</strong></em></p>
<code>
#!/bin/bash -u
# File: error_source.sh
# Holds functions that can be used to more easily add error handling
# to your scripts.
# The -u option in the shebang line above causes the shell to throw
# an error whenever a variable is unset.

# Define our handlers for errors and/or forced exits
trap 'fatal_err $LINENO 1001' ERR #Handle uncaught errors
trap 'clean_up; exit' HUP TERM #Clean up and exit on SIGHUP or SIGTERM
trap 'clean_up; propagate' INT #Clean up after and propagate SIGINT
trap 'clean_up' EXIT #Clean up last thing before we exit

PROGNAME=$(basename "$0") #Error source program name
TEMPFILES=( ) #Array holding temp files to remove on script exit

# This function steps through each pipe section's exit status to see if 
# there was an error anywhere. Takes as arguments the line number that's
# being checked and a space separated list of the pipe's exit statuses.
function check_pipe {
    # We want to see if there was an error somewhere in the pipeline
    for PIPEPART in $2
    do
        # There was an error at the current part of the pipeline
        if [ "$PIPEPART" != "0" ]
        then
            nonfatal_err $1 1002
            return 0 #We don't need to step through the rest
        fi
    done
}

# Function that gets rid of things like temp files before an exit.
function clean_up {
    # We want to remove all of the temp files we created
    for TFILE in "${TEMPFILES[@]}"
    do  
        # If the file doesn't exist, skip it
        [ -e "$TFILE" ] || continue
       
        # Notice the use of a code block to streamline this check     
        {
            # If you use -f, errors are ignored
            rm --interactive=never "$TFILE" &#038;> /dev/null
        } || nonfatal_err $LINENO 1001
    done
}

# Function to create "safe" temporary files which we'll get into more in the
# next blog post on security.
function create_temp {
    # Give preference to user tmp directory for security
    if [ -e "$HOME/tmp" ]
    then
        TEMP_DIR="$HOME/tmp"
    else
        TEMP_DIR="/tmp"
    fi

    # Construct a "safe" temp file name
    TEMP_FILE="$TEMP_DIR"/"$PROGNAME".$$.$RANDOM

    # Keep the file in an array to remove it later
    TEMPFILES+=( "$TEMP_FILE" )

    {
        touch "$TEMP_FILE" &#038;> /dev/null
    } || fatal_err $LINENO "Could not create temp file $TEMP_FILE"
}

# Function that handles telling the user about critical errors that
# force an exit. It takes 2 arguments, a line number near where the
# error occurred, and an error code / message telling what happened.
function fatal_err {
    # Call function that will clean up temp files
    clean_up

    printf "Near line %s in %s: " "$1" "$PROGNAME"

    # Check to see if the supplied error matches any predefined codes
    if [ "$2" == "1001" ];then
        printf "There has been an unknown fatal error.\n"
    # A custom error message has been specified by the caller
    else
        printf "%s\n" "$2"
    fi

    # We don't want to continue running with a fatal error
    exit 1
}

# Function that handles telling the user about non-critical errors 
# that don't force an exit. It takes 2 arguments, a line number near
# where the error occurred, and an error code / message telling what
# happened.
function nonfatal_err {
    printf "Near line %s in %s: " "$1" "$PROGNAME"

    # Check to see if the supplied error matches any predefined codes
    if [ "$2" == "1001" ];then
        printf "Could not remove temp file.\n"
    elif [ "$2" == "1002" ];then
        printf "There was an error in a pipe.\n"
    elif [ "$2" == "1003" ];then
        printf "A file you tried to access doesn't exist.\n"
    # A custom error message has been specified by the caller
    else
        printf "%s\n" "$2"
    fi 
}

# Function that handles propagating the SIGINT signal up to the parent
# process, which in this case is assumed to be the shell.
function propagate {
    echo "Caught SIGINT"

    # Re-send the signal ($$ is the PID of the shell running the script)
    kill -s SIGINT $$

    # 130 is the exit status from Ctrl-C/SIGINT
    exit 130
}
</code>
</pre>

<p class="single">
<strong>Listing 32</strong> has 6 functions that are designed to handle various error related conditions. These functions are <code>check_pipe</code>, <code>create_temp</code>, <code>clean_up</code>, <code>propagate</code>, <code>fatal_err</code>, and <code>nonfatal_err</code>. The <code>check_pipe</code> function takes a list representing all the elements of the <code>PIPESTATUS</code> array variable, and steps through each item in the list to see if there was an error. If there was an error it throws a non-fatal error message, which could just as easily be a fatal error message that causes an exit. This makes it a little easier to check our pipes for errors without using <code class="commandline">set -o pipefail</code>. This function could easily be modified to tell you which part of the pipe failed as well.
</p>
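
<p class="single">
As a sketch of that last modification, the variant below reports the position and exit status of the first stage that failed. Unlike <code>check_pipe</code> it takes the <code>PIPESTATUS</code> values as separate arguments and does not call <code>nonfatal_err</code>; the <code>first_pipe_failure</code> name is made up for the example.
</p>

```shell
#!/bin/bash -

# Hypothetical variant of check_pipe: takes the PIPESTATUS values as
# separate arguments and reports the first stage that failed.
function first_pipe_failure {
    local POSITION=1
    local STATUS
    for STATUS in "$@"
    do
        if [ "$STATUS" -ne 0 ]
        then
            echo "Stage $POSITION of the pipe exited with status $STATUS"
            return 1
        fi
        POSITION=$((POSITION + 1))
    done
    return 0
}

# PIPESTATUS has to be read immediately after the pipeline finishes,
# because the next command overwrites it
true | false | true
first_pipe_failure "${PIPESTATUS[@]}"
```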

<p class="single">
The <code>create_temp</code> function automates the process of creating &#8220;safe&#8221; temporary files for us. It gives preference to the user&#8217;s <code>tmp</code> directory, and uses the system <code>/tmp</code> directory if the user&#8217;s is not available. We&#8217;ll talk more about temporary file safety in the next blog post on security. The path/name of the temp file created is added to a global array so that it will be easier to remove it later on exit. Notice the use of the code block around the <code>touch</code> command that creates the temp file. It might have been easier to leave the braces out and just put the <code>&#124;&#124;</code> right after the <code>touch</code> statement, but I felt that the code block helped streamline the code a little bit. The <code>&#124;&#124;</code> at the end of the code block causes our error handling code to be executed if there&#8217;s an error with the last command in the block.
</p>
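<p class="single">
For comparison, here is a minimal sketch of the same idea built on the <code>mktemp</code> command instead of <code>touch</code>. The function name <code>create_temp_sketch</code>, the <code>TEMP_FILES</code> array name, and the assumption that the user&#8217;s temp directory is <code>$HOME/tmp</code> are all mine, so treat this as an illustration rather than the code from <strong>Listing 32</strong>.
</p>

```shell
#!/bin/bash
# Hypothetical global array holding every temp file this script creates
TEMP_FILES=()

# Sketch of a create_temp-style helper that uses mktemp instead of touch.
# Assumes the user's private temp directory is $HOME/tmp.
create_temp_sketch()
{
    local dir="/tmp"
    if [ -d "${HOME:-}/tmp" ] && [ -w "${HOME:-}/tmp" ]
    then
        dir="${HOME:-}/tmp"
    fi

    local file
    # The || after the code block runs the error branch only if mktemp fails
    { file=$(mktemp "$dir/sketch.XXXXXX"); } || {
        echo "Could not create a temp file in $dir"
        return 1
    }

    # Remember the file so that it can be removed on exit
    TEMP_FILES+=("$file")
}

create_temp_sketch
echo "Created: ${TEMP_FILES[0]}"
rm -f "${TEMP_FILES[0]}"
```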

<p class="single">
The <code>clean_up</code> function steps through the file names in our array of temporary files and deletes them. This is meant to be called just before we exit the script so that we don&#8217;t leave any stray temp files lying around. The function checks to make sure that it doesn&#8217;t try to delete files that have already been removed. This prevents a warning from being displayed when an error calls <code>clean_up</code> and the exit that follows calls <code>clean_up</code> a second time. There are other ways to handle this type of problem, but for our purposes the &#8220;skip if already deleted&#8221; method works fine. The <code>propagate</code> function uses the <code>kill</code> command to resend the <code>INT</code> signal on up to the shell, and then uses the <code>exit</code> command to set the exit status of the script to 130. This tells anyone checking the <code>$?</code> built-in variable that the script exited because of <code>SIGINT</code>.
</p>
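<p class="single">
The &#8220;skip if already deleted&#8221; approach can be sketched in a few lines. The names <code>clean_up_sketch</code> and <code>TEMP_FILES</code> are placeholders that I&#8217;ve chosen for this example, not necessarily the ones used in <strong>Listing 32</strong>.
</p>

```shell
#!/bin/bash
# Hypothetical array of temp files, standing in for the one in the script
TEMP_FILES=()

# Sketch of a clean_up-style function using "skip if already deleted"
clean_up_sketch()
{
    local file
    for file in "${TEMP_FILES[@]}"
    do
        # Only remove files that still exist, so a second call stays silent
        if [ -e "$file" ]
        then
            rm "$file"
        fi
    done
}

# Demonstration: create one temp file, then call the function twice
TEMP_FILES+=("$(mktemp)")
clean_up_sketch
clean_up_sketch    # No warning here because the file is already gone

if [ -e "${TEMP_FILES[0]}" ]
then
    echo "File still present"
else
    echo "File removed"
fi
```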

<p class="single">
The <code>fatal_err</code> and <code>nonfatal_err</code> functions are very similar, with the only difference being that <code>fatal_err</code> calls the <code>clean_up</code> function and <code>exit</code> command when it runs. Both functions take two arguments: a line number and an error code or string. The line number is presumably the line near where the error occurred, but won&#8217;t be exact. It&#8217;s designed to get a shell script developer close enough to the error that they should be able to find it. The error code is a four-digit number that&#8217;s used in an <code>if</code> statement (a <code>case</code> statement would be a little cleaner here) to see what error message should be given to the user. The <code>else</code> part of the statement allows the caller to provide their own custom error string. This way the caller isn&#8217;t stuck if they can&#8217;t find a code that fits their situation. If the script was going to see widespread general use, it might be best to dump all of the error codes into a separate function that <code>fatal_err</code> and <code>nonfatal_err</code> could both call. That way you would have consistent and reusable error codes across all of the functions.
</p>
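<p class="single">
Here is one way that the <code>case</code> statement and the shared lookup function could look. This is only a sketch: the names <code>err_message</code> and <code>nonfatal_err_sketch</code> are hypothetical, and the only error code taken from the listings is 1003 (the code 1001 is a made-up placeholder).
</p>

```shell
#!/bin/bash
# Hypothetical shared lookup that both error functions could call.
# Only code 1003 appears in the listings; 1001 is a made-up placeholder.
err_message()
{
    case "$1" in
        1001) echo "There was an error in a pipe." ;;
        1003) echo "A file you tried to access doesn't exist." ;;
        # Anything else is treated as the caller's own custom message
        *)    echo "$1" ;;
    esac
}

# Sketch of a non-fatal handler built on the shared lookup
# $1 = line number, $2 = error code or custom string
nonfatal_err_sketch()
{
    echo "Near line $1 in ${0##*/}: $(err_message "$2")"
}

nonfatal_err_sketch 16 1003
nonfatal_err_sketch 30 "This is a custom error message."
```

<p class="single">
A fatal version would call the same <code>err_message</code> function and then clean up and exit, which keeps the code-to-message mapping in a single place.
</p>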

<p class="single">
To make sure that the functions are called properly, the script defines several traps at the top. The <code>ERR</code> pseudo-signal is used to catch any errors that we haven&#8217;t handled ourselves. These are treated as &#8220;unknown&#8221; fatal errors since we obviously didn&#8217;t see them coming. The <code>HUP</code> and <code>TERM</code> signals are trapped so that we have a chance to run our <code>clean_up</code> function before exiting. Keep in mind that the <code>KILL</code> signal cannot be trapped, so if somebody runs <code class="commandline">kill -9</code> on our script, we&#8217;re still going to be leaving temp files behind. The <code>INT</code> signal is trapped to give us a chance to clean up as well, but we also take the opportunity to propagate the signal up to the shell. That way we&#8217;re not just absorbing <code>SIGINT</code> and not allowing the world around us to react to it. The final trap is set on the <code>EXIT</code> condition and is our last chance to make sure that the temp files have been removed.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 33</strong></em></p>
<code>
#!/bin/bash -u
# File: err_src_test.sh
# Tests the modular error_source.sh script which holds error handling functions.

# Include the modular error handling script so that we can use its functions.
. error_source.sh

# Use our function to create a random "safe" temp file
create_temp

# Be proactive in checking for problems like a file that doesn't exist
if [ -e doesnotexist  ]
then
    ls doesnotexist
else
    nonfatal_err  $LINENO 1003    
fi

# Check a bad pipeline with a function we've created
true|false|true # Error not caught because of last true
PIPEST="${PIPESTATUS[@]}"
check_pipe $LINENO "$PIPEST"

# Check a good pipeline with the same function
true|true|true|true
PIPEST="${PIPESTATUS[@]}"
check_pipe $LINENO "$PIPEST"

# Generate a custom non-fatal error
nonfatal_err $LINENO "This is a custom error message."

# Generate an unhandled error
false

echo "The script shouldn't still be running here."
</code>
</pre>

<p class="single">
The <strong>Listing 33</strong> implementation shows just a few ways to use the modular error handling script in one of your own scripts. The first thing that the script does is source the <code>error_source.sh</code> script so that it is treated like a part of our own. Once that&#8217;s done, the error handling functions can be called as if we had typed them directly into our script. That&#8217;s why we can call the <code>create_temp</code> function. Normally we would do something with the temporary file path/name that is created, but in this case I only want to create a temp file that can be removed later by the <code>clean_up</code> function. Next, I proactively check whether a file/directory exists before I try to use it. If it doesn&#8217;t exist I throw a non-fatal error to warn the user. Normally you would want to throw a fatal error that would cause an exit here, but I want the script to fall all the way through to the last error so that the output in <strong>Listing 34</strong> will be a little cleaner. Ultimately with this error handling method it&#8217;s your call on whether or not the script should exit on an error, but I would suggest erring on the side of exiting rather than letting the script continue with a potentially dangerous error in place.
</p>

<p class="single">
The next section of <strong>Listing 33</strong> has code that checks a pipeline with an error (the <code>false</code> in the middle), and after that there&#8217;s a check of a pipeline with no errors. This is done using the <code>check_pipe</code> function that we wrote earlier. You can see that I&#8217;ve basically converted the <code>PIPESTATUS</code> array elements into a string list before passing that to <code>check_pipe</code>. The list works a little more cleanly in the <code>for</code> loop that&#8217;s used to check each part of the pipeline. 
</p>

<p class="single">
Next, I&#8217;ve shown how to generate your own custom error by passing the <code>nonfatal_err</code> function a string instead of an error code. A custom string should fail all of the tests in the <code>nonfatal_err</code> <code>if</code> construct, causing the <code>else</code> to be triggered. This gives us the ability to create compact error handling code in our own scripts using error codes, but still gives us the flexibility to throw errors that haven&#8217;t been defined yet. 
</p>

<p class="single">
The last interesting thing that the script does is use the <code>false</code> command to generate an unhandled error which is caught by the <code>ERR</code> trap. You can see that even if we miss handling an error manually, it still gets caught overall. The drawback is that although the user gets a line number for the error, they are given a message telling them that an unknown error has occurred, which doesn&#8217;t tell them very much. This is still preferable to letting your script run with an unhandled error though. The very last line of the script is just there to alert us that something very wrong has happened if our script reaches that point.
</p>

<p class="single">
<strong>Listing 34</strong> shows what happens when I run the script in <strong>Listing 33</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:490px;display:inline;"><em><strong>Listing 34</strong></em></p>
<code>
$ ./err_src_test.sh
Near line 16 in err_src_test.sh: A file you tried to access doesn't exist.
Near line 22 in err_src_test.sh: There was an error in a pipe.
Near line 30 in err_src_test.sh: This is a custom error message.
Near line 33 in err_src_test.sh: There has been an unknown fatal error.
</code>
</pre>

<p class="single">
If you have any additions or changes to the script(s) above don&#8217;t hesitate to tell us about them in the comments section. I would especially like to see what changes all of you would make to the script in <strong>Listing 32</strong> to make it more useful and/or correct any flaws that it may have. Feel free to paste your updates to the code in the comments section.
</p>

<p><a name="troubleshooting"></a></p>
<h3>Troubleshooting</h3>
<p class="single">
This post was developed using BASH 4.0.x, so if you&#8217;re running an earlier version keep an eye out for subtle syntax differences and missing features. Post something in the comments section if you have any trouble so that we can try to help you out. Also, don&#8217;t forget to apply the debugging knowledge that you got from reading <a href="http://www.innovationsts.com/blog/?p=1395">Post 1</a> in this series as you&#8217;re experimenting with these concepts.
</p>

<h3>Conclusion</h3>
<p class="single">
As with shell script debugging, we can see that script error handling is a very in-depth subject. Unfortunately, error handling is often overlooked in shell scripts but is an important part of creating and maintaining production scripts. My goal with this post has been to give you a diverse set of tools to help you efficiently and effectively add error handling to your scripts. I know that opinions on this topic vary widely, so if you&#8217;ve got any suggestions or thoughts on the content of this post it would be great to hear from you. Leave a comment to let us know what you think. Thanks for reading.
</p>
<!-- google_ad_section_end -->

<p><a name="video"></a></p>
<h3>Video</h3>
<p class="single">
To enhance this post, I&#8217;ve provided a video so that you can see a general overview of the provided examples. Browsers supporting the HTML5 video tag should present you with a <a href="http://theora.org/">Theora</a> (ogg/ogv) video, and browsers lacking that support should present you with a Flash substitute via <a href="http://flowplayer.org/">Flowplayer</a>. You can also download the Theora version of the video <a href="/downloads/blog/Better_Scripts_Part_2/Better_Scripts_Part_2_Video.ogg">here</a> by right-clicking on the link.
</p>
<video controls><br />
    <source src="/downloads/blog/Better_Scripts_Part_2/Better_Scripts_Part_2_Video.ogg" type="video/ogg" style="margin-left:0px;" /><br />
<object id="flowplayer" width="592" height="464" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/Better_Scripts_Part_2/Better_Scripts_Part_2_Video.flv", "autoPlay":false}}' />             
</object>
</video>
<br />

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
I have provided an audio transcript of this post with commentary on each of the listings. Click on a format below to listen in a new window, or right-click and save the audio to listen to later.
</p>
<p>Get An Audio Transcript Of This Post With Author&#8217;s Commentary On Listings<br />
<a href="/downloads/blog/Better_Scripts_Part_2/Better_Shell_Scripts_Part_2.ogg" target="_blank">ogg</a> (61 MB) | <a href="/downloads/blog/Better_Scripts_Part_2/Better_Shell_Scripts_Part_2.mp3" target="_blank">mp3</a> (42 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<br />
<h4>Books</h4>

<iframe src="http://rcm.amazon.com/e/cm?t=innovatechn08-20&#038;o=1&#038;p=8&#038;l=as1&#038;asins=0596005954&#038;fc1=000000&#038;IS2=1&#038;lt1=_blank&#038;m=amazon&#038;lc1=0000FF&#038;bc1=000000&#038;bg1=FFFFFF&#038;f=ifr" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>

<iframe src="http://rcm.amazon.com/e/cm?t=innovatechn08-20&#038;o=1&#038;p=8&#038;l=as1&#038;asins=143021841X&#038;fc1=000000&#038;IS2=1&#038;lt1=_blank&#038;m=amazon&#038;lc1=0000FF&#038;bc1=000000&#038;bg1=FFFFFF&#038;f=ifr" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>

<iframe src="http://rcm.amazon.com/e/cm?t=innovatechn08-20&#038;o=1&#038;p=8&#038;l=as1&#038;asins=0596009658&#038;fc1=000000&#038;IS2=1&#038;lt1=_blank&#038;m=amazon&#038;lc1=0000FF&#038;bc1=000000&#038;bg1=FFFFFF&#038;f=ifr" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>

<iframe src="http://rcm.amazon.com/e/cm?t=innovatechn08-20&#038;o=1&#038;p=8&#038;l=as1&#038;asins=1555582737&#038;fc1=000000&#038;IS2=1&#038;lt1=_blank&#038;m=amazon&#038;lc1=0000FF&#038;bc1=000000&#038;bg1=FFFFFF&#038;f=ifr" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>

<br />
<br />

<h4>Links</h4>
<ol>
<li><a href="http://www.linuxjournal.com/article/10029">Linux Journal, May 2008, Work The Shell, By Dave Taylor, &#8220;Handling Errors and Making Scripts Bulletproof&#8221;, pp 26-27</a></li>
<li><a href="http://www.davidpashley.com/articles/writing-robust-shell-scripts.html">Writing Robust Shell Scripts &#8211; DavidPashley.com</a></li>
<li><a href="http://www.linuxplanet.com/linuxplanet/tutorials/7025/1/">Linux Planet Article On Making Friendlier Error Messages</a></li>
<li><a href="http://www.linuxplanet.com/linuxplanet/tutorials/7025/2/">Linux Planet Article With A Good Example Of A Modularized Error Handling Script</a></li>
<li><a href="http://gd.tuwien.ac.at/linuxcommand.org/wss0150.php">Errors and Signals and Traps (Oh My!) &#8211; Part 1 By William Shotts, Jr.</a></li>
<li><a href="http://gd.tuwien.ac.at/linuxcommand.org/wss0160.php">Errors and Signals and Traps (Oh My!) &#8211; Part 2 By William Shotts, Jr.</a></li>
<li><a href="http://www.turnkeylinux.org/blog/shell-error-handling">Turnkey Linux Article With Good Discussion In Comments Section</a></li>
<li><a href="http://www.randombugs.com/linux/shell-error-handling.html">Script Error Handling Overview</a></li>
<li><a href="http://www.cons.org/cracauer/sigint.html">Article On The &#8220;Proper handling of SIGINT/SIGQUIT&#8221;</a></li>
<li><a href="http://snap.nlc.dcccd.edu/learn/frazer2/ppt/Error_Handling.ppt">Script Error Handling Slide Presentation (Download Link)</a></li>
<li><a href="http://steve-parker.org/sh/bourne.shtml">General UNIX Scripting Guide With Error Handling By Steve Parker</a></li>
<li><a href="http://news.ycombinator.com/item?id=1241488">Some General Thoughts On Making Scripts Better And Less Error Prone</a></li>
<li><a href="http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html">OpenGroup.org Article On Scripting Including A Section On &#8220;Exit Status and Errors&#8221;</a></li>
<li><a href="http://man.he.net/man1/checkbashisms">A checkbashisms man Page Entry</a></li>
<li><a href="http://www.greenend.org.uk/rjk/2001/04/shell.html">Common Shell Mistakes and Error Handling Article</a></li>
<li><a href="http://www.hpsc.csiro.au/userguides/faq/shell_recovery.php">CSIRO Advanced Scientific Computing Article</a></li>
<li><a href="http://stackoverflow.com/questions/64786/error-handling-in-bash">Opinions On Error Handling On stackoverflow</a></li>
<li><a href="http://codeandfury.blogspot.com/2007/01/bash-hacks-error-handling.html">A Way To Handle Errors Using Their Error Messages</a></li>
<li><a href="http://assela.pathirana.net/Bash_disaster_prevention">Simple BASH Error Handling</a></li>
<li><a href="http://www.faqs.org/faqs/unix-faq/shell/bash/">BASH FAQ Including Broken Pipe Warning Information</a></li>
<li><a href="http://www.linuxjournal.com/article/2156?page=0,0">Linux Journal Article On Named Pipes</a></li>
<li><a href="http://tldp.org/LDP/abs/html/bashver4.html">Example Use Of command_not_found_handle</a></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=1896</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Writing Better Shell Scripts &#8211; Part 1</title>
		<link>http://www.innovationsts.com/blog/?p=1395</link>
		<comments>http://www.innovationsts.com/blog/?p=1395#comments</comments>
		<pubDate>Tue, 15 Jun 2010 13:21:05 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[How-Tos]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=1395</guid>
		<description><![CDATA[

Quick Start

The information presented in this post doesn&#8217;t really lend itself to having a &#8220;Quick Start&#8221; section, but if you&#8217;re in a hurry we have a How-To section along with Video and Audio included with this post that may be a good quick reference for you. There are some really great general references in the [...]]]></description>
			<content:encoded><![CDATA[<!-- google_ad_section_start -->
<br />
<h3>Quick Start</h3>
<p class="narrow">
The information presented in this post doesn&#8217;t really lend itself to having a &#8220;Quick Start&#8221; section, but if you&#8217;re in a hurry we have a <a href="#howto">How-To</a> section along with <a href="#video">Video</a> and <a href="#audio">Audio</a> included with this post that may be a good quick reference for you. There are some really great general references in the <a href="#resources">Resources</a> section that may help you as well.
</p>

<h3>Preface</h3>
<p class="narrow">
To make things easier on you, all of the black command line and script areas are set up so that you can copy the text from them. This does make using the commands and scripts easier, but if you&#8217;re not already familiar with the concepts presented here, typing things yourself and working through why you&#8217;re typing them will help you learn more. If you hit problems along the way, take a look at the <a href="#troubleshooting">Troubleshooting</a> section near the end of this post for help.
</p>
<p class="narrow">
There are formatting conventions that are used throughout this post that you should be aware of. The following is a list outlining the color and font formats used.
</p>
<p>
<code>
    Command Name or Directory Path
</code>
<br />
<code class="warnerror">
    Warning or Error
</code>
<br />
<code class="commandline">
    Command Line Snippet With Commands/Options/Arguments
</code>
<br />
<code class="optionsonly">
    Command Options and Their Arguments Only
</code>
<br />
    <span style="color:#FFAF3E;">Hyperlink</span>
</p>

<h3>Overview</h3>

<p class="narrow">
This post is the first in a series on shell script debugging, error handling, and security. Although I&#8217;ll be presenting some methodologies and techniques that apply to all shell languages (and most programming languages), this series will focus very heavily on BASH. Users of other shells like CSH will need to do some homework to see what information transfers and what does not.
</p>

<p class="narrow">
One of the difficulties with debugging a shell script is that BASH typically doesn&#8217;t give you very much information to go on. You might get error output showing a line number, but that&#8217;s just the line where the shell became aware of the error, not necessarily the line where the error actually occurred. Add in a vague error message such as the one in <strong>Listing 1</strong>, and it gets difficult to tell what&#8217;s going on inside your script.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
$ ./buggy_script.sh
./buggy_script.sh: line 23: syntax error: unexpected end of file
</code>
</pre>

<p class="narrow">
This post is written with the intent of giving you knowledge that will help when you see an error like the one in <strong>Listing 1</strong> while trying to run a script. This type of error is just one of many errors that the shell may give you, and is more easily dealt with when you have a good understanding of scripting syntax and the debugging tools at your disposal.
</p>

<p class="narrow">
Along with talking about debugging tools/techniques, I&#8217;m going to introduce a handy script debugger called <a href="http://bashdb.sourceforge.net/">BASHDB</a>. BASHDB allows you to step through a script in much the same way as a program debugger like GNU&#8217;s <a href="http://www.gnu.org/software/gdb/">GDB</a> does with C code.
</p>

<p class="single">
By the end of this post you should be armed with enough knowledge to handle the majority of debugging needs that you have. There&#8217;s a lot of information here, but taking the time to learn it will help make you more effective in your work with Linux.
</p>

<h3>Command Line Script Debugging</h3>
<p class="single">
BASH has several command line options for debugging your shell scripts, and some of these are shown in <strong>Listing 2</strong>. These options will be applied to your entire script though, so it&#8217;s an all-or-nothing trade-off. Later in this post I&#8217;ll talk about more selective methods of debugging.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
-n  Checks for syntax errors without executing the script (noexec).
    
-u  Causes an error to be thrown whenever you try to access a variable that has 
    not been set (nounset).
    
-v  Sends all lines to standard error (stderr) as they are read, even comments.

-x  Turns on execution tracing (xtrace) which displays each command as it is 
    executed.
</code>
</pre>

<p class="narrow">
All of the options in <strong>Listing 2</strong> can be used just like options with other programs (<code class="commandline">bash -x scriptname</code>), or with the built-in <code>set</code> command as shown later. With the <code class="optionsonly">-x</code> option, the number of + characters before each of the lines of output denotes the subshell level. The more + characters there are, the further down into nested subshells you are. If there are no + characters at the start of the line, then the line is the normal output from the execution of the script. You can use the <code class="optionsonly">-x</code> and <code class="optionsonly">-v</code> options together for verbose execution tracing, but the amount of output can become a little overwhelming. Using the <code class="optionsonly">-n</code> and <code class="optionsonly">-v</code> options together provides a verbose syntax check without executing the script.
</p>
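<p class="narrow">
To make the <code class="optionsonly">-n</code> option a little more concrete, the sketch below builds a deliberately broken script (its closing <code>fi</code> is missing) and runs a syntax check on it. The temporary file is created with <code>mktemp</code> purely for this demonstration, and the exact wording of the error that BASH prints may differ between versions.
</p>

```shell
#!/bin/bash
# Build a deliberately broken script (the closing "fi" is missing) in a
# throwaway file; the file name comes from mktemp and is not significant.
broken=$(mktemp)
printf '%s\n' '#!/bin/bash' 'if true' 'then' '    echo "never closed"' > "$broken"

# -n parses the script without executing any of its commands
if bash -n "$broken"
then
    echo "No syntax errors found."
else
    echo "bash -n reported a syntax error (exit status $?)."
fi

rm -f "$broken"
```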

<p class="narrow">
If you decide to use the <code class="optionsonly">-x</code> and <code class="optionsonly">-v</code> options together, it can be helpful to use redirection in conjunction with a pager like <code>less</code>, or the <code>tee</code> command to help you handle the information. The shell sends debugging output to stderr and the normal output to stdout, so you&#8217;ll need to redirect both of them if you want the full picture of what&#8217;s going on. To do this and use the <code>less</code> pager to handle the information, you would use a command line like <code class="commandline">bash -xv scriptname 2>&#038;1 | less</code> . Instead of seeing the debugging output scroll by in the shell, you&#8217;ll be placed into the <code>less</code> pager where you&#8217;ll have access to functions like scrolling and search. While using the pager in this way, it&#8217;s possible that you may get an error like <code class="error">Broken pipe</code> if you exit the pager before the script is done executing. This error has to do with the script trying to write output to something (<code>less</code>) that&#8217;s no longer there, and in this case can be ignored.
</p>
<p class="narrow">
If you would prefer to redirect the debugging output to a file for later review and/or processing, you can use <code>tee</code>: <code class="commandline">bash -xv scriptname 2>&#038;1 | tee scriptname.dbg</code> . You will see the debugging output scroll by on the screen, but if you check the current working directory you will also find the <code>scriptname.dbg</code> file which holds the redirected output. This is what the <code>tee</code> command does for you. It allows you to send the output to a file while still displaying it on the screen. If the script will take a while to run you can alter the redirection operator slightly, put the script in the background, and then use <code class="commandline">tail -f scriptname.dbg</code> to follow the updates to the file. You can see this in action in <strong>Listing 3</strong>, where I&#8217;ve created a script that runs in an infinite loop (the code is incorrect on purpose) generating output every 2 seconds. I start the script in the background, redirecting the output to the <code>infinite_loop.dbg</code> file only (not to the screen too). I then start the <code class="commandline">tail -f</code> command to follow the file for a few iterations, and then hit Ctrl-C to interrupt the <code>tail</code> command. Once you understand how to redirect the debugging output in this way, it&#8217;s fairly easy to figure out how to split the debugging and regular output into separate files.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 3</strong></em></p>
<code>
$ bash -xv infinite_loop.sh &#038;> infinite_loop.dbg &#038;
[1] 9777
$ tail -f infinite_loop.dbg 
num=0
+ num=0

while [ $num -le 10 ]
do
    sleep 2
    echo "Testing"
done
+ '[' 0 -le 10 ']'
+ sleep 2
+ echo Testing
Testing
+ '[' 0 -le 10 ']'
+ sleep 2
^C
</code>
</pre>
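<p class="narrow">
As a minimal sketch of splitting the two streams, the example below sends the regular output and the execution trace to separate files. The directory and the <code>demo.sh</code>, <code>demo.out</code>, and <code>demo.dbg</code> file names are placeholders that I&#8217;ve made up for the demonstration.
</p>

```shell
#!/bin/bash
# Work in a throwaway directory; every file name below is illustrative.
workdir=$(mktemp -d)
printf '%s\n' '#!/bin/bash' 'echo "Regular output"' > "$workdir/demo.sh"

# stdout (normal output) and stderr (execution trace) go to separate files
bash -x "$workdir/demo.sh" > "$workdir/demo.out" 2> "$workdir/demo.dbg"

echo "--- demo.out ---"
cat "$workdir/demo.out"
echo "--- demo.dbg ---"
cat "$workdir/demo.dbg"

rm -rf "$workdir"
```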

<h3>Internal Script Debugging</h3>
<p class="single">
This section is called &#8220;Internal Script Debugging&#8221; because it focuses on changes that you make to the script itself to add debugging functionality. The easiest change to make in order to enable debugging is to change the shebang line of the script (the first line) to include the shell&#8217;s normal command line switches. So, instead of a shebang line like <code class="commandline">#!/bin/bash - </code> you would have <code class="commandline">#!/bin/bash -xv</code>. There are also both external and built-in commands for the BASH shell that make it easier for you to debug your code, the first of which is <code>set</code>.
</p>

<p class="single">
The <code>set</code> command allows you to set shell options while your script is running. The options of the most interest for our purposes are the ones from <strong>Listing 2</strong>. For example, you can enclose sections of your script between the <code class="commandline">set -x</code> and <code class="commandline">set +x</code> command lines. By doing this you enable debugging for only the section of code within those lines, giving you control over what specific section of the script is debugged. <strong>Listing 4</strong> shows a very simple script using this technique, and <strong>Listing 5</strong> shows the script in action.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 4</strong></em></p>
<code>
#!/bin/bash - 
# File: set_example.sh

echo "Output #1"
set -x #Debugging on
echo "Output #2"
set +x #Debugging off
echo "Output #3"
</code>
</pre>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 5</strong></em></p>
<code>
$ ./set_example.sh 
Output #1
+ echo 'Output #2'
Output #2
+ set +x
Output #3
</code>
</pre>

<p class="single">
As you can see, the debugging output looks like you started the script with the <code class="commandline">bash -x</code> command line. The difference is that you get to control what is traced and what is not, instead of having the execution of the whole script traced. Notice that the command to disable execution tracing (<code class="commandline">set +x</code>) is included in the execution trace. This makes sense because execution tracing is not actually turned off until after the <code class="commandline">set +x</code> line is done executing.
</p>

<p class="single">
Output statements (<code>echo</code>/<code>print</code>/<code>printf</code>) are useful for getting information from your script at specific points. Note that <code>print</code> is a built-in of shells such as KSH rather than BASH, so BASH users will rely on <code>echo</code> and <code>printf</code>. You can use output statements to track the progression of logic throughout your script by doing things like evaluating variable values and shell expansions, and finding infinite loops. Another advantage of using output statements is that you can control the format. When using command line debugging switches you have little or no control over the format, but with output statements you have the opportunity to customize the output to display in a way that makes sense to you.
</p>
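<p class="single">
A small sketch of this technique is shown below. The helper name <code>debug_var</code> and its output format are my own choices for illustration; the point is simply that <code>printf</code> lets you decide exactly how each debugging line looks.
</p>

```shell
#!/bin/bash
# Hypothetical helper: prints a labelled variable value to stderr so that
# debugging output stays separate from the script's normal output.
debug_var()
{
    # $1 = variable name (used only as a label), $2 = current value
    printf 'DEBUG: %s = [%s]\n' "$1" "$2" >&2
}

count=0
while [ "$count" -lt 3 ]
do
    debug_var "count" "$count"
    count=$((count + 1))
done
echo "Loop finished after $count iterations"
```

<p class="single">
Putting brackets around the value makes stray whitespace and empty values easy to spot, which is often what you are hunting for when a test or loop misbehaves.
</p>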

<p class="single">
You can utilize a DEBUG function to provide a flexible and clean way to turn debugging output on and off in your script. <strong>Listing 6</strong> shows the script in <strong>Listing 4</strong> with the addition of the DEBUG function, and <strong>Listing 7</strong> shows one way to switch the debugging on and off from the command line using a variable.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 6</strong></em></p>
<code>
#!/bin/bash - 
# File: func_example.sh

# This function can be used to selectively enable/disable debugging.
# Use with the set command to debug sections of the script.
function DEBUG()
{       
    # Check to see if the enable debugging variable is set
    if [ -n "${DEBUG_ENABLE+x}" ]
    then                
        # Run whatever command/option/argument combo that was
        # passed to our DEBUG function.
        "$@"
    fi
}

echo "Output #1"
DEBUG set -x #Debugging on
echo "Output #2"
DEBUG set +x #Debugging off
echo "Output #3"
</code>
</pre>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 7</strong></em></p>
<code>
$ ./func_example.sh #Without debugging
Output #1
Output #2
Output #3
$ DEBUG_ENABLE=true ./func_example.sh #With debugging
Output #1
+ echo 'Output #2'
Output #2
+ DEBUG set +x
+ '[' -n x ']'
+ set +x
Output #3
</code>
</pre>

<p class="single">
The DEBUG function treats the rest of the line after it as its arguments. If the <code>DEBUG_ENABLE</code> variable is set, the DEBUG function will execute its arguments (the rest of the line) as a command via the <code>$@</code> special parameter. So, any line that has DEBUG in front of it can be turned on or off by simply setting/unsetting one variable from the command line or inside your script. This method gives you a lot of flexibility in how you set up debugging in your script, and allows you to easily hide that functionality from your end users if needed.
</p>

<p class="single">
Instead of requiring a user to set an environment variable on the command line to enable debugging, you can add command line options to your script. For instance, you could have the user run your script with a <code class="optionsonly">-d</code> option (<code class="commandline">./scriptname -d</code>) in order to enable debugging. The mechanism that you use could be as simple as having the <code class="optionsonly">-d</code> option set the <code>DEBUG_ENABLE</code> variable inside of the script. An example of this, with the addition of multiple debugging levels, can be seen in the <a href="#scripting">Scripting</a> section. 
</p>
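<p class="single">
As a minimal sketch of that idea, assuming a script that accepts only a <code class="optionsonly">-d</code> flag with no argument, the option parser can simply set <code>DEBUG_ENABLE</code> before any <code>DEBUG</code> lines run. The helper name <code>parse_debug_flag</code> is my own invention for this illustration.
</p>

```shell
#!/bin/bash -

# Same idea as the DEBUG function above: run the arguments as a
# command only when DEBUG_ENABLE is set.
DEBUG()
{
    if [ -n "${DEBUG_ENABLE+x}" ]
    then
        "$@"
    fi
}

# Hypothetical helper: sets DEBUG_ENABLE if -d is among the options.
parse_debug_flag()
{
    local opt
    OPTIND=1   # Reset so the function can be called more than once
    while getopts "d" opt
    do
        case $opt in
        d)  DEBUG_ENABLE=true ;;
        *)  return 1 ;;
        esac
    done
}

# In a real script you would call: parse_debug_flag "$@"
parse_debug_flag -d
DEBUG echo "Debugging is enabled"
```

<p class="single">
Because the flag only sets a variable, users who never pass <code class="optionsonly">-d</code> see no debugging output at all.
</p>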

<p class="single">
Another technique that you can use to track down problems in your script is to write data to temporary files instead of using pipes. Temp files are many times slower than pipes though, so I would use them sparingly and in most cases only for temporary debugging. There is a Linux Journal article by Dave Taylor (April 2010) referenced in the <a href="#resources">Resources</a> section that talks about using temporary files in the article&#8217;s script. In a nutshell, you replace the pipe operator (<code>|</code>) with a redirection to file (<code>&gt; $temp</code>), where <code>$temp</code> is a variable holding the name of your temporary file. You read the temporary file back into the script with another redirection operator (<code>&lt; $temp</code>). This allows you to examine the temporary file for errors in the script&#8217;s pipeline. <strong>Listing 8</strong> shows a very simplified example of this.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 8</strong></em></p>
<code>
#!/bin/bash - 

# Set the path and filename for the temp file
temp="./example.tmp"

# Dump a list of numbers into the temp file
printf "1\n2\n3\n4\n5\n" > "$temp"

# Process the numbers in the temp file via a loop
while read -r input_val
do
    # We won't do any real work, just output the values
    echo "$input_val"
done < "$temp" # Feeds the temp file into the loop

# Clean up our temp file
rm "$temp"
</code>
</pre>
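<p class="single">
Listing 8 uses a fixed filename in the current directory. The sketch below is one possible variation, not the approach from the Linux Journal article: it assumes the <code>mktemp</code> command is available to create a uniquely named file, and it breaks a small pipeline in the middle so that the intermediate data can be inspected. The sample numbers and variable names are made up for the example.
</p>

```shell
#!/bin/bash -

# Create a uniquely named temp file instead of a fixed ./example.tmp
temp=$(mktemp) || exit 1

# Original pipeline being debugged (for comparison):
#   printf '3\n1\n2\n' | sort -n | while read -r n; do ...; done
# Here the pipeline is cut after the sort stage and written to a file.
printf '3\n1\n2\n' | sort -n > "$temp"

# While debugging, you can inspect the intermediate data here
echo "Intermediate data:"
cat "$temp"

# Second stage: read the temp file back in and total the numbers
total=0
while read -r input_val
do
    total=$((total + input_val))
done < "$temp"
echo "Total: $total"

# Clean up our temp file
rm -f "$temp"
```

<p class="single">
A side benefit of redirecting into the loop rather than piping into it is that the loop runs in the current shell, so <code>total</code> is still set after the loop ends.
</p>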

<p class="single">
The last debugging technique that I'm going to touch on here is writing to the system log. You can use the <code>logger</code> command to write debugging output to the system log (<code>/var/log/messages</code> on many distributions). Note that the <code class="optionsonly">-f</code> option does not choose a different log file; it logs the contents of the file that you name. I consider this technique to be primarily for production scripts that have already been released to your users, and you don't want to abuse this mechanism. Flooding your system log with script debugging messages would be counterproductive for you and/or your system administrator. It's best to only log mission critical messages like warnings or errors in this way.
</p>

<p class="single">
To use the <code>logger</code> command to help track script debugging information, you would just add a line like <code class="commandline">logger "${BASH_SOURCE[0]} - My script failed somewhere before line $LINENO."</code> to your script. The line that this adds in the system log looks like the output line in <strong>Listing 9</strong>. There are a couple of variables that I've thrown in here to make my entry in the system log more descriptive. One is <code>BASH_SOURCE</code>, which is an array that in this case holds the name and path of the script that logged the message. The other is <code>LINENO</code>, which holds the current line number that you are on in your script. There are several other useful shell variables built into the newer versions of BASH (>= 3.0). These include the arrays <code>BASH_LINENO</code>, <code>BASH_ARGC</code>, and <code>BASH_ARGV</code>, as well as the ordinary variables <code>BASH_COMMAND</code>, <code>BASH_EXECUTION_STRING</code>, and <code>BASH_SUBSHELL</code>. See the BASH man page for details.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 9</strong></em></p>
<code>
$ tail -1 /var/log/messages
May 28 14:35:35 testhost jwright: ./logger_test.sh - My script failed somewhere before line 11.
</code>
</pre>
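<p class="single">
If you log from more than one place, it can help to wrap the <code>logger</code> call in a function. The following is only a sketch: the <code>log_error</code> and <code>do_work</code> names, the <code>script_debug</code> tag, and the message format are all assumptions of mine. It uses <code>FUNCNAME[1]</code> and <code>BASH_LINENO[0]</code> to record which function called the helper and from which line, and it also echoes the message to stderr so that you can see it without opening the system log.
</p>

```shell
#!/bin/bash -

# Hypothetical helper: logs a message along with the place it was
# called from. FUNCNAME[1] is the calling function ("main" at the top
# level) and BASH_LINENO[0] is the line number of the call.
log_error()
{
    local msg="${BASH_SOURCE[1]:-$0} - ${FUNCNAME[1]:-main} line ${BASH_LINENO[0]}: $*"

    # Send the message to the system log if logger is installed.
    # The -t option sets the tag that appears in the log entry.
    if command -v logger > /dev/null 2>&1
    then
        logger -t "script_debug" "$msg" 2> /dev/null || true
    fi

    # Always show the message on stderr as well
    echo "$msg" >&2
}

# Example caller (also hypothetical)
do_work()
{
    log_error "Something failed"
}

do_work
```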

<h3>Introducing BASHDB</h3>
<p class="single">
As I mentioned before, BASHDB is a debugger that does for BASH scripts what GNU's GDB does for C/C++ programs. BASHDB can do a lot, and it has four main features to help you eliminate errors from your scripts. First, it can start a script with options, arguments, and anything else that might affect its operation. Second, it allows you to set conditions on which a script will stop. Third, it gives you the ability to examine what's going on at the point in a script where it's stopped. Fourth, BASHDB allows you to manipulate things like variable values before telling the script to move on.
</p>

<p class="single">
You can type <code class="commandline">bashdb scriptname</code> to start BASHDB and set it to debug the script <code>scriptname</code>. <strong>Listing 10</strong> shows a couple of useful options for the bashdb program.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 10</strong></em></p>
<code>
-X     Traces the entire script from beginning to end without putting bashdb in 
       interactive mode. Notice that it's capital X, not lowercase.
       
-c     Tests/traces a single string command. For example, bashdb -c "ls *" will 
       allow you to step through the command string "ls *" inside the debugger.
</code>
</pre>

<p class="single">
In order to show where you're at, BASHDB displays the full path and current line number of the running script above the prompt. In interactive mode, the prompt BASHDB gives you looks something like <code class="commandline">bashdb<(1)></code> where 1 is the number of commands that have been executed. The parentheses around the command number denote the number of subshells you are nested within. The more parentheses there are, the deeper into subshells you are nested. <strong>Listing 11</strong> gives a decent command reference that you can use when debugging scripts at the BASHDB interactive mode prompt.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 11</strong></em></p>
<code>
-         Lists the current line and up to 10 lines that came before it.

backtrace Abbreviated "T". Shows the trace of calls including things like 
          functions and sourced files that have brought the script to where it 
          is now. You can follow "backtrace" with a number, and only that number
          of calls will be shown.

break     Abbreviated "b". Sets a persistent breakpoint at the current line unless
          followed by a number, in which case a breakpoint is set at the line 
          specified by the number. See the "continue" command for a shortcut to 
          specifying the line number.
        
continue  Abbreviated "c". Resumes execution of the script and moves to the next 
          stopping point or breakpoint. If followed by a number, "continue" works
          in a similar way as issuing the "break" command followed by the number
          and then the continue command. The difference is that "continue" sets a
          one time breakpoint whereas "break" sets a persistent one.

edit      Opens the text editor specified by the EDITOR environment variable to
          allow you to make and save changes to the current script. Typing "edit"
          by itself will start editing on the current line. If "edit" is followed
          by a number, editing will start on the line specified by that number. 
          Once you're done editing you have to type "restart" or "R" to reload 
          and restart the script with your changes.

help      Abbreviated "h". Lists all of the commands that are available when
          running in interactive mode. When you follow "help" or "h" with a
          command name, you are shown information on that command.

list      Abbreviated "l". Lists the current line and up to 10 lines that come
          after it. If followed by a number, "list" will start at the specified
          line and print the next 10 lines. If followed by a function name, "list"
          starts at the beginning of the function and prints up to 10 lines.
        
next      Abbreviated "n". Moves execution of the script to the next instruction, 
          skipping over functions and sourced files. If followed by a number,
          "next" will move that number of instructions before stopping.

print     Abbreviated "p". When followed by a variable name, prints the value of
          a specified variable. Example: print $VARIABLE
          
quit      Exits from BASHDB.
          
set       Allows you to change the way BASH interacts with you while running 
          BASHDB. You can follow "set" with an argument and then the words "on"
          or "off" to enable/disable a feature. Example: "set linetrace on".
      
step      Abbreviated "s". Moves execution of the script to the next instruction. 
          "step" will move down into functions and sourced files. See the "next" 
          command if you need behavior that skips these. If followed by a number, 
          "step" will move that number of instructions before stopping.
          
x         Similar to the "print" command, but more powerful. Can print variable 
          and function definitions, and can be used to explore the effects of a 
          change to the current value of a variable. Example: "x n-1" subtracts 1
          from the variable "n" and displays the result.
</code>
</pre>

<p class="single">
Normally when you hit the Enter/Return key without entering a command, BASHDB executes the <code>next</code> command. This behavior is overridden though when you have just run the <code>step</code> command. Once you've run <code>step</code>, pressing the Enter/Return key will re-execute <code>step</code>. The rest of the operation of BASHDB is fairly straightforward, and I'll run through an example session in the <a href="#howto">How-To</a> section.
</p>

<p class="single">
If you're a person who prefers to use a graphical interface, have a look at <a href="http://www.gnu.org/software/ddd/">GNU DDD</a>. DDD is a graphical front end for several debuggers including BASHDB, and includes some interesting features like the ability to display data structures as graphs.
</p>

<p><a name="howto"></a></p>
<h3>How-To</h3>
<p class="single">
If you've been reading this post straight through, you can see that there are a lot of script debugging tools at your disposal. In this section, I'm going to go through a simple example using a few of the different methods so that you can see some practical applications. <strong>Listing 12</strong> shows a script that has several bugs intentionally added so that we can use it as our example.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 12</strong></em></p>
<code>
#!/bin/bash - 

# buggy_script.sh is designed to help us learn about 
# shell script debugging
# 

if [-z $1 ] # Space left out after first test bracket
then
    echo "TEST"
#fi  #The closing fi is left out

# Use of uninitialized variable
echo "The value is: $VALUE1"

# Infinite loop caused by not incrementing num
num=0

while [ $num -le 10 ]
do
    sleep 2
    echo "Testing"
done
</code>
</pre>

<p class="single">
When I try to run the script for the first time I get the same error that we got in <strong>Listing 1</strong>. The first thing that I'm going to do is use the <code class="optionsonly">-x</code> and <code class="optionsonly">-u</code> options of BASH to run the script with extra debugging output (<code class="commandline">bash -xu ./buggy_script.sh</code>). When I rerun the script this way, I see that I don't really gain anything because BASH detects the <code class="error">unexpected end of file</code> bug before it even tries to execute the script. The line number isn't any help either since it just points me to the very last line of the script, and that's not very likely to be where the error occurred. I'll run into the same problems if I try to run the script with BASHDB as well.
</p>
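<p class="single">
Since BASH rejects the script while parsing it, you can make that parse-only check explicit with the <code class="optionsonly">-n</code> option, which reads a script and checks its syntax without executing any commands. The sketch below uses a throwaway temp file as a stand-in for the buggy script, and the variable names are my own. Keep in mind that <code class="optionsonly">-n</code> reports the same end-of-file error, so it confirms the problem rather than locating it.
</p>

```shell
#!/bin/bash -

# Write a small script with an unclosed if statement to a temp file
script=$(mktemp) || exit 1
cat > "$script" <<'EOF'
if [ -z "$1" ]
then
    echo "TEST"
# fi is missing here
echo "done"
EOF

# -n parses the script without executing any of its commands, and
# exits with a non-zero status if it finds a syntax error.
if bash -n "$script" 2> /dev/null
then
    syntax_ok=yes
else
    syntax_ok=no
fi
echo "Syntax OK: $syntax_ok"

rm -f "$script"
```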

<p class="single">
I remember that the rule of thumb with <code class="error">unexpected end of file</code> errors is that they usually mean that I've forgotten to close something out. It could be an if statement without a <code>fi</code> at the end, a case statement that's missing an <code>esac</code> or <code>;;</code>, or any number of other constructs that require closure. When I start looking through the script I notice that my if statement is missing a fi, so I add (uncomment) that. This particular bug teaches us an important lesson - that there will always be some errors that will require us to do some digging on our own. We may be able to use our debugging techniques to get us close to the error, but in the end we have to know the language well enough to be able to spot syntax errors. Once I add the <code>fi</code> statement, I'm ready to rerun the script. The second time the script runs, I get an <code class="error">unbound variable</code> error.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 13</strong></em></p>
<code>
$ bash -xu ./buggy_script.sh
./buggy_script.sh: line 6: $1: unbound variable
</code>
</pre>

<p class="single">
You can see in the error that a command line argument (<code>$1</code>) is unbound. This tells me that I forgot to add an argument after <code>./buggy_script.sh</code> . I end up with the command line <code class="commandline">bash -xu ./buggy_script.sh testarg1</code> which gives me the next two errors shown in <strong>Listing 14</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 14</strong></em></p>
<code>
$ bash -xu ./buggy_script.sh testarg1
+ '[-z' testarg1 ']'
./buggy_script.sh: line 6: [-z: command not found
./buggy_script.sh: line 12: VALUE1: unbound variable
</code>
</pre>

<p class="single">
Execution tracing shows me that the last command executed is <code class="commandline">'[-z' testarg1 ']'</code> . The first error tells me that for some reason the start of the test statement (<code>[-z</code>) is being treated as a command. I think about it for a second and remember that there has to be a space between test brackets and what they enclose. The statement <code>[-z $1 ]</code> should read <code>[ -z $1 ]</code> . Since I try to focus on one error at a time, I fix the test statement and rerun the script. The first error from <strong>Listing 14</strong> goes away, but the second error remains. You can see that it's another <code class="error">unbound variable</code> error, but this time it's referencing a variable that I created and not a command line argument. The problem is that I use the variable <code>VALUE1</code> in an <code>echo</code> statement before I've even set a value for it. Without the <code class="optionsonly">-u</code> option that would just leave a blank at the end of the <code>echo</code> output, but in some cases it can cause more serious problems. This is what using the <code class="optionsonly">-u</code> option of BASH does for you. It stops the script with an error when you try to use a variable that has never been set. To correct this error, I add a statement right above the <code>echo</code> line that sets a value for the variable (<code class="commandline">VALUE1="1"</code>).
</p>
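<p class="single">
To see the effect of <code class="optionsonly">-u</code> in isolation, here is a small sketch that runs two one-line commands in child shells. The first references the unset <code>VALUE1</code> variable and fails; the second uses the <code>${VALUE1:-none}</code> form of parameter expansion to supply a fallback, which is another way to satisfy <code class="optionsonly">-u</code> when a default makes sense. The result variable names and the fallback text are just illustrations.
</p>

```shell
#!/bin/bash -

# Make sure VALUE1 is not inherited from our own environment
unset VALUE1

# Run a one-liner in a child shell with -u so that the failure does
# not stop this script. VALUE1 is unset, so the child exits with an
# error instead of printing the line.
if bash -u -c 'echo "The value is: $VALUE1"' 2> /dev/null
then
    unset_result="ran"
else
    unset_result="failed"
fi
echo "Unset variable under -u: $unset_result"

# ${VALUE1:-none} substitutes "none" when VALUE1 is unset or empty,
# so the same statement is safe under -u.
default_result=$(bash -u -c 'echo "The value is: ${VALUE1:-none}"')
echo "$default_result"
```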

<p class="single">
After fixing the above errors and rerunning the script, everything seems to work fine. The only problem is that even though I set the while loop up to quit after the variable <code>num</code> gets to 10, the loop doesn't exit. It seems that I have an infinite loop problem. This loop is simple enough that you can probably just glance at it and see the problem, but for the sake of the example we're going to take the long way around. I add an <code>echo</code> statement (<code class="commandline">echo "num Value: $num"</code>) to show me the value of the <code>num</code> variable right above the <code>sleep 2</code> line. When I run the script again without the BASH <code class="optionsonly">-x</code> option (to cut out some clutter), I get the output shown in <strong>Listing 15</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 15</strong></em></p>
<code>
$ bash -u ./buggy_script.sh testarg1
The value is: 1
num Value: 0
Testing
num Value: 0
Testing
num Value: 0
</code>
</pre>

<p class="single">
You can see that the output from the <code>echo</code> statement I added is always the same (<code>num Value: 0</code>). This tells me that the value of <code>num</code> is never incremented and so it will never reach the limit of 10 that I set for the while loop. The fix is to use arithmetic expansion to increment the <code>num</code> variable by 1 each time around the while loop: <code class="commandline">num=$((num+1))</code> . When I run the script now, <code>num</code> increments like it should and the script exits when it's supposed to. With this bug fixed, it looks like we've eliminated all of the errors from our script. The finalized script with the <code>num</code> evaluation <code>echo</code> statement removed can be seen in <strong>Listing 16</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 16</strong></em></p>
<code>
#!/bin/bash - 

# buggy_script.sh is designed to help us learn about 
# shell script debugging.

if [ -z $1 ] # Space added after first test bracket
then
    echo "TEST"
fi  #The closing fi was added

# Set a value for our variable
VALUE1="1"

# Use of initialized variable
echo "The value is: $VALUE1"

# Finite loop caused by incrementing num
num=0

while [ $num -le 10 ]
do    
    sleep 2
    echo "Testing"
    num=$((num+1))
done
</code>
</pre>
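<p class="single">
When you are checking a loop fix like this, it can be quicker to count iterations than to watch the output scroll by. The sketch below is the same loop with the <code>sleep</code> and <code>echo</code> lines removed and a hypothetical <code>iterations</code> counter added. Because <code>num</code> starts at 0 and the test is <code>-le 10</code>, the body runs once for each value from 0 through 10.
</p>

```shell
#!/bin/bash -

num=0
iterations=0

while [ "$num" -le 10 ]
do
    iterations=$((iterations + 1))
    # Without this line num stays at 0 and the loop never exits
    num=$((num + 1))
done

echo "Loop ran $iterations times; num is now $num"
```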

<p class="single">
Now I'll walk you through correcting the same buggy script using BASHDB. As I said above, the <code class="error">unexpected end of file</code> error is best solved by applying your understanding of shell scripting syntax. Because of this, I'm going to start debugging the script right after we notice and fix the unclosed <code>if</code> statement. To start the debugging process, I use the line <code class="commandline">bashdb ./buggy_script.sh</code> to launch BASHDB and have it start to step through the script. If you compiled BASHDB from source and haven't installed it, you'll need to adjust the paths in the command line accordingly.
</p>

<p class="single">
BASHDB starts the script and then stops at line 7, the <code>if</code> statement. I then use the <code>step</code> command to move to the next instruction and get the output in <strong>Listing 17</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 17</strong></em></p>
<code>
$ bashdb ./buggy_script.sh
bash Shell Debugger, release 4.0-0.4

Copyright 2002, 2003, 2004, 2006, 2007, 2008, 2009 Rocky Bernstein
This is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.

(/home/jwright/Documents/Scripts/Learning/buggy_script.sh:7):
7:	if [-z $1 ] # Space left out after first test bracket
bashdb<0> step
./buggy_script.sh: line 7: [-z: command not found
(/home/jwright/Documents/Scripts/Learning/buggy_script.sh:13):
13:	echo "The value is: $VALUE1"
</code>
</pre>

<p class="single">
Notice that until I run the <code>step</code> command, BASHDB doesn't give me an error for line 7. That's because it has stopped on the line 7 instruction, but hasn't executed it yet. When I step through that instruction and on to the next one, I get the same error as the BASH shell gives us (<code class="error">[-z: command not found</code>). As before, we realize that we've left a space out between the test bracket and the statement. To fix this, I type the <code>edit</code> command to open the script in the text editor specified by the <code>EDITOR</code> environment variable. In my case this is <code>vim</code>. I have to type <code>visual</code> to go to normal mode, and then I'm able to edit and save my changes to the script like I would in any <code>vi</code>/<code>vim</code> session. With the space added, I save the file and exit <code>vim</code> which puts me back at the BASHDB prompt. I type the <code>R</code> character and hit the Enter/Return key to restart the script, which also loads my changes. I end up right back at line 7 again.
</p>

<p class="single">
This time when I use the <code>step</code> command, BASHDB moves past the <code>if</code> statement and stops right before executing line 13 (the next instruction). Everything looks good, so I use the <code>step</code> command again by simply hitting the Enter/Return key. The output in <strong>Listing 18</strong> is what I see.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 18</strong></em></p>
<code>
bashdb<1> edit
bashdb<2> R
Restarting with: /usr/local/bin/bashdb ./buggy_script.sh 
bash Shell Debugger, release 4.0-0.4

Copyright 2002, 2003, 2004, 2006, 2007, 2008, 2009 Rocky Bernstein
This is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.

(/home/jwright/Documents/Scripts/Learning/buggy_script.sh:7):
7:	if [ -z $1 ] # Space left out after first test bracket
bashdb<0> step
(/home/jwright/Documents/Scripts/Learning/buggy_script.sh:13):
13:	echo "The value is: $VALUE1"
bashdb<1> 
The value is: 
(/home/jwright/Documents/Scripts/Learning/buggy_script.sh:16):
16:	num=0
</code>
</pre>

<p class="single">
We see that the echo statement ends up not having any text after the colon, which is not what we want. What I'll do is issue an <code>R</code> (<code>restart</code>) command and then step back to line 13 so that I can check the value of the variable. Once I'm back at the <code>echo</code> statement on line 13, I use the command <code class="commandline">print $VALUE1</code> to inspect the value of that variable. A snippet of the output from the <code>print</code> command is in <strong>Listing 19</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 19</strong></em></p>
<code>
7:	if [ -z $1 ] # Space left out after first test bracket
bashdb<0> step
(/home/jwright/Documents/Scripts/Learning/buggy_script.sh:13):
13:	echo "The value is: $VALUE1"
bashdb<1> print $VALUE1

bashdb<2>
</code>
</pre>

<p class="single">
There's a blank line between the <code>bashdb<1> print $VALUE1</code> and <code>bashdb<2></code> lines. This tells me that there is definitely not a value (or there's a blank string) set for the <code>VALUE1</code> variable. To correct this I go back into edit mode, and add the variable declaration <code>VALUE1="1"</code> just above our <code>echo</code> statement. I follow the same edit, save, exit, restart (with the <code>R</code> character) routine as before, and then step down through the <code>echo</code> statement again.
</p>

<p class="single">
This time the output from the <code>echo</code> statement is <code>The value is: 1</code> which is what we would expect. With that error fixed, we continue to step down through the script until we realize that we're stuck in our infinite while loop. We can use the <code>print</code> command here as well, and with the line <code>print $num</code> we see that the <code>num</code> variable is not being incremented. Once again, we enter edit mode to fix the problem. We add the statement <code>num=$((num+1))</code> at the bottom of our while loop, save, exit, and restart. We now see that the <code>num</code> variable is incrementing properly and that the loop will exit. We can type the <code>continue</code> command to let the loop finish without any more intervention.
</p>

<p class="single">
After the script has run successfully, you'll see the message <code>Debugged program terminated normally. Use q to quit or R to restart.</code> If you haven't been adding comments as you go, it would be a good idea at this point to re-enter edit mode and add comments for any changes that you made. Be sure to run your script one more time, though, to confirm that you didn't break anything while commenting.
</p>

<p class="single">
That's a pretty simple BASHDB session, but my hope is that it will give you a good start. BASHDB is a great tool to add to your shell script development toolbox.
</p>

<h3>Tips and Tricks</h3>
<ul>
<li>If you're like many of us, you may have trouble with quoting in your scripts from time to time. If you need a hint on how quoted sections are being interpreted by the shell, you can replace the command that's acting on the quoted section with the <code>echo</code> command. This will give you output showing how your quotes are being interpreted. The same trick is handy when you need insight into other issues like shell expansion.</li>
<li>If you don't indent temporary (debugging) code, it will be easier to find in order to remove it before releasing your script to users. If you don't already make a habit of indenting your scripts in the first place, I would recommend that you start. It greatly increases the readability, and thus maintainability, of your scripts.</li>
<li>You can set the <code>PS4</code> environment variable to include more information with the shell's debugging output. You can add things like line numbers, filenames, and more. For example, you would use the line <code class="commandline">export PS4='$LINENO '</code> to add line numbers to your script's debugging output. The creator of the bashdb script debugger sets the PS4 variable to <code>(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]} - [${SHLVL},${BASH_SUBSHELL}, $?]</code> which gives you very detailed information about where you're at in your script. You can make this change to the variable permanent by adding an <code>export</code> declaration to one of your bash configuration files.</li>
<li>Make sure to use unique names for your shell scripts. You can run into problems if you name your shell script the same as a system or built-in command (e.g. <code>test</code>). I like to make my shell script names distinctive, and for added protection I almost always add a <code>.sh</code> extension onto the end of the filename.</li>
</ul>
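<p class="single">
The <code>PS4</code> tip can be tried out with a few lines. In this sketch the variable is set inside a small generated script rather than exported from the calling shell, so that the example does not depend on the caller's environment. The temp file and the <code>trace</code> variable are assumptions made for the demonstration. The single quotes around the <code>PS4</code> value matter: they delay the expansion of <code>$LINENO</code> until each trace line is printed.
</p>

```shell
#!/bin/bash -

# Write a short script that sets PS4 and turns on execution tracing
script=$(mktemp) || exit 1
cat > "$script" <<'EOF'
PS4='Line $LINENO: '
set -x
echo "first"
echo "second"
EOF

# Trace output goes to stderr, so 2>&1 is needed to capture it
# along with the script's normal output.
trace=$(bash "$script" 2>&1)
echo "$trace"

rm -f "$script"
```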

<a name="scripting"></a>
<h3>Scripting</h3>
<p class="single">
These scripts are somewhat simplified and in most cases could be done other ways too, but they will work to illustrate the concepts. If you use these scripts, make sure you adapt them to your situation. Never run a script or command without understanding what it will do to your system.
</p>

<p class="single">
Our first script example is going to have two separate parts to it. The first is a script in which we've enclosed our debugging functionality from above. This is a case where it's helpful to create modular code so that other scripts can add debugging functionality simply by sourcing one file. That way you're not duplicating code needlessly for commonly used functionality. The second script uses the debugging script, and adds a command line option (<code class="optionsonly">-d</code>) to enable debugging. The script also uses multiple debugging levels to allow the user to control how verbose the output is by passing an argument to the <code class="optionsonly">-d</code> option.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 20</strong></em></p>
<code>
#!/bin/bash - 

# File:debug_module.sh
# Holds common script debugging functionality

# Set the PS4 variable to add line #s to our debug output
PS4='Line $LINENO : '

# The function that enables the enabling/disabling of 
# debugging in the script, and also takes the user 
# specified debug level into account.
# 0 = No debugging
# 1 = Debug executed statements only
# 2 = Debug all lines and executed statements
function DEBUG() 
{        
    # We need to see what level (0-2) of debugging is set               
    if [ "$1" = "0" ] #User disabled debugging
    then
        echo "Debugging Off"
        set +xv        
        # Set the variable that tracks the debugging state
        _DEBUG=0
    elif [ "$1" = "1" ] #User wants minimal debugging
    then
        echo "Minimal Debugging"
        set -x  
        # Set the variable that tracks the debugging state
        _DEBUG=0      
    elif [ "$1" = "2" ] #User wants maximum debugging
    then
        echo "Maximum Debugging"
        set -xv   
        # Set the variable that tracks the debugging state
        _DEBUG=0
    else #Run/suppress a command line depending on debug level
        # If debugging is turned on, output the line
        # that this function was passed as a parameter
        if [ "${_DEBUG:-0}" -gt 0 ]
        then             
            "$@"
        fi
    fi
}
</code>
</pre>

<p class="single">
This script has two main purposes. One is to set the <code>PS4</code> variable so that line numbers are added to the debugging output to make it easier to trace errors. The other is to provide a function that takes an argument of either a number (0-2), or a command line and then decides what to do with it. If the argument is a number from 0 to 2, the function sets a debugging level accordingly. Level 0 turns off all debugging (<code class="commandline">set +xv</code>), level 1 turns on execution tracing only (<code class="commandline">set -x</code>), and level 2 turns on execution tracing and line echoing (<code class="commandline">set -xv</code>). Anything else that is passed to the function is treated as a command line that is either run or suppressed depending on what the debugging level is.
</p>
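<p class="single">
One detail to watch in <strong>Listing 20</strong>: all three level branches store 0 in <code>_DEBUG</code>, so the final <code>-gt 0</code> test never passes and <code>DEBUG</code>-prefixed command lines are always suppressed. The sketch below is a trimmed-down variant of my own, not a drop-in replacement, that stores the chosen level instead. It leaves out the <code>set -x</code>/<code>set -v</code> calls and the status messages to keep the example short.
</p>

```shell
#!/bin/bash -

# Variant of the DEBUG function that remembers the chosen level in
# _DEBUG, so that DEBUG-prefixed lines run at levels 1 and 2.
_DEBUG=0

DEBUG()
{
    case "$1" in
    0|1|2)  _DEBUG=$1 ;;            # A level: remember it
    *)      if [ "$_DEBUG" -gt 0 ]  # Anything else: a command line
            then
                "$@"
            fi
            ;;
    esac
}

DEBUG 1
DEBUG echo "Shown at level 1"
DEBUG 0
DEBUG echo "Suppressed at level 0"
```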

<p class="single">
As always, there are many ways to improve this script. One would be to add more debugging levels to it. I created three (0-2), which accommodated only the <code class="optionsonly">-x</code> and <code class="optionsonly">-v</code> options. You could add another level for the <code class="optionsonly">-u</code> option, or create your own custom levels. <strong>Listing 21</strong> shows an implementation of our simple modular debugging script.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 21</strong></em></p>
<code>
#!/bin/bash - 

# File: debug_module_test.sh
# Used as a test of the debug_module.sh script

# Source the debug_module.sh script so that its
# function(s) will be used as this script's own
. ./debug_module.sh

# Parse the command line options and set this script up for use
while getopts "d:h" opt
do
    case $opt in
    d)    _DEBUG=$OPTARG # Enable debugging
          DEBUG $_DEBUG          
          ;;
    h)    echo "Usage: $0 [-dh]" #Give the user usage info
          echo "  -d   Enables debugging mode"
          echo "  -h   Displays this help message"
          exit 0
          ;;    
    '?')  echo "$0: Invalid Option - $OPTARG"
          echo "Usage: $0 [-dh]"
          exit 1
          ;;   
    esac
done

# Begin our test statements
DEBUG echo "Debugging 1"

DEBUG echo "Debugging 2"

echo "Regular Output Line"

# Turn debugging off
DEBUG 0

# Test to make sure debugging is off
DEBUG echo "Debugging 3"

# You can also create your own custom debugging output sections
_DEBUG=2  #Manually set debugging back to max for last section
[ $_DEBUG -gt 0 ] &#038;&#038; echo "First debugging level"
[ $_DEBUG -gt 1 ] &#038;&#038; echo "Second debugging level"
</code>
</pre>

<p class="single">
The first statement that you see in the <strong>Listing 21</strong> script is a source statement reading the modular debugging script (<code>debug_module.sh</code>). This treats the debugging script as if it were part of the script we're currently running. The next major section that you see is the while loop that parses the command line options and arguments. The main option to be concerned with is "d", since it's the one that enables or disables debugging output. The colon after the <code>d</code> in the <code class="commandline">getopts "d:h"</code> statement tells <code>getopts</code> that the <code class="optionsonly">-d</code> option requires an argument on the command line. The user passes a 0, 1, or 2 to the option and that in turn sets the debugging level via the <code>_DEBUG</code> variable and the <code>DEBUG</code> function. The <code>DEBUG</code> function is called 4 more times throughout the rest of the script. Three of those times it is used as a switch to run or suppress a line of the script, and once it is used to reset the debugging level to 0 (debugging off).
</p>

<p class="single">
The last three lines of the script are a little different. I put them in there to show how you could implement your own custom debugging functionality. In the first of those lines, the <code>_DEBUG</code> variable is set to 2 (maximum debugging output). The next two lines are used to select how much debugging output you see. When you set <code>_DEBUG</code> to 1, only the line "First debugging level" is output. If you set <code>_DEBUG</code> to 2 as in the script, the conditions for both the "First debugging level" (> 0) and the "Second debugging level" (> 1) statements are met, so both lines are output. <strong>Listing 22</strong> shows the output that you get from running this script, and if you look at the bottom you'll see that the lines "First debugging level" and "Second debugging level" are both output.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 22</strong></em></p>
<code>
$ ./debug_module_test.sh -d 1
Minimal Debugging
Line 29 : _DEBUG=0
Line 11 : getopts d:h opt
Line 30 : DEBUG echo 'Debugging 1'
Line 18 : '[' echo = 0 ']'
Line 24 : '[' echo = 1 ']'
Line 30 : '[' echo = 2 ']'
Line 39 : '[' 0 -gt 0 ']'
Line 32 : DEBUG echo 'Debugging 2'
Line 18 : '[' echo = 0 ']'
Line 24 : '[' echo = 1 ']'
Line 30 : '[' echo = 2 ']'
Line 39 : '[' 0 -gt 0 ']'
Line 34 : echo 'Regular Output Line'
Regular Output Line
Line 37 : DEBUG 0
Line 18 : '[' 0 = 0 ']'
Line 20 : echo 'Debugging Off'
Debugging Off
Line 21 : set +xv
First debugging level
Second debugging level
</code>
</pre>
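
<p class="single">
If you end up writing many of these level checks, one option is to wrap the comparison in a small function. The following is only a sketch and is not part of <code>debug_module.sh</code>. The <code>debug_msg</code> name is made up for this example, and it assumes that <code>_DEBUG</code> holds a whole number.
</p>

```shell
#!/bin/bash -

# Hypothetical helper (not part of debug_module.sh): print a message
# only when the current _DEBUG level is at least the requested level.
_DEBUG=${_DEBUG:-0}

debug_msg() {
    local level=$1
    shift
    if [ "$_DEBUG" -ge "$level" ]; then
        echo "$@"
    fi
    return 0
}

# With the level set to 1, only the first message is printed.
_DEBUG=1
debug_msg 1 "First debugging level"
debug_msg 2 "Second debugging level"
```

<p class="single">
The first argument is the lowest debugging level at which the message should appear, so raising <code>_DEBUG</code> to 2 would print both lines.
</p>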

<p class="single">
This next script is somewhat like an automated unit test. It's a wrapper script that automatically runs another script with varying combinations of options and arguments so that you can easily look for errors. It takes some time up front to create this script, but it allows you to quickly check whether any changes you make to the script being tested might cause problems for the end user. It could take a lot of time to step through and test all of the option/argument combinations manually on a complex script, and with that extra work (if we're honest) this test might get left out altogether. That's where the automation of the script in <strong>Listing 23</strong> comes in.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 23</strong></em></p>
<code>
#!/bin/bash - 

# File unit_test.sh
# A wrapper script that automatically runs another script with
# a varying combination of predefined options and arguments, 
# to help find any errors.

# Variables to make the script a little more readable.
_TESTSCRIPT=$1 #The script that the user wants to test
_OPTSFILE=$2   #The file holding the predefined options
_ARGSFILE=$3   #The file holding the predefined arguments

# Read the options and arguments from their files into arrays.
_OPTSARRAY=($(cat $_OPTSFILE))
_ARGSARRAY=($(cat $_ARGSFILE))

# The string that holds the option/argument combos to try.
_TRIALSTRING=""

# Step through all of the arguments one at a time.
for _ARG in ${_ARGSARRAY[*]}
do
    # The string of multiple command line options that we'll 
    # build as we step through the available options.
    _OPTSTRING=""
    
    # Step through all of the options one at a time.
    for _OPT in ${_OPTSARRAY[*]}
    do
        # Append the new option onto the multi-option string.
        _OPTSTRING="${_OPTSTRING}$_OPT "
        
        # Accumulate the command lines that will be tacked onto
        # the command as we're testing it.
        _TRIALSTRING="${_TRIALSTRING}${_OPT} $_ARG\n" #Single option    
        _TRIALSTRING="${_TRIALSTRING}${_OPTSTRING}$_ARG\n" #Multi-option
    done
done

# Change the Internal Field Separator to avoid newline/space troubles
# with the command list array assignment.
IFS=":"

# Sort the lines and make sure we only have unique entries. This could 
# be taken care of by more clever coding above, but I'm going to let 
# the shell do some extra work for me instead. An array is used to hold
# the command lines.
_CLIST=($(echo -e $_TRIALSTRING | sort | uniq | sed '/^$/d' | tr "\n" ":"))

# Step through each of the command lines that were built.
for _CMD in ${_CLIST[*]}
do
    # We can pipe the full concatenated command string into bash to run it.
    echo $_TESTSCRIPT $_CMD | bash
done
</code>
</pre>

<p class="single">
There are two files that I created to go along with this test script. The first is <code>sample_opts</code>, which holds a single line of possible options separated by spaces (<code>-d -v -q</code>). These options stand for debugging mode, verbose mode, and quiet mode respectively. The second file that I created is <code>sample_args</code>, which contains two possible arguments separated by a space (<code>/etc/passwd /etc/shadow</code>). I'll run our <code>unit_test.sh</code> script by passing it the name of the script to test, the <code>sample_opts</code> file, and the <code>sample_args</code> file. For this example, it really doesn't matter what the test script (<code>./test_script.sh</code>) is designed to do. We just provide the options and arguments that we want to test, and that's all the <code>unit_test.sh</code> script needs to know. <strong>Listing 24</strong> shows what happens when I run the test.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 24</strong></em></p>
<code>
$ ./unit_test.sh ./test_script.sh sample_opts sample_args
Debug mode
Debug mode
Debug mode
Verbose mode
Debug mode
Verbose mode
Debug mode
Verbose mode
Quiet mode
The -v and -q options are conflicting.
Debug mode
Verbose mode
Quiet mode
The -v and -q options are conflicting.
Quiet mode
Quiet mode
Verbose mode
Verbose mode
</code>
</pre>

<p class="single">
Notice that the output from the unit test script shows that the <code class="optionsonly">-v</code> and <code class="optionsonly">-q</code> options cause a conflict. I have hard-coded that error in the test script for clarity, but in everyday use you would have to look for things like real errors or output that doesn't match what is expected. The error about the <code class="optionsonly">-v</code> and <code class="optionsonly">-q</code> options makes sense in this case because you wouldn't want to run verbose (chatty) mode and quiet (non-chatty) mode at the same time. They are mutually exclusive options that should not be used together. This unit test script not only finds errors that you may miss with manual inspection, it also allows you to easily recheck your script whenever you make a change, and ensures that your script is checked the same way every time.
</p>

<p class="single">
There are a lot of improvements that can be made to this unit test script. For starters, the script doesn't check every possible combination of options. It's limited by the order that the options are in the <code>sample_opts</code> file. The script never reorders those options. Another improvement would be to have the script automatically check for common errors like <code class="error">illegal option</code>, <code class="error">file not found</code>, etc. As it stands now though, you can pipe the output of the script to <code>grep</code> in order to look for a specific error yourself.
</p>
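
<p class="single">
As a starting point for the first improvement, here is a rough sketch of one way to list every non-empty combination of the options instead of only the growing prefix that <code>unit_test.sh</code> builds. The <code>all_option_subsets</code> function name is made up for this example. It keeps the options in the order they were given, so it still does not try reordered options.
</p>

```shell
#!/bin/bash -

# Hypothetical helper: print every non-empty subset of the options
# passed as arguments, one subset per line, keeping the given order.
all_option_subsets() {
    local opts=("$@")
    local count=${#opts[@]}
    local total=$(( 2 ** count ))
    local mask=1
    local bit line

    # Each value of mask from 1 to (2^count - 1) selects one subset.
    while [ "$mask" -lt "$total" ]; do
        line=""
        bit=0
        while [ "$bit" -lt "$count" ]; do
            # Include option number "bit" when that bit of mask is set.
            if [ $(( (mask >> bit) % 2 )) -eq 1 ]; then
                line="${line}${opts[$bit]} "
            fi
            bit=$(( bit + 1 ))
        done
        # printf is used so that options like -e are never swallowed by echo.
        printf '%s\n' "${line% }"
        mask=$(( mask + 1 ))
    done
}

all_option_subsets -d -v -q
```

<p class="single">
With three options this prints seven lines, including combinations such as <code>-d -q</code> and <code>-v -q</code> that the original loop never generates. Each line could then be paired with the arguments in the same way that <code>unit_test.sh</code> does.
</p>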

<p><a name="troubleshooting"></a></p>
<h3>Troubleshooting</h3>
<p class="single">
The version of BASHDB that came with my chosen Linux distribution had a bug causing an error when a BASHDB function tried to return the value of -1. The problem went away though once I downloaded and compiled the latest version straight from the <a href="http://bashdb.sourceforge.net/">BASHDB website</a>.
</p>

<p class="single">
If a script you're debugging causes BASHDB to hang, you can try the CTRL+C key combination. This should exit from the script you're debugging and return you to the BASHDB prompt.
</p>

<h3>Conclusion</h3>
<p class="single">
There are quite a few tools and methods at your disposal when debugging scripts. From BASH command line options, to a full debugger like BASHDB, to your own custom debugging and test scripts, there's a lot of room for creativity in making your scripts more error-free. Better and more thorough debugging of your scripts from the outset will help lessen problems down the line, reducing downtime and user frustration. In the future, I'll talk about handling runtime errors and security as the next steps in ensuring the quality and reliability of your shell scripts. Look for another post in this series soon.
</p>
<!-- google_ad_section_end -->

<p><a name="video"></a></p>
<h3>Video</h3>
<p class="single">
To enhance this post, I've provided a video so that you can see a general overview of the provided examples. Browsers supporting the HTML5 video tag should present you with a <a href="http://theora.org/">Theora</a> (ogg/ogv) video, and browsers lacking that support should present you with a Flash substitute via <a href="http://flowplayer.org/">Flowplayer</a>. You can also download the Theora version of the video <a href="/downloads/blog/post2/Device_Or_Resource_Busy_Video.ogg">here</a> by right clicking on the link. 
</p>
<p>
<h4>General Debugging Video</h4>
<video controls><br />
    <source src="/downloads/blog/Better_Scripts_Part_1/Better_Scripts_Part_1_Video_1.ogg" type="video/ogg" style="margin-left:0px;" /><br />
<object id="flowplayer" width="592" height="432" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/Better_Scripts_Part_1/Better_Scripts_Part_1_Video_1.flv", "autoPlay":false}}' />             
</object>
</video>
<br />
<br />

<h4>BASHDB Video</h4>
<video controls><br />
    <source src="/downloads/blog/Better_Scripts_Part_1/Better_Scripts_Part_1_Video_2.ogg" type="video/ogg" style="margin-left:0px;" /><br />
<object id="flowplayer" width="592" height="432" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/Better_Scripts_Part_1/Better_Scripts_Part_1_Video_2.flv", "autoPlay":false}}' />             
</object>
</video>
<br />
</p>

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
I have provided an audio transcript of this post with commentary on each of the listings. Click on a format below to listen in a new window or right click and save the audio to listen to later.
</p>
<p>Get An Audio Transcript Of This Post With Author's Commentary On Listings<br />
<a href="/downloads/blog/Better_Scripts_Part_1/Better_Scripts_Part_1.ogg" target="_blank">ogg</a> (41 MB) | <a href="/downloads/blog/Better_Scripts_Part_1/Better_Scripts_Part_1.mp3" target="_blank">mp3</a> (28.6 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<ol>
<li><a href="http://www.amazon.com/gp/product/143021841X?ie=UTF8&#038;tag=innovatechn08-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=143021841X">Expert Shell Scripting (Expert's Voice in Open Source) Book</a><img src="http://www.assoc-amazon.com/e/ir?t=innovatechn08-20&#038;l=as2&#038;o=1&#038;a=143021841X" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />
</li>
<li><a href="http://www.amazon.com/gp/product/0596009658?ie=UTF8&#038;tag=innovatechn08-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=0596009658">Learning the bash Shell: Unix Shell Programming (In a Nutshell (O'Reilly))</a><img src="http://www.assoc-amazon.com/e/ir?t=innovatechn08-20&#038;l=as2&#038;o=1&#038;a=0596009658" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />
</li>
<li><a href="http://www.sun.com/bigadmin/content/submitted/script_debug.jsp">BigAdmin Community Debugging Tip</a></li>
<li><a href="http://docstore.mik.ua/orelly/unix/upt/ch46_01.htm">Shell Script Debugging Gotchas</a></li>
<li><a href="http://www.cyberciti.biz/tips/debugging-shell-script.html">NixCraft Debugging Article</a></li>
<li><a href="http://www.linuxjournal.com">Linux Journal, April 2010, Work The Shell, By Dave Taylor, "Our Twitter Autoresponder Goes Live!", pp 24-26</a></li>
<li><a href="http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_02_03.html">The Linux Documentation Project Debugging Article</a></li>
<li><a href="http://bashdb.sourceforge.net/">BASHDB Homepage</a></li>
<li><a href="http://bashdb.sourceforge.net/bashdbOutline.html">BASHDB Documentation</a></li>
<li><a href="http://raz.cx/blog/2005/08/handy-bash-debugging-trick.html">Line Number Output In set -x Debugging Output</a></li>
<li><a href="http://www.dedoimedo.com/computers/bash-tricks.html">6 Cool Bash Tricks Article</a></li>
<li><a href="http://www.linux.com/archive/articles/114359">Using VIM as a BASH IDE</a></li>
<li><a href="http://www.softpanorama.org/Scripting/Shellorama/bash_debugging.shtml">General BASH Debugging Info</a></li>
<li><a href="http://www.linuxtopia.org/online_books/advanced_bash_scripting_guide/debugging.html">Good Debugging Reference With Sample Error-Filled Scripts</a></li>
<li><a href="http://wiki.bash-hackers.org/scripting/debuggingtips">Good Debugging Tips Page By Bash-Hackers</a></li>
<li><a href="http://linux-faq.blogspot.com/2009/05/debugging-shell-scripts-on-unixlinux.html">Modularizing The Debug Function To A Separate Script</a></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=1395</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Innovations Adds LPIC Classes and Social Media</title>
		<link>http://www.innovationsts.com/blog/?p=1291</link>
		<comments>http://www.innovationsts.com/blog/?p=1291#comments</comments>
		<pubDate>Thu, 27 May 2010 14:06:32 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[General Info]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[updates]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=1291</guid>
		<description><![CDATA[
Here are a couple of important updates about what&#8217;s going on at Innovations Technology Solutions.
Linux Training
First of all, I&#8217;m proud to announce that Innovations Technology Solutions is now a Linux Professional Institute Approved Training Partner. If you take a look in our navigation bar to the left (also Figure 1), you&#8217;ll notice that there is [...]]]></description>
			<content:encoded><![CDATA[<br />
<p>Here are a couple of important updates about what&#8217;s going on at Innovations Technology Solutions.</p>
<h3>Linux Training</h3>
<p><a href="http://www.lpi.org"><img src="/images/lpi_atp_small_medium.png" style="float:right;" /></a>First of all, I&#8217;m proud to announce that Innovations Technology Solutions is now a Linux Professional Institute Approved Training Partner. If you take a look in our navigation bar to the left (also <strong>Figure 1</strong>), you&#8217;ll notice that there is a new button called &#8220;Training&#8221;. Clicking this button will take you to a page where you can find the latest information on upcoming classes and sign up to attend one. The homepage has also been changed so that it features a short introductory section on training at Innovations (<strong>Figure 2</strong>).
</p>
<p>Innovations will begin by offering two classes starting in July, with each class corresponding to one of the tests required to get your LPIC-1 certification. The classes will utilize the excellent courseware developed by <a href="http://www.gurulabs.com">Guru Labs</a>, and a <a href="http://www.prometric.com">Prometric</a> test voucher will be included in the price of each class.
</p>
<table>
<tr>
<td style="width:40%;"><img src="/images/blog/post_4/Figure_1.png" /></td>
<td style="width:20%;"></td>
<td style="width:40%;"><img src="/images/blog/post_4/Figure_2.png" /></td>
</tr>
<tr>
<td style="color:black;text-align:center;width:40%"><strong style="color:grey;"><em>Figure 1</em></strong></td>
<td style="width:20%;"></td>
<td style="color:black;text-align:center;width:40%"><strong style="color:grey;"><em>Figure 2</em></strong></td>
</tr>
</table>

<p><h3>Social Media</h3>
<table style="float:right;"><tr><td><a href="http://www.innovationsts.com/index.html#socialmedia"><img src="/images/social_media_addition_small.png" style="float:right;" /></a></td></tr><tr><td style="text-align:center;"><strong style="color:grey;"><em>Figure 3</em></strong></td></tr></table>Next, you might have already noticed that social media badges have been added to the homepage (<strong>Figure 3</strong>) and to this blog (left). This is so that you can have the easiest access possible to all of the tips, tricks, how-tos, and updates coming from Innovations. Subscribe, become a fan, follow, and always stay up-to-date with the latest news and information.</p>

<h3>Conclusion</h3>
<p>We hope that these changes will further our mission to help you use Linux and other open source technologies in the most practical, productive, and profitable ways possible. Click <a href="http://www.innovationsts.com/index.html#socialmedia">here</a>, or on the image in <strong>Figure 3</strong> to go to the homepage and see the changes there for yourself.</p>

<p>As always, we value your input. Please let us know what you think about the changes.</p>
<br />
<br />
<br />]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=1291</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Shared Library Issues In Linux</title>
		<link>http://www.innovationsts.com/blog/?p=1042</link>
		<comments>http://www.innovationsts.com/blog/?p=1042#comments</comments>
		<pubDate>Fri, 09 Apr 2010 16:54:05 +0000</pubDate>
		<dc:creator>Jeremy Mack Wright</dc:creator>
				<category><![CDATA[Fixes]]></category>
		<category><![CDATA[How-Tos]]></category>
		<category><![CDATA[System Administration]]></category>

		<guid isPermaLink="false">http://www.innovationsts.com/blog/?p=1042</guid>
		<description><![CDATA[

Quick Start

If you just want enough information to fix your problem quickly, you can read the How-To section of this post and skip the rest. I would highly recommend reading everything though, as a good understanding of the concepts and commands outlined here will serve you well in the future. We also have Video and [...]]]></description>
			<content:encoded><![CDATA[<!-- google_ad_section_start -->
<br />
<h3>Quick Start</h3>
<p class="narrow">
If you just want enough information to fix your problem quickly, you can read the <a href="#howto">How-To</a> section of this post and skip the rest. I would highly recommend reading everything though, as a good understanding of the concepts and commands outlined here will serve you well in the future. We also have <a href="#video">Video</a> and <a href="#audio">Audio</a> included with this post that may be a good quick reference for you. Don&#8217;t forget that the man and info pages of your Linux/Unix system can be an invaluable resource as well when you&#8217;re trying to solve problems.
</p>

<h3>Preface</h3>
<p class="narrow">
To make things easier on you, all of the black command line and script areas are set up so that you can copy the text from them. This does make using the commands easier, but if you&#8217;re not already familiar with the concepts presented here, typing the commands yourself and working through why you&#8217;re typing them will help you learn more. If you hit problems along the way, take a look at the <a href="#troubleshooting">Troubleshooting</a> section near the end of this post for help.
</p>
<p class="narrow">
There are formatting conventions that are used throughout this post that you should be aware of. The following is a list outlining the color and font formats used.
</p>
<p>
<code>
    Command Name or Directory Path
</code>
<br />
<code class="warnerror">
    Warning or Error
</code>
<br />
<code class="commandline">
    Command Line Snippet With Commands/Options/Arguments
</code>
<br />
<code class="optionsonly">
    Command Options and Their Arguments Only
</code>
<br />
    <span style="color:#FFAF3E;">Hyperlink</span>
</p>

<p class="narrow">
Where listings of command options are made available, anything with square brackets around it (&#8220;<code>[</code>&#8221; and &#8220;<code>]</code>&#8221;) is an argument to the option, and a pipe (&#8220;<code>|</code>&#8221;) means that you can choose one of two alternatives (<code>[4|6]</code> means choose 4 or 6).
</p>

<h3>Overview</h3>
<p class="narrow">
This post is geared more toward system administrators than software developers, but anyone can make good use of the information that you&#8217;re going to see here. The <a href="#resources">Resources</a> section holds links to take your study further, even into the developer realm. I&#8217;m going to start off by giving you a brief background on shared libraries and some of the rules that apply to their use. <strong>Listing 1</strong> shows an example of an error you might see after installing PostgreSQL via a bin installer file. In this post, I&#8217;m going to step through some commands and techniques to help you deal with this type of shared library problem. I&#8217;ll also work through resolving the error in <strong>Listing 1</strong> as an example, and give you some tips and tricks as well as items to help you if you get stuck.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 1</strong></em></p>
<code>
$ ./psql 
./psql: error while loading shared libraries: libpq.so.5: cannot open shared object file: No such file or directory
</code>
</pre>

<a name="background"></a>
<h3>Background</h3>
<p class="narrow">
Shared libraries are one of the many strong design features of Linux, but can lead to headaches for inexperienced users, and even experienced users in certain situations. Shared libraries allow software developers to keep the size of their application storage and memory footprints down by using common code from a single location. The <code>glibc</code> library is a good example of this. There are two standardized locations for shared libraries on a Linux system, and these are the <code>/lib</code> and <code>/usr/lib</code> directories. On some distributions <code>/usr/local/lib</code> is included, but check the documentation for your specific distribution to be sure. These are not the only locations that you can use for libraries though, and I&#8217;ll talk about how to use other library directories later. According to the Filesystem Hierarchy Standard (FHS), <code>/lib</code> is for shared libraries and kernel modules that are needed to boot the system and to run the commands in the root filesystem (<code>/bin</code> and <code>/sbin</code>), and <code>/usr/lib</code> holds most of the internal libraries that are not meant to be executed directly by users or shell scripts. The <code>/usr/local/lib</code> directory is not described in any detail by the FHS, but if it exists on a distribution it normally holds libraries that aren&#8217;t a part of the standard distribution, including libraries that the system administrator has compiled/installed after the initial setup. There are also other directories, such as <code>/lib/security</code> which holds PAM modules, but for our discussion we&#8217;ll focus on <code>/lib</code> and <code>/usr/lib</code>.
</p>

<p class="narrow">
The counterpart to the dynamically linked (shared) library is the statically linked library. Whereas dynamically linked libraries are loaded and used as they are needed by the applications, statically linked libraries are either built into, or closely associated with, a program at the time it is compiled. A couple of the situations where static libraries are used are when you&#8217;re trying to work around an odd/outdated library dependency, or when you&#8217;re building a self-contained rescue system. Static linking typically makes the resulting application faster and more portable, but increases the size (and thus the memory and storage footprint) of the binary. There is also a multiplication of the size of a static library&#8217;s footprint if more than one program uses it. For instance, one program using a library that is 10 MB in size just consumes 10 MB of memory (1 program x 10 MB), but if you run 10 programs with the same library compiled into them, you end up with 100 MB of memory consumed (10 programs x 10 MB). Also, when programs are statically linked, they can&#8217;t take advantage of updates made to the libraries that they depend on. They are locked into whatever version of the library they were compiled with. Programs that depend on dynamically linked libraries refer to a specific file on the Linux file system, and so when that file is updated, the program can automatically take advantage of the new features and fixes the next time it loads.
</p>

<p class="narrow">
Shared libraries typically have the extension <code>.so</code>, which stands for Shared Object. The extension is followed by a version number, which can include major and minor version numbers. A system of symbolic links is used to point the majority of programs to the latest and greatest library version, while still allowing a minority of programs to use older libraries. <strong>Listing 2</strong> shows output that I modified to illustrate this point.
</p>
<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 2</strong></em></p>
<code>
$ ls -l | grep libread
lrwxrwxrwx  1 root root      18 2010-03-03 11:11 libreadline.so.5 -> libreadline.so.5.2
-rw-r--r--  1 root root  217188 2009-08-24 19:10 libreadline.so.5.2
lrwxrwxrwx  1 root root      18 2010-02-02 09:34 libreadline.so.6 -> libreadline.so.6.0
-rw-r--r--  1 root root  225364 2009-09-23 08:16 libreadline.so.6.0
</code>
</pre>

<p class="narrow">
You can see in the output that there are two versions of <code>libreadline</code> installed side-by-side (5.2 and 6.0). The version numbers are in the form <em>major</em>.<em>minor</em>, so 5 and 6 are major version numbers, with 2 and 0 being minor version numbers. You can usually mix and match libraries with the same major version number and differing minor numbers, but it can be a bad idea to use libraries with different major numbers in place of one another. Major version number changes usually represent significant changes to the interface of the library, which are incompatible with previous versions. Minor version numbers are only changed when an update such as a bug fix is added without significantly changing how the library interacts with the outside world. Another thing that you&#8217;ll notice in <strong>Listing 2</strong> is that there are links created from <code>libreadline.so.5</code> to <code>libreadline.so.5.2</code> and from <code>libreadline.so.6</code> to <code>libreadline.so.6.0</code>. This is so that programs that depend on the 5 or 6 series of the libraries don&#8217;t have to figure out where the newest version of the library is. If an application works with major version 6 of the library, it doesn&#8217;t care if it grabs 6.0, 6.5, or 6.9 as long as it&#8217;s compatible, so it just looks at the base name of the library and takes whatever that&#8217;s linked to. There are also a couple of other situations that you&#8217;re likely to encounter with this linking scheme. The first is that you may see a link file name containing no version numbers (<code>libreadline.so</code>) that points to the actual library file (<code>libreadline.so.6.0</code>). Also, even though I said that libraries with different major version numbers are risky to mix, there are situations where you will see an earlier major version number (<code>libreadline.so.5</code>) linked to a newer version number of the library (<code>libreadline.so.6.0</code>). 
This should only happen when your distribution maintainers or system administrators have made sure that nothing will break by doing this. <strong>Listing 3</strong> shows an example of the first situation. 
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 3</strong></em></p>
<code>
$ ls -l | grep ".*so "
lrwxrwxrwx  1 root root      18 2010-02-02 14:48 libdbus-1.so -> libdbus-1.so.3.4.0
</code>
</pre>
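
<p class="narrow">
To get a feel for this link scheme without touching a real library directory, you can rebuild it in a scratch directory. The following is only a sketch. The <code>libdemo</code> file names and the <code>make_demo_links</code> function are made up for this example, and the &#8220;library&#8221; is just an empty file.
</p>

```shell
#!/bin/bash -

# Hypothetical helper: create an empty stand-in for a library file in
# the given directory, plus the two kinds of links described above.
make_demo_links() {
    local dir=$1

    # The "real" file carries the full major.minor version number.
    touch "$dir/libdemo.so.6.0"

    # Link named after the major version only, as in Listing 2.
    ln -s libdemo.so.6.0 "$dir/libdemo.so.6"

    # Link with no version number at all, as in Listing 3.
    ln -s libdemo.so.6.0 "$dir/libdemo.so"
}

# Build the links in a throwaway directory and show where each points.
demo_dir=$(mktemp -d)
make_demo_links "$demo_dir"

for link in libdemo.so libdemo.so.6
do
    echo "$link -> $(readlink "$demo_dir/$link")"
done

rm -rf "$demo_dir"
```

<p class="narrow">
Both links resolve to <code>libdemo.so.6.0</code>, so a later minor release would only need the links to be pointed at the new file.
</p>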

<p class="narrow">
All things considered, the shared library methodology and numbering scheme do a good job of ensuring that your software can maintain a smaller footprint, make use of the latest and greatest library versions, and still have backwards compatibility with older libraries when needed. With this said, the shared library model isn&#8217;t perfect. There are some disadvantages to using shared libraries, but those disadvantages are typically considered to be outweighed by the benefits. One of the disadvantages is that shared libraries can slow the load time of a program. This is only a problem the first time that the library is loaded though. After that, the library is in memory and other applications that are launched won&#8217;t have to reload it. One of the most potentially dangerous drawbacks of shared libraries is that they can create a central point of failure for your system. If there is a library that a large set of your programs rely on and it gets corrupted, deleted, over-written, etc., all of those programs are probably going to break. If any of those programs that were just taken down are needed to boot your Linux system, you&#8217;ll be dead in the water and in need of a rescue CD.
</p>

<p class="narrow">
While I would argue that dependency chains are not really a &#8220;problem&#8221;, they can work hardships on a system administrator. A dependency chain happens when one library depends on another library, then that one depends on another, and another, and so on. When dealing with a dependency chain, you may have satisfied all of the first level dependencies, but your program still won&#8217;t run. You have to go through and check each library in turn for a dependency chain, and then follow that chain all the way through, filling in the missing dependencies as you go.
</p>
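
<p class="narrow">
When you are working through a chain like this, it can help to reduce the output of <code>ldd</code> to just the libraries that could not be located. The following is only a sketch. The <code>missing_libs</code> function name is made up for this example, it assumes that unresolved libraries are reported with the text <code>=> not found</code>, and it is fed sample text rather than real <code>ldd</code> output so that it runs anywhere.
</p>

```shell
#!/bin/bash -

# Hypothetical helper: read ldd-style output on standard input and
# print the name of each library that is reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Sample text in the format that ldd prints, used here in place of a
# real ldd run so that the example does not depend on any program.
printf '\tlibpq.so.5 => not found\n\tlibc.so.6 => /lib/libc.so.6 (0xb7d5f000)\n' | missing_libs
```

<p class="narrow">
On a real system you could try something like <code class="commandline">ldd ./psql | missing_libs</code>, and then repeat the check against each library that you install to satisfy a dependency, following the chain until nothing is reported as missing.
</p>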

<p class="narrow">
One final problem with shared libraries that I&#8217;ll mention again is version compatibility issues. You can end up with a situation where two different applications require different versions of the same library &#8211; that aren&#8217;t compatible. That is the reason for the version numbering system that I talked about above, and robust package management systems have helped ease shared library problems from the user&#8217;s perspective, but they still exist in certain situations. Any time that you compile and/or install an application/library yourself on your Linux system, you have to keep an eye out for problems since you don&#8217;t have the benefit of a package manager ensuring library compatibility.
</p>

<h3>Introducing ld-linux.so</h3>
<p class="single">
<code>ld-linux.so</code> (or <code>ld.so</code> for older a.out binaries) is itself a library, and is responsible for managing the loading of shared libraries in Linux. For the purposes of this post, we&#8217;ll be working with <code>ld-linux.so</code>, and if you need or want to learn more about the older style a.out loading/linking, have a look at the <a href="#resources">Resources</a> section. The <code>ld-linux.so</code> library reads the <code>/etc/ld.so.cache</code> file which is a non-human readable file that is updated when you run the <code>ldconfig</code> command. The way that shared libraries are loaded is that <code>ld-linux.so</code> checks to see what paths to look for the libraries in by checking the value of the <code>LD_LIBRARY_PATH</code> environment variable, then the contents of the <code>/etc/ld.so.cache</code> file, and finally the default path of <code>/lib</code> followed by the <code>/usr/lib</code> directory.
</p>
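
<p class="single">
To make that search order concrete, here is a minimal sketch in shell that prints the locations in the order just described. The function name <code>print_search_order</code> is made up for this illustration, and the sketch only approximates what the loader does: it does not decode <code>/etc/ld.so.cache</code> or look at anything embedded in the binary itself.
</p>

```shell
# Print the places the loader consults, in the order described above.
print_search_order() {
    local IFS=':'
    local dir
    # 1. Entries from LD_LIBRARY_PATH, if it is set at all
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -n "$dir" ]; then
            printf 'LD_LIBRARY_PATH  %s\n' "$dir"
        fi
    done
    # 2. The cache that ldconfig maintains
    printf 'cache            %s\n' /etc/ld.so.cache
    # 3. The default directories
    printf 'default          %s\n' /lib /usr/lib
}
```

<p class="single">
On a system where <code>LD_LIBRARY_PATH</code> is not set, calling <code>print_search_order</code> prints only the cache and the two default directories.
</p>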

<p class="single">
The <code>LD_LIBRARY_PATH</code> environment variable is a colon separated list that preempts all of the other library paths in the <code>ld-linux.so</code> search order. This means that you can use it to temporarily alter library paths when you&#8217;re trying to test a new library before rolling it out to the entire system, or to work around problems. This variable is typically not set by default on Linux distributions, and should not be used as a permanent fix. Use it with care, and give preference to the other library search path configuration methods. A handy thing about the <code>LD_LIBRARY_PATH</code> variable is that since it&#8217;s an environment variable, you can set it on the same line as a command and the new value will only affect the command, and not the parent environment. So, you would issue a command line like <code class="commandline">LD_LIBRARY_PATH="/home/user/lib" ./program</code> to run <code>program</code> and force it to use the experimental shared libraries in <code>/home/user/lib</code> in preference to any others on the system. The shell that you run <code>program</code> in never sees the change to <code>LD_LIBRARY_PATH</code>. Of course you can also use the <code>export</code> command to set this variable, but be careful because doing this will affect every program that you subsequently start from that shell. One final thing about the <code>LD_LIBRARY_PATH</code> variable is that you don&#8217;t have to run <code>ldconfig</code> after changing it. The changes take effect immediately, unlike changes to <code>/lib</code>, <code>/usr/lib</code>, and <code>/etc/ld.so.conf</code>. I&#8217;ll explain more about <code>ldconfig</code> later.
</p>
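
<p class="single">
The one-line form can be wrapped in a small helper if you find yourself typing it often. This is only a sketch, and <code>run_with_libs</code> is a name invented for the example. It relies on the shell behavior described above: an assignment placed in front of a command applies to that command alone.
</p>

```shell
# run_with_libs LIB_DIR COMMAND [ARGS...]
# Runs COMMAND with LD_LIBRARY_PATH set to LIB_DIR for that one process.
run_with_libs() {
    local lib_dir="$1"
    shift
    # The assignment prefix is applied to the child process only
    LD_LIBRARY_PATH="$lib_dir" "$@"
}
```

<p class="single">
Something like <code class="commandline">run_with_libs /home/user/lib ./program</code> would then behave the same as the one-liner, and <code>LD_LIBRARY_PATH</code> in the calling shell is left as it was.
</p>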

<p class="single">
You can use the <code>ld-linux.so</code> library by itself to list which libraries a program depends on. Its behavior is very much like the <code>ldd</code> command that we&#8217;ll talk about next because <code>ldd</code> is actually a wrapper script that adds more sophisticated behavior to <code>ld-linux.so</code>. In most cases <code>ldd</code> should be your preferred command for listing required shared libraries. In order to use <code>ld-linux.so.2</code> to get a listing of the depended upon libraries for the <code>ls</code> command, you would type <code class="commandline">/lib/ld-linux.so.2 --list /bin/ls</code>, swapping the 2 out for whatever major version of the library that your system is running. I&#8217;ve shown some of the command line options for <code>ld-linux.so</code> in <strong>Listing 4</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 4</strong></em></p>
<code>
--list                   Lists all library dependencies for the executable

--verify                 Verifies that the program is dynamically linked and that 
                         the ld-linux.so linker can handle it

--library-path [PATH]    Overrides the LD_LIBRARY_PATH environment variable and 
                         uses <strong>PATH</strong> instead
</code>
</pre>

<p class="single">
You can start a program directly with <code>ld-linux.so</code> by using the following command line form: <code class="commandline">/lib/ld-linux.so.2 --library-path FULL_LIBRARY_PATH FULL_EXECUTABLE_PATH</code>, where you replace 2 with whatever version of the library you are using. An example would be <code class="commandline">/lib/ld-linux.so.2 --library-path /home/user/lib /home/bin/program</code>, which would run <code>program</code> using <code>/home/user/lib</code> as the location to look for required libraries. This should be used for testing purposes only, and not as a permanent fix on a production system.
</p>

<a name="introldd"></a>
<h3>Introducing ldd</h3>
<p class="single">
The name of the <code>ldd</code> command comes from its function, which is to &#8220;List Dynamic Dependencies&#8221;. As mentioned in the previous section, by default the <code>ldd</code> command gives you the same output as issuing the command line <code class="commandline">/lib/ld-linux.so.2 --list FULL_EXECUTABLE_PATH</code>. Each library entry in the output includes a hexadecimal number which is the load address of the library, and can change from run to run. Chances are that system administrators will never even need to know what this value is, but I&#8217;ve mentioned it here because some people may be curious. <strong>Listing 5</strong> shows a few of the options for <code>ldd</code> that I use the most.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 5</strong></em></p>
<code>
-d --data-relocs        Perform data relocations and report any missing objects

-r --function-relocs    Perform relocations for both data objects and functions,
                        and report any missing objects or functions

-u --unused             Print unused direct dependencies

-v --verbose            Print all information, including e.g. symbol versioning
                        information
</code>
</pre>

<p class="single">
Keep in mind that <code>ldd</code> does not search your <code>PATH</code>, so you have to give it the path to the binary/executable for it to work. The only way to work around giving <code>ldd</code> the full path is to use <code>cd</code> to change into the directory where the binary is, or to give a path relative to your current directory. Otherwise you get an error like <code class="warnerror">ldd: ./ls: No such file or directory</code>. The only time that you would need to run <code>ldd</code> with root privileges would be if the binary has restrictive permissions placed on it.
</p>
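
<p class="single">
If you would rather not type full paths, one option is to let the shell resolve the command name first. The helper below is a sketch with an invented name, <code>ldd_by_name</code>. It assumes the name refers to a real file found through <code>PATH</code>, since <code>command -v</code> prints only the bare name for shell builtins, functions, and aliases.
</p>

```shell
# ldd_by_name COMMAND
# Looks COMMAND up in PATH and passes the resulting full path to ldd.
ldd_by_name() {
    local exe
    exe="$(command -v "$1")" || {
        printf '%s: not found in PATH\n' "$1" >&2
        return 1
    }
    ldd "$exe"
}
```

<p class="single">
With this in place, <code class="commandline">ldd_by_name ls</code> does the same job as <code class="commandline">ldd /bin/ls</code> on a system where <code>ls</code> lives in <code>/bin</code>.
</p>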

<p class="single">
As I mentioned in the <a href="#background">Background</a> section, you need to be aware of dependency chains when using shared libraries. Just because you&#8217;ve run the <code>ldd</code> command on an executable and satisfied all of its top level dependencies doesn&#8217;t mean that there aren&#8217;t more dependencies lurking underneath. If your program still won&#8217;t run, you should check each of the top level libraries to see if any of them have their own library dependencies that are unmet. You continue that process, running <code>ldd</code> on each library in each layer until you&#8217;ve satisfied all of the dependencies.
</p>
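
<p class="single">
Checking one layer down by hand gets tedious, so here is a rough sketch of how it could be automated. The function names are invented for the example, and the parsing assumes the usual <code>name =&gt; /path (address)</code> layout of <code>ldd</code> output. Paths containing spaces are not handled.
</p>

```shell
# Print the on-disk path of every library that ldd managed to resolve.
# Reads ldd output on standard input.
resolved_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# check_next_layer FILE
# Runs ldd on each library that FILE resolved and reports any library
# that is itself missing a dependency.
check_next_layer() {
    ldd "$1" | resolved_paths | while read -r lib; do
        if ldd "$lib" 2>/dev/null | grep -q 'not found'; then
            printf '%s has unmet dependencies\n' "$lib"
        fi
    done
}
```

<p class="single">
Running <code>check_next_layer</code> against a binary prints nothing when the second layer is clean. Any library it names is the next one to run <code>ldd</code> on.
</p>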

<h3>Introducing ldconfig</h3>
<p class="single">
Any time that you make changes to the installed libraries on your system, you&#8217;ll want to run the <code>ldconfig</code> command with root privileges to update your library cache. <code>ldconfig</code> rebuilds the <code>/etc/ld.so.cache</code> file of currently installed libraries based on what it first finds in the directories listed in the <code>/etc/ld.so.conf</code> file, and then in the <code>/lib</code> and <code>/usr/lib</code> directories. The <code>/etc/ld.so.cache</code> file is formatted in binary by <code>ldconfig</code> and so it&#8217;s not designed to be human readable, and should not be edited by hand. Formatting the <code>ld.so.cache</code> file in this way makes it more efficient for the system to retrieve the information. The <code>ld.so.conf</code> file may include a directive that reads <code>include /etc/ld.so.conf.d/*.conf</code> that tells <code>ldconfig</code> to check the <code>ld.so.conf.d</code> directory for additional configuration files. This allows the easy addition of configuration files to load third-party shared libraries such as those for MySQL. On some distributions, this <code>include</code> directive may be the only line you find in the <code>ld.so.conf</code> file.
</p>
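
<p class="single">
To see which directories the configuration files actually contribute, you can walk <code>ld.so.conf</code> and its <code>include</code> lines yourself. The sketch below uses an invented function name, <code>list_conf_dirs</code>, follows includes only one level deep, and ignores the less common directives, so treat it as an approximation of what <code>ldconfig</code> reads rather than a replacement for it.
</p>

```shell
# list_conf_dirs CONF_FILE
# Prints the library directories named in an ld.so.conf style file.
list_conf_dirs() {
    local conf="$1" base line pattern
    base="$(dirname "$conf")"
    # Drop comments and blank lines, then handle each remaining line
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$conf" | while read -r line; do
        case "$line" in
            include\ *)
                pattern="${line#include }"
                # A relative pattern is resolved against the directory
                # that holds the configuration file
                case "$pattern" in
                    /*) ;;
                    *) pattern="$base/$pattern" ;;
                esac
                # Left unquoted on purpose so the shell expands the wildcard
                cat $pattern 2>/dev/null | grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$'
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

<p class="single">
Calling <code class="commandline">list_conf_dirs /etc/ld.so.conf</code> gives one directory per line, in the order that the files list them.
</p>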

<p class="single">
You often need to run <code>ldconfig</code> manually because a Linux system cannot always know when you have made changes to the currently installed libraries. Many package management systems run <code>ldconfig</code> as part of the installation process, but if you compile and/or install a library without using the package management system, the system software may not know that there is a new library present. The same applies when you remove a shared library.
</p>

<p class="single">
<strong>Listing 6</strong> holds several options for the <code>ldconfig</code> command. This is by no means an exhaustive list, so be sure to check the man page for more information.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 6</strong></em></p>
<code>
-C [file]           Specifies an alternate cache <strong>file</strong> other than ld.so.cache

-f [file]           Specifies an alternate configuration <strong>file</strong> other than 
                    ld.so.conf

-n                  Processes only the directories specified on the command line,
                    skipping the standard directories and ld.so.conf (implies -N)

-N                  Only updates the symbolic links to libraries, skipping the 
                    cache rebuilding step

-p --print-cache    Lists the shared library cache, but needs to be piped to the 
                    less command because of the amount of output

-v --verbose        Gives output information about version numbers, links 
                    created, and directories scanned

-X                  Opposite of -N, it rebuilds the library cache and skips 
                    updating the links to the libraries</code>
</pre>
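
<p class="single">
The <code>-p</code> option is most useful when you filter its output for one library. As a sketch, the helper below (the name <code>cache_lookup</code> is invented here) reads <code>ldconfig -p</code> output and prints the path recorded for an exact library name. It assumes the usual <code>name (type) =&gt; path</code> layout of each line.
</p>

```shell
# cache_lookup LIBRARY_NAME
# Reads "ldconfig -p" output on standard input and prints the path (or
# paths) that the cache holds for LIBRARY_NAME.
cache_lookup() {
    awk -v name="$1" '$1 == name { print $NF }'
}
```

<p class="single">
A typical call would be <code class="commandline">/sbin/ldconfig -p | cache_lookup libpq.so.5</code>. No output means that the library is not in the cache.
</p>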

<p class="single">
<code>ldconfig</code> is not the only method used to rebuild the library cache. Gentoo handles this task in a slightly different way, which I&#8217;ll talk about next.
</p>

<h3>Introducing env-update</h3>
<p class="single">
Gentoo takes a slightly different path to updating the cache of installed libraries which includes the use of the <code>env-update</code> script. <code>env-update</code> reads library path configuration files from the <code>/etc/env.d</code> directory in much the same way that <code>ldconfig</code> reads files from <code>/etc/ld.so.conf.d</code> via the <code>ld.so.conf</code> <code>include</code> directive. <code>env-update</code> then creates a set of files within <code>/etc</code> including <code>ld.so.conf</code>. After this, <code>env-update</code> runs <code>ldconfig</code> so that it reloads the cache of libraries into the <code>/etc/ld.so.cache</code> file.
</p>

<p><a name="howto"></a></p>
<h3>How-To</h3>
<p class="single">
Hopefully by the point you&#8217;re reading this section you either have, or are beginning to get a pretty good understanding of the commands used when dealing with shared libraries. Now I&#8217;m going to take you through a sample scenario of a PostgreSQL installation running on Red Hat 5.4 to demonstrate how you would use these commands.
</p>

<p class="single">
I have downloaded a bin installer to use on my CentOS installation instead of the PostgreSQL Yum repository because I wanted to install a specific older version of Postgres outside of the package management system. In most cases you&#8217;ll want to use a repository with your package management system though, as you&#8217;ll get a more integrated installation that can be kept up to date more easily. That&#8217;s assuming that your Linux distribution offers the repository mechanism for installing and updating packages, and many distributions don&#8217;t.
</p>

<p class="single">
After installing Postgres via the bin file, I take a look around and see that the majority of the PostgreSQL files are in the <code>/opt/PostgreSQL</code> directory. I decide to experiment with the binaries under the <code>pgAdmin3</code> directory, and so I use the cd command to move to <code>/opt/PostgreSQL/8.4/pgAdmin3/bin</code>. Once I&#8217;m there, I try to run the <code>psql</code> command and get the output in <strong>Listing 7</strong> (same as <strong>Listing 1</strong>).
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 7</strong></em></p>
<code>
$ ./psql 
./psql: error while loading shared libraries: libpq.so.5: cannot open shared object file: No such file or directory
</code>
</pre>

<p class="single">
There might be some of you reading this who will realize that I could have probably avoided the library error in <strong>Listing 7</strong> by running the <code>psql</code> command from the <code>/opt/PostgreSQL/8.4/bin</code> directory. While this is true, for the sake of this example I&#8217;m going to forge ahead trying to figure out why it won&#8217;t run under the <code>pgAdmin3</code> directory.
</p>

<p class="single">
The main thing that I take away from the output in <strong>Listing 7</strong> is that there is a shared library named <code>libpq.so.5</code> that cannot be found by <code>ld-linux.so</code>. To dig just a little bit deeper, I use the <code>ldd</code> command and get the output in <strong>Listing 8</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 8</strong></em></p>
<code>
$ ldd ./psql
	linux-gate.so.1 =>  (0x003fc000)
	libpq.so.5 => not found
	libxml2.so.2 => /usr/lib/libxml2.so.2 (0x00845000)
	libpam.so.0 => /lib/libpam.so.0 (0x0054f000)
	libssl.so.4 => not found
	libcrypto.so.4 => not found
	libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00706000)
	libz.so.1 => /usr/lib/libz.so.1 (0x003d5000)
	libreadline.so.4 => not found
	libtermcap.so.2 => /lib/libtermcap.so.2 (0x00325000)
	libcrypt.so.1 => /lib/libcrypt.so.1 (0x004a3000)
	libdl.so.2 => /lib/libdl.so.2 (0x0031f000)
	libm.so.6 => /lib/libm.so.6 (0x0033f000)
	libc.so.6 => /lib/libc.so.6 (0x001d7000)
	libaudit.so.0 => /lib/libaudit.so.0 (0x00532000)
	libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x0079e000)
	libcom_err.so.2 => /lib/libcom_err.so.2 (0x0052d000)
	libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x006f6000)
	libkeyutils.so.1 => /lib/libkeyutils.so.1 (0x005ae000)
	libresolv.so.2 => /lib/libresolv.so.2 (0x00518000)
	/lib/ld-linux.so.2 (0x001b9000)
	libselinux.so.1 => /lib/libselinux.so.1 (0x003bb000)
	libsepol.so.1 => /lib/libsepol.so.1 (0x00373000)
</code>
</pre>

<p class="single">
Notice that the error given in <strong>Listing 7</strong> only gives you the first shared library that&#8217;s missing. As you can see in <strong>Listing 8</strong>, this doesn&#8217;t mean that other libraries won&#8217;t be missing as well.
</p>
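
<p class="single">
When the <code>ldd</code> output is long, it helps to pull out just the missing names. Here is a small sketch that does this. The function name <code>missing_libs</code> is invented for the example, and it relies on the &#8220;not found&#8221; wording shown in <strong>Listing 8</strong>.
</p>

```shell
# Reads ldd output on standard input and prints the name of every
# library that was reported as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}
```

<p class="single">
Run as <code class="commandline">ldd ./psql | missing_libs</code>, this gives one library name per line, ready to be searched for on the system.
</p>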

<p class="single">
My next step is to see if the missing libraries are already installed somewhere on my system using the <code>find</code> command. If the libraries are not already installed, I&#8217;ll have to use the package management system or the Internet to see which package(s) I need to install to get them. The output in <strong>Listing 9</strong> shows the output from the <code>find</code> command.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 9</strong></em></p>
<code>
$ sudo find / -name libpq.so.5
/opt/PostgreSQL/8.4/lib/libpq.so.5
/opt/PostgreSQL/8.4/pgAdmin3/lib/libpq.so.5
</code>
</pre>

<p class="single">
After looking in both of the directories shown in the output, I notice that all of my other missing libraries are housed within them. If you were just temporarily testing some new features of the <code>psql</code> command, you could use the export command to set the <code>LD_LIBRARY_PATH</code> environment variable as I have in <strong>Listing 10</strong>.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 10</strong></em></p>
<code>
$ export LD_LIBRARY_PATH="/opt/PostgreSQL/8.4/lib/"
bash-3.2$ ./psql 
Password: 
psql (8.4.3)
Type "help" for help.

postgres=#
</code>
</pre>

<p class="single">
You can see that once I&#8217;ve set the <code>LD_LIBRARY_PATH</code> variable, all I have to do is enter my PostgreSQL password and I&#8217;m greeted with the <code>psql</code> command line interface. I&#8217;ve used the <code>/opt/PostgreSQL/8.4/lib/</code> library directory instead of the one beneath the <code>pgAdmin3</code> directory as a matter of preference. In this case both directories include the same required libraries. For a permanent solution, we can add the path via the <code>ld.so.conf</code> file.
</p>

<p class="single">
I could just add <code>/opt/PostgreSQL/8.4/lib/</code> directly to the <code>ld.so.conf</code> file on its own line, but since the <code>ld.so.conf</code> file on my installation has the <code>include ld.so.conf.d/*.conf</code> directive, I&#8217;m going to add a separate conf file instead. In <strong>Listing 11</strong> you can see that I&#8217;ve echoed the PostgreSQL library path into a file called <code>postgres-i386.conf</code> under the <code>/etc/ld.so.conf.d</code> directory. After checking to make sure the file has the directory in it, I run the <code>ldconfig</code> command to update the library cache.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 11</strong></em></p>
<code>
$ sudo -s
Password: 
[root@localhost bin]# echo /opt/PostgreSQL/8.4/lib > /etc/ld.so.conf.d/postgres-i386.conf
[root@localhost bin]# cat /etc/ld.so.conf.d/postgres-i386.conf
/opt/PostgreSQL/8.4/lib
[root@localhost bin]# /sbin/ldconfig
[root@localhost bin]# exit
exit
</code>
</pre>

<p class="single">
Make sure that you unset the <code>LD_LIBRARY_PATH</code> variable afterward so that you can confirm that it was your <code>ld.so.conf</code> configuration file changes that fixed the problem, and not the environment variable. Issuing a command line such as <code class="commandline">unset LD_LIBRARY_PATH</code> will accomplish this for you.
</p>

<p class="single">
There are many scenarios beyond the one in this example, but it gives you the concepts used to work through the majority of shared library problems that you&#8217;re likely to come up against as a system administrator. If you&#8217;re interested in delving more deeply though, there are several links in the <a href="#resources">Resources</a> section that should help you.
</p>

<h3>Tips and Tricks</h3>
<ul>
<li>
I have read that running <code>ldd</code> on an untrusted program can open your system up to a malicious attack. This happens when an executable&#8217;s embedded ELF information is crafted in such a way that it will run itself by specifying its own loader. The man pages on the Ubuntu and Red Hat systems that I checked don&#8217;t mention anything about this security concern, but you&#8217;ll find a very good article by Peteris Krumins in the <a href="#resources">Resources</a> section of this post. I would suggest at least skimming Peteris&#8217; post so that you&#8217;re aware of the security implications of running <code>ldd</code> on unverified code.
</li>
<li>
Although it&#8217;s a little bit beyond the scope of this post, you can compile a program from source and manually control which libraries it links to. This is yet another way to work around library compatibility issues. You use the GNU C Compiler/GNU Compiler Collection (gcc) along with its <code class="optionsonly">-L</code> and <code class="optionsonly">-l</code> options to accomplish this. Have a look at item 13 (the YoLinux tutorial) in the <a href="#resources">Resources</a> section for an example, and the <code>gcc</code> man page for details on the options.
</li>
<li>
Have a look at the <code>readelf</code> and <code>nm</code> commands if you want a more in-depth look at the internals of the binaries and libraries that you&#8217;re working with. <code>readelf</code> shows you some extra information on your ELF files by reading and parsing their internal information, and <code>nm</code> lists the symbols (functions, etc) within an object file.
</li>
<li>
You can temporarily preempt your current set of libraries and their functions with the <code>LD_PRELOAD</code> environment variable and/or the <code>/etc/ld.so.preload</code> file. Once these are set, the dynamic library loader will use the preload libraries/functions in preference to the ones that you have cached using <code>ldconfig</code>. This can help you work around shared library problems in a few instances.
</li>
<li>
If you run into a program that has its required library path(s) hard coded into it, you can create symbolic links from each one of the missing libraries to the location that&#8217;s expected by the executable. This technique can also help you work around incompatibilities in the naming conventions between what your system software expects, and what libraries are actually named. I talk about using symbolic links in this way a little more in the <a href="#troubleshooting">Troubleshooting</a> section.
</li>
</ul>
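
<p class="single">
As an illustration of the <code>readelf</code> tip, the sketch below lists the libraries that a binary asks for directly. Unlike <code>ldd</code>, <code>readelf</code> only reads the file and does not run it, which makes it a reasonable first look at a binary that you don&#8217;t trust yet. The function name <code>needed_libs</code> is invented here, and the parsing assumes the usual <code>(NEEDED) Shared library: [name]</code> lines in the output of <code>readelf -d</code>.
</p>

```shell
# Reads "readelf -d" output on standard input and prints the name of
# each directly required shared library.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)]/\1/p'
}
```

<p class="single">
You would use it as <code class="commandline">readelf -d /bin/ls | needed_libs</code>. Keep in mind that this shows only the first layer of dependencies and does not tell you whether the libraries can be found.
</p>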

<h3>Scripting</h3>
<p class="single">
These scripts are somewhat simplified and in most cases could be done other ways too, but they will work to illustrate the concepts. If you use these scripts, make sure you adapt them to your situation. Never run a script or command without understanding what it will do to your system.
</p>

<p class="single">
The first script shown in <strong>Listing 12</strong> can be used to search directory trees for binaries with missing libraries. It makes use of the <code>ldd</code> and <code>find</code> commands to do the bulk of the work, looping through their output. Since I have heavily commented the scripts in <strong>Listing 12</strong> and <strong>Listing 13</strong>, I won&#8217;t explain the details of how they work in this text.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 12</strong></em></p>
<code>
#!/bin/bash - 

# These variables are designed to be changed if your Linux distro's ldd output
# varies from Red Hat or Ubuntu for some reason
iself="not a dynamic executable" # Used to see if executable is not dynamic
notfound="not.*found"            # Used to see if ldd doesn't find a library

# Step through all of the executable files in the user specified directory
for exe in $(find $1 -type f -perm /111)
do      
    # Check to see if ldd can get any information from this executable. It won't
    # if the executable is something like a script or a non-ELF executable.
    if [ -z "$(ldd $exe | grep -i "$iself")" ]
    then                                   
        # Step through each of the lines of output from the ldd command
        # substituting : for a delimiter instead of a space
        for line in $(ldd $exe | tr " " ":")        
        do
            # If ldd gives us output with our "not found" variable string in it,
            # we'll need to warn the user that there is a shared library issue
            if [ -n "$(echo "$line" | grep -i "$notfound")" ]
            then
                # Grab the first field, ignoring the words "not" or "found".
                # If we don't do this, we'll end up grabbing a field with a
                # word and not the library name.                               
                library="$(echo $line | cut -d ":" -f 1)"
                
                printf "Executable %s is missing shared object %s\n" $exe $library
            fi                                                
        done                                 
    fi 
done
</code>
</pre>

<p class="single">
When run on the <code>/opt/PostgreSQL</code> directory mentioned above, the script finds all of the programs that exhibit our missing library problem. As it stands now, this script will only check the first layer of library dependencies. One way to improve it would be to make the script follow the dependency chain of every library to the end, making sure that there is not a library farther down the chain that is missing. Better yet, you could add a &#8220;max-depth&#8221; option so that the user could specify how deeply into the dependency chain they wanted the script to check before moving on. A max-depth setting of &#8220;0&#8221; would allow the user to specify that they wanted the script to follow the dependency chain to the very end.
</p>
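
<p class="single">
Here is one way the &#8220;max-depth&#8221; idea could look. This is a sketch and not part of the script above: the names <code>check_chain</code>, <code>LDD</code>, and <code>SEEN</code> are invented for the example, paths containing spaces are not handled, and the parsing assumes the same <code>ldd</code> output layout used elsewhere in this post.
</p>

```shell
# Command used to list dependencies. It can be overridden so that the
# logic can be tried against canned output.
LDD="${LDD:-ldd}"
# Files already visited, so that a shared dependency is only checked once.
SEEN=" "

# check_chain FILE MAX_DEPTH [CURRENT_DEPTH]
# A MAX_DEPTH of 0 means "follow the chain to the very end".
check_chain() {
    local file="$1" max="$2" depth="${3:-1}" out lib
    case "$SEEN" in
        *" $file "*) return 0 ;;
    esac
    SEEN="$SEEN$file "
    out="$("$LDD" "$file" 2>/dev/null || true)"
    # Report anything that is missing at this level
    printf '%s\n' "$out" | awk -v f="$file" '/not found/ { print f " is missing " $1 }'
    # Stop descending once the requested depth has been reached
    if [ "$max" -ne 0 ] && [ "$depth" -ge "$max" ]; then
        return 0
    fi
    # Descend into every library that was resolved to a path on disk
    for lib in $(printf '%s\n' "$out" | awk '$2 == "=>" && $3 ~ /^\// { print $3 }'); do
        check_chain "$lib" "$max" $((depth + 1))
    done
}
```

<p class="single">
Calling <code class="commandline">check_chain ./psql 1</code> checks only the first layer, while <code class="commandline">check_chain ./psql 0</code> keeps going until no resolved library is left unvisited.
</p>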

<p class="single">
In <strong>Listing 13</strong>, I have created a wrapper script that could be used when developing new software, or as a last ditch effort to work around a really tough shared library problem. It utilizes the shell&#8217;s feature of temporarily setting an environment variable for a command on the same line as the command designation. That way we&#8217;re not setting <code>LD_LIBRARY_PATH</code> for the overall environment, which could cause problems for other programs if there are library naming conflicts.
</p>

<pre class="cliwide">
<p style="margin-left:495px;display:inline;"><em><strong>Listing 13</strong></em></p>
<code>
#!/bin/bash - 

# Set up the variables to hold the PostgreSQL lib and bin paths. These paths may
# vary on your system, so change them accordingly.
LIB_PATH=/opt/PostgreSQL/8.4/lib                    # Postgres library path
BIN_FILE=/opt/PostgreSQL/8.4/pgAdmin3/bin/psql      # The binary to run

# Start the specified program with the library path and have it replace this 
# process, passing along any arguments given to this script. Note that this
# will not change LD_LIBRARY_PATH in the parent shell.
LD_LIBRARY_PATH="$LIB_PATH" exec "$BIN_FILE" "$@"
</code>
</pre>

<p class="single">
I&#8217;ve broken the library and binary paths out into variables to make it easier for you to adapt this script for use on your system. This script could easily serve as a template for other wrapper scripts as well, anytime that you need to alter the environment before launching a program. Remember though that this wrapper script should not be used for a permanent solution to your shared library problems unless you have no other choice.
</p>

<p><a name="troubleshooting"></a></p>
<h3>Troubleshooting</h3>
<p class="single">
In some cases, a program may have been hard coded to look for a specific library on your system in a certain path, thus ignoring your local library settings. In order to fix this problem, you can research what version/path of the library the program is looking for and then create a symbolic link between the expected library location and a compatible library. In some cases you can recompile the program with options set to change how/where it looks for libraries. If the programmer was really kind, they may have included a command line option to set the library location, but this would be the exception rather than the rule when library locations are hard coded.
</p>
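
<p class="single">
The symbolic link approach can be sketched as a small helper. The name <code>link_expected_lib</code> and both of its arguments are placeholders for this illustration. You would substitute the path of the compatible library that you actually have and the path that the program is hard coded to look for.
</p>

```shell
# link_expected_lib ACTUAL_LIBRARY EXPECTED_PATH
# Creates a symbolic link at EXPECTED_PATH that points to ACTUAL_LIBRARY.
link_expected_lib() {
    local actual="$1" expected="$2"
    if [ ! -e "$actual" ]; then
        printf 'no such library: %s\n' "$actual" >&2
        return 1
    fi
    # Make sure the directory that the program expects exists
    mkdir -p "$(dirname "$expected")"
    ln -s "$actual" "$expected"
}
```

<p class="single">
Creating links under system directories normally requires root privileges, so on a real system this would be run through <code>sudo</code> or from a root shell.
</p>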

<p class="single">
The <code>ldd</code> command will not work with older style a.out binaries, and will probably give output mentioning &#8220;DLL jump&#8221; if it encounters one. It&#8217;s a good idea not to trust what <code>ldd</code> tells you when you&#8217;re running it on these types of binaries because the output is unpredictable and inaccurate. Newer ELF binaries have support for <code>ldd</code> built into them via the compiler, which is why they work.
</p>

<p class="single">
Just because the dynamic linker finds a library doesn&#8217;t mean that the library isn&#8217;t missing &#8220;symbols&#8221; (things like functions/subroutines). If this happens, you may be able to match the <code>ldd</code> command output to libraries that are installed, but your program will still have unpredictable behavior (like not starting or crashing) when it tries to access the symbol(s) that are missing. In this case the <code>ldd</code> command&#8217;s <code class="optionsonly">-d</code> and <code class="optionsonly">-r</code> options can give you more information on the missing symbols, and you&#8217;ll need to dig deeper into the software developer&#8217;s documentation to see if there are compatibility issues with the specific version of the library that you&#8217;re running. Remember that you can always use the <code>LD_LIBRARY_PATH</code> variable to temporarily test different versions of the library to see if they fix your problem.
</p>

<p class="single">
There may be some rare cases where <code>ldconfig</code> may not be able to determine a library type (libc4, 5, or 6) from its embedded information. If this happens, you can specify the type manually in the <code>/etc/ld.so.conf</code> file with a directive like <code class="commandline">dirname=TYPE</code> where <code>TYPE</code> can be <code>libc4</code>, <code>libc5</code>, or <code>libc6</code>. According to the man page for <code>ldconfig</code>, you can also specify this information directly on the command line to keep the change on a temporary basis.
</p>

<p class="single">
If you have stubborn library problems that you just can&#8217;t seem to get a handle on, you might try setting the <code>LD_DEBUG</code> environment variable. Try typing <code class="commandline">export LD_DEBUG="help"</code> first and then run a command (like <code>ls</code>) so that you can see what options are available. I normally use &#8220;<code>all</code>&#8221;, but you can be more selective in your choices. The next time that you run a program, you&#8217;ll see output that is like a stack trace for the library loading process. You can follow this output through to see where exactly your library problem is occurring. Issue <code class="commandline">unset LD_DEBUG</code> to disable this debugging output again.
</p>
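
<p class="single">
If you would rather not export <code>LD_DEBUG</code> at all, the same per-command trick used for <code>LD_LIBRARY_PATH</code> works here too. The helper below is a sketch with an invented name, <code>trace_libs</code>. It uses the <code>libs</code> category, which is one of the choices listed by <code>LD_DEBUG="help"</code>, and it assumes that the debug output goes to standard error, which is the default.
</p>

```shell
# trace_libs COMMAND [ARGS...]
# Runs COMMAND with library search tracing turned on for that process
# only, showing the trace and discarding the normal output of COMMAND.
trace_libs() {
    LD_DEBUG=libs "$@" 2>&1 >/dev/null
}
```

<p class="single">
A call such as <code class="commandline">trace_libs ./psql --version | less</code> lets you page through the search for each library, and there is nothing to unset afterward.
</p>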

<h3>Conclusion</h3>
<p class="single">
I hope that this post has armed you with the knowledge that you need to solve any shared library problems that you might come up against. Work through shared library problems step-by-step by determining what library/libraries are needed, finding out if they&#8217;re already installed, installing any missing libraries, and making sure that your Linux distribution can find the libraries, and you should have no problem fixing most of your dynamic library issues. If you have any questions, or have any information that should be added to this post, leave a comment or drop me an email. I welcome your feedback.
</p>
<!-- google_ad_section_end -->

<p><a name="video"></a></p>
<h3>Video</h3>
<p class="single">
To enhance this post, I&#8217;ve provided a video so that you can see a general overview of the provided examples. Browsers supporting the HTML5 video tag should present you with a <a href="http://theora.org/">Theora</a> (ogg/ogv) video, and browsers lacking that support should present you with a Flash substitute via <a href="http://flowplayer.org/">Flowplayer</a>. You can also download the Theora version of the video <a href="/downloads/blog/post3/Shared_Library_Issues_In_Linux_Video_Rev_1.ogg">here</a> by right clicking on the link. 
</p>
<p><video controls><br />
    <source src="/downloads/blog/post3/Shared_Library_Issues_In_Linux_Video_Rev_1.ogg" type="video/ogg" style="margin-left:0px;" /><br />
<object id="flowplayer" width="592" height="432" data="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf"  
    type="application/x-shockwave-flash"> 
     
    <param name="movie" value="http://releases.flowplayer.org/swf/flowplayer-3.1.5.swf" />  
    <param name="allowfullscreen" value="true" />  
    <param name="bgcolor" value="#000000" />  
    
    <param name="flashvars"   
        value='config={"clip":{"url":"http://www.innovationsts.com/downloads/blog/post3/Shared_Library_Issues_In_Linux_Video_Rev_1.flv", "autoPlay":false}}' />             
</object>
</video>
<br />

<a name="audio"></a>
<h3>Audio</h3>
<p class="single">
I have provided an audio transcript of this post with commentary on each of the listings. Click a format below to listen in a new window, or right-click and save the audio to listen to it later.
</p>
<p>Get An Audio Transcript Of This Post With Author&#8217;s Commentary On Listings<br />
<a href="/downloads/blog/post3/Shared_Libraries_Post_Final_Rev_1.ogg" target="_blank">ogg</a> (45 MB) | <a href="/downloads/blog/post3/Shared_Libraries_Post_Final_Rev_1.mp3" target="_blank">mp3</a> (44 MB)
</p>

<p><a name="resources"></a></p>
<h3>Resources</h3>
<ol>
<li><a href="http://www.ibm.com/developerworks/web/library/l-shlibs.html">IBM developerWorks Article On Shared Libraries By Peter Seebach</a></li>
<li><a href="http://ldn.linuxfoundation.org/article/anatomy-linux-dynamic-libraries-3">Linux Foundation Reference On Statically And Dynamically Linked Libraries (Developer Oriented)</a></li>
<li><a href="http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470404833.html">LPIC-1: Linux Professional Institute Certification Study Guide By Roderick W. Smith</a></li>
<li><a href="http://www.amazon.com/LPIC-1-Depth-Michael-Jang/dp/1598639676">LPIC-1 In Depth By Michael Jang</a></li>
<li><a href="http://www.linux.org/docs/ldp/howto/Program-Library-HOWTO/shared-libraries.html">How-To On Shared Libraries From Linux Online</a></li>
<li><a href="http://stackoverflow.com/questions/195741/shared-library-problems-on-linux">Stack Overflow Post Giving An Example Of A Shared Library Problem</a></li>
<li><a href="http://www.openguru.org/2009/04/how-to-fix-shared-library-load-problem.html">OpenGuru Post On Shared Library Problem Caused By Not Having /usr/local/lib in /etc/ld.so.conf</a></li>
<li><a href="http://www.linuxjournal.com/article/1059">An Introduction To ELF Binaries By Eric Youngdale (Linux Journal)</a></li>
<li><a href="http://www.linux-m68k.org/faq/howtellaoutelf.html">Short Explanation Of How To Tell a.out and ELF Binaries Apart</a></li>
<li><a href="http://www.catonmat.net/blog/ldd-arbitrary-code-execution/">Post On ldd Arbitrary Code Execution Security Issues By Peteris Krumins</a></li>
<li><a href="http://www.pathname.com/fhs/pub/fhs-2.3.txt.gz">The Text Version Of The Filesystem Hierarchy Standard (Version 2.3)</a></li>
<li><a href="http://www.faqs.org/docs/Linux-HOWTO/GCC-HOWTO.html">A Linux Documentation Project gcc Reference Covering Shared Libraries</a></li>
<li><a href="http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html">YoLinux Library Tutorial Including A gcc Linking Example</a></li>
<li><a href="http://www.trilithium.com/johan/2005/08/linux-gate/">Article By Johan Petersson Explaining What linux-gate.so.1 Is And Why You&#8217;ll Never Find It</a></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://www.innovationsts.com/blog/?feed=rss2&amp;p=1042</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

